Friday, December 22, 2006

IBM Software TTS On Ubuntu 6

I had earlier reported the IBM TTS engine segfaulting on my home Ubuntu 6 machine; I have yet to fix that problem. Surprisingly though, I tried it on a different Ubuntu 6 machine, and there, installing

sudo apt-get install libstdc++2.10-glibc2.2

on top of the regular Ubuntu install was enough to get the TTS working. The segfault on my home machine occurs inside libc, and as best I can tell the two machines have identical libc installs. So on the positive front, it's only my home machine that appears broken; on the negative side, the fact that the breakage is hard to explain, and hence hard to fix, doesn't inspire confidence in upgrading. Others have reported no problems on Edgy -- so the problem is hopefully transitory.

Thursday, December 14, 2006

Web Command Line Tool For Google Patent Search

From the "every new Google search tool gets a corresponding Emacspeak wizard" dept:

Google just introduced Patent Search, an easy-to-use search interface for US patents. In the spirit of the Web Command Line, Emacspeak now sports a Patent Search From Google smart URL. Like other smart URLs on the Emacspeak Audio Desktop, this places executing patent searches just a few keystrokes away. To use this new wizard, do the following:

  • Update your working copy of Emacspeak from SVN. The patent search wizard will be part of the next official Emacspeak release slated for May 2007.
  • Invoke the Smart URL tool by pressing C-e u
  • Type the first few characters of the phrase patent search from Google and hit tab to complete the name of the smart URL.
  • Hit enter, specify what you're searching for and hit enter again.

Happy inventing!

Friday, November 24, 2006

Emacspeak 25.0 (ActiveDog) Unleashed

I released Emacspeak 25.0 (ActiveDog) yesterday. Here are the release notes.

While trying to release it on SourceForge I ran into a couple of show-stoppers. First off, the SourceForge download mechanism has changed yet again, and it has now become sufficiently convoluted that software released via SourceForge's file release system is well-nigh impossible to download for all but the most motivated user. Worse, as I released the file, upload services on SourceForge hung, with the consequence that a 0-sized emacspeak-25.tar.bz2 file was created --- and the SourceForge mechanism made it well-nigh impossible to clean up that 0-sized file.

Given this mess, I decided to keep things simple by placing the release on the Emacspeak site directly. This has the disadvantage that the release does not get mirrored on the various SourceForge download archives; it has the advantage of allowing users to grab the release with one click.

The other thing to note about this release is that there are no RPMs built --- a first in 7 years. I've now switched to Ubuntu on my home machine and don't have the ability to build RPMs, so I'll leave this to individual distributions. I've also not built Debian packages this time around, mostly because I've not gotten around to understanding Debian's packaging system sufficiently well to do this. Moreover, I prefer minimalistic packaging solutions --- and in general, though Debian's packaging is nice, it still feels a little too heavy-weight for something like the Emacspeak source tarball, which needs a simple make; sudo make install to get set up. So for now, I'll rely on folks like Jim Van Zandt to build downstream Debian packages for incorporation into distributions like Debian and Ubuntu --- this way, packages can be built to match particular distributions.

Thursday, October 05, 2006

Emacspeak Smart URL For Google Code Search

Every useful Web tool deserves a Web short-cut. Google just announced a new service called Code Search that lets you search the codebases of Open Source projects. It's a great way of finding source code relevant to one's programming projects.

I just checked in a corresponding URL Template --- as a reminder, URL templates in Emacspeak are smart URLs that provide Web short-cuts to create a conceptual Smart Web Command Line. To use:

  • Invoke the URL template tool via the default key-binding C-e u.
  • Select the Code Search Via Google tool by typing cod and pressing the tab key.
  • Specify the search query --- see the Google Code Search help instructions for the types of queries that are allowed.
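
As a rough illustration of the idea (in Python rather than Emacs Lisp, and not how Emacspeak itself implements it), a URL template can be thought of as a name plus a URL pattern with a slot for the query. The template name below comes from the steps above; the URL pattern and the helper names `complete` and `expand` are placeholders made up for this sketch.

```python
from urllib.parse import quote_plus

# Hypothetical template table: name -> URL pattern with one %s slot.
# The URL is a placeholder, not the real Google Code Search endpoint.
TEMPLATES = {
    "Code Search Via Google": "https://example.com/codesearch?q=%s",
}


def complete(prefix):
    """Return template names starting with prefix (case-insensitive), like tab completion."""
    p = prefix.lower()
    return sorted(name for name in TEMPLATES if name.lower().startswith(p))


def expand(name, query):
    """Fill the template's slot with the URL-encoded query."""
    return TEMPLATES[name] % quote_plus(query)
```

Typing cod and completing corresponds to `complete("cod")`; supplying the query corresponds to `expand(...)`, whose result would then be handed to the browser.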

Thursday, September 28, 2006

Using Helix Player From Emacspeak

In the spirit of "You can never have sufficiently many media players", I now have Helix Player working under Emacspeak, i.e., I can now run Helix Player without having to start up X. This is useful because there are still media streams on the Web that sometimes fail with mplayer, and from the minimal testing I've done so far, Helix Player is successful in those cases.

What Is It?

HelixPlayer, installable on modern Linux distributions as hxplay from package HelixPlayer, is the community-supported version of RealPlayer 10. The well-distributed and documented client, hxplay, is capable of playing a wide variety of audio and video formats over HTTP and RTSP/RTP; specifically, it can handle RealPlayer 10 formats, which include support for 5.1 audio.

A lesser-known tool available from Helix, the Helix DNA Client, is a bare-bones UI-less player that can be used effectively at the shell. You can download pre-built binaries for your flavor of Linux (GCC 3.2 or later vs. GCC 2.95 based systems); note that these are nightly builds. You can also download a source zip archive. Note that all of these require you to accept an End User License Agreement (EULA) before being taken to the download link.

The links on the page above can be confusing; here are pointers to the specific packages you need to grab if you want a player that has all of the functionality described above.

Sep 26, 2006 build for Linux GCC 3.2
Source archive from September 27, 2006

Using The Binary Distribution

Here is what I did to set up the binary distribution on my Ubuntu 6.06 (Dapper) machine:

  • Unpacked binary package under /usr/lib.
  • Created a symlink /usr/lib/splay to point to the directory created by unpacking the binary package.
  • Created the following shell script /usr/bin/hsplay to launch the player:
    #!/bin/sh
    # Use Simple Helix Player:
    SPLAY_LIB=/usr/lib/splay  # assumed: the symlink created in the previous step
    exec /usr/bin/aoss "$SPLAY_LIB/splay" -iss -s "$@"
  • The above script assumes you have the alsa-oss package installed; you will need this to have Helix Player use ALSA --- something that is essential if you want to be able to use your sound card with other applications while playing media streams.

With this setup, you can launch one or more media streams (both local, as well as remote HTTP/RTSP/RTP streams) from a shell. This player successfully plays the BBC Radio4 LW stream, something mplayer fails to play on my Ubuntu box.

Tuesday, September 12, 2006

Emacspeak, Ubuntu And Software Dectalk

On the positive side with respect to software synthesis, the Software Dectalk does work out of the box on Ubuntu --- out of the box, that is, if you first install alsa-oss, the ALSA->OSS compatibility layer. I've updated the Emacspeak speech server for Software Dectalk to use alsa-oss where available; performance is not as responsive as the Emacspeak Viavoice server using the native ALSA APIs, but it's a good backup option.

Monday, September 11, 2006

Emacspeak 24 On Ubuntu 6

I upgraded my home FC3 machine to Ubuntu 6.06 (Dapper) over the weekend. Here is a short summary of things to watch out for as an Emacspeak user.

The Good, The Bad, And The Painful

  • One of my friends helped with the install, and it is remarkably quick when everything works (in my case the Ubuntu 6.06 LTS installer had trouble with the NVIDIA display card and came up correctly on the third attempt).
  • A one-CD install is nice -- but after it you have remarkably little installed from the perspective of an Emacspeak user. You end up with a very nice GUI but very little else --- the reasoning being that the average user won't need much more, and the savvy user can always run apt-get.
  • Worse, Ubuntu does not install openssh-server --- it limits itself to installing openssh-client. This means that you cannot bootstrap yourself by logging in from another machine until you install openssh-server off the network. If there were one thing I would ask the Ubuntu maintainers, it would be to rectify this situation.
  • In my case, the apt suite of tools appeared to have a problem --- they died saying /var/lib/dpkg/available: no such file or directory. Googling showed this to be a known problem with apt, and the fix is to run dselect update -- but if you're new to Debian/Ubuntu, this is less than obvious.
  • Once I overcame the above, apt-get got me emacspeak-17.0, which was sufficient to let me bootstrap the rest of the process on my own using my trusted Dectalk Express to produce speech.
  • Note that you should install tcl8.3 and tclx8.3 --- rather than the newest (8.4) versions of these packages. This is because as of 8.4, the maintainers of those packages no longer build a stand-alone tcl (extended TCL) shell. This is something that will have to be handled by Emacspeak in the future.
  • I was able to get everything I needed (and more) installed using a combination of apt-get and aptitude.
  • The IBM TTS engine no longer works --- under FC3 and friends, you needed to install package libstdc++-compat to get it to work. As far as I could find out, there is no corresponding package for Ubuntu/Debian, and pulling in the RPM for libstdc++-compat, converting it via alien and installing the result produces a segfault when you run the TTS engine.
  • For the same reason, the old command-line trplayer will also not work on Ubuntu 6.06. This is not as painful --- since mplayer works --- though I had to build mplayer from source. It would be nice to create a command-line player on top of the HelixPlayer code base. At present, the missing trplayer means that the etc/ provided by emacspeak no longer works. You can use mplayer to convert realaudio to mp3; however, mplayer does not have a command-line option to specify the duration of playback, something that script etc/ needs.

Thursday, September 07, 2006

Google Archive News Search

To mark the arrival of Google News Archive search, I checked in an Archive News Search url-template yesterday morning. To use it, hit C-e u followed by arc tab and specify your search term.

The above is checked into the SVN repository at Emacspeak GoogleCode.

Friday, August 25, 2006

Update: Emacspeak On Google Code Hosting

The initial experiment of moving emacspeak development to Subversion at Google Code Hosting has been largely successful. After a few bumps along the road, mostly a consequence of my being new to SVN, things are looking good, and I have stopped updating the CVS repository on SourceForge.

Some additional goodies as a consequence of the move to SVN:

  • SVN Tags contains snapshots of prior releases.
  • Future releases will come with an SVN Revision number that allows one to reliably recreate a released version.

Sunday, August 13, 2006

Emacspeak Codebase Via Subversion From GoogleCode

I've checked the Emacspeak codebase into the Subversion repository provided by Google Project Hosting. The project page is Emacspeak at GoogleCode. You can find Emacspeak, complete with its code history going back to the point where I started using CVS, at Emacspeak SVN Repository.

For now, the Emacspeak Web site will continue to live at SourceForge; the Emacspeak mailing list will continue to live at Vassar as before. To check out the code from SVN, follow the instructions on Emacspeak SVN. If you run into any hitches in checking out the code, please report them on the Emacspeak mailing list. Note that you can check out the code anonymously from the above location entirely from the shell command-line, without ever having to point a browser at anything.

Emacspeak users presently running out of SourceForge CVS might want to do an SVN checkout in a separate directory and make sure things work, in preparation for a permanent switch-over to svn. Here are the minimal steps you need to perform:

  • svn checkout emacspeak
  • The above will create a directory called emacspeak with the code under it; obviously, you should do this somewhere different from where you have your current copy of emacspeak.
  • For now, I recommend renaming the directory created in the above step to svn-emacspeak so that you can easily tell which snapshot you're looking at.

Note that reading these is not a replacement for learning about SVN --- there is an excellent on-line book available at SVN Manual.

Thursday, August 10, 2006

Zipping Through Web Pages

Zipping Through Web Pages With Emacspeak/W3

I just added an experimental zip through Web pages shortcut to emacspeak-w3. The command is called emacspeak-w3-speak-next-block and is bound to z in all W3 buffers. It is useful for quickly moving through Web pages that have logically separate content units in separate blocks where a block is one of:

  • HTML div elements.
  • HTML table elements.
  • HTML p elements.

In general this provides an effective means of skimming many large Web pages.
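
To make the notion of a "block" concrete, here is a small sketch in Python (not the Emacs Lisp behind emacspeak-w3-speak-next-block, which works on the rendered W3 buffer). It records where each div, table, or p element starts in some HTML and then finds the next block after a given position. The class and function names are invented for this illustration.

```python
from html.parser import HTMLParser

# The three element types treated as blocks in the list above.
BLOCK_TAGS = {"div", "table", "p"}


class BlockFinder(HTMLParser):
    """Record the (line, column) where each block-level element starts."""

    def __init__(self):
        super().__init__()
        self.starts = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.starts.append(self.getpos())


def next_block(starts, point):
    """Return the first block start strictly after point, or None."""
    for pos in starts:
        if pos > point:
            return pos
    return None


finder = BlockFinder()
finder.feed("<div>news</div>\n<p>first</p>\n<table><tr><td>x</td></tr></table>\n")
finder.close()
```

Calling `next_block(finder.starts, point)` repeatedly with the returned position is the analog of pressing z to hop from one block to the next.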

Sunday, July 23, 2006

Summary Of Emacspeak Features Compared To Other Alternatives


I've received a number of queries asking about the differences between Emacspeak and Speechdel --- especially given some of the somewhat confusing assertions made in recent Speechdel release announcements. I'm posting this article in the hope of clearing up some of this confusion.

1 Emacspeak And Speechdel

1.1 Background

Emacspeak speech-enables Emacs by advising core emacs functionality. Speech services are provided by a simple Emacspeak speech-server. Additionally, Emacspeak implements speech-extensions for popular emacs modules --- see the speech-enabled applications list.

Emacspeak was first released in 1995, and then (as now), there was limited speech access to the Linux GUI. Therefore, to be useful as a complete access solution, Emacspeak has always needed to enable the user to do everything from within Emacs, not just regular editing operations. As a case in point, emacspeak users are probably one of the last remaining communities that use Emacs for browsing the Web.

1.2 Speech Dispatcher (SpeechD)

The idea of SpeechD (Speech Dispatcher) as an intermediate layer between speech clients and TTS engines was first floated sometime in the late 90's. Such a common layer is a laudable goal but is something that takes time and effort to get right. Additionally, you have the challenge of getting existing software, e.g., emacspeak, to abandon its own speech abstraction and re-implement against a supposedly more generic, but completely untested and untried intermediate layer.

The developers of SpeechD initially incorporated some of the Emacspeak code into an Emacs wrapper (speechdel) that called SpeechD, but later decided to go their own way -- and present-day speechdel is the result.

Like Emacspeak, speechdel uses Emacs Lisp's advice facility to add spoken feedback to core editing commands; speech output is produced by calling out to speech-dispatcher.

The summary of feature differences between Emacspeak and speechdel in the next section is from examining the speechdel code-base; I have not run speechdel since its dependency chain resulting from speechd was difficult to resolve on my FC3 64bit machine.

1.3 Emacspeak Features Not Found In SpeechDel

  1. Emacspeak implements Aural CSS (ACSS), and uses it to provide the aural analog of font-lock.
  2. Emacspeak provides pronunciation dictionaries. Pronunciations can be defined on a per-mode, per-buffer or per-directory basis. Directory and mode specific pronunciations are persisted across sessions. This allows Emacspeak to leverage Emacs' intelligence about the semantics of a given application; thus, you can have it say "p arrow x" for "p->x" when editing C code. Per-directory pronunciations are useful for reading electronic books. Per-buffer pronunciations are useful for succinctly speaking long lines of shell output e.g. when compiling complex software.
  3. By advising core Emacs functionality, Emacs modes work out of the box with Emacspeak. But in most cases, Emacspeak goes one step further by providing light-weight speech-modules that specialize spoken output for a given mode. As an example, advising next-line to speak the current line is sufficient to use dired-mode --- but having to listen to the entire line of dired output is not a pleasant experience. The dired-specific module in Emacspeak advises all interactive dired commands to speak the "right" information. As another example, GUD interaction automatically speaks the line of source-code without leaving the GUD buffer.
  4. Emacspeak comes with many "Emacs Applets" for performing tasks that most users would perform outside of Emacs. Examples include playing CDs, playing multimedia streams etc. Fortunately, I have not had to write too many of these since there are always Emacs users other than myself who also create such Emacs applications --- so where Emacs applications already exist, I merely speech-enable them with a small set of advice definitions, and in some cases add a few additional interactive commands.
  5. Emacs applications are plentiful for most tasks; one exception is the Web. Since emacs/w3 development was abandoned sometime around 1998, I have added significant Web interaction functionality to Emacspeak using Emacs/W3 as the basis. Today, a lot of this has also been ported to Emacs/W3M thanks to other enthusiasts on the Emacspeak mailing list. Examples include:
    1. WebSearch module --- prompts for query and processes response to focus on the results.
    2. XSLT pre-processing: Allows pre-processing of complex pages before rendering via W3. Used to enable smart screen-scrapers using XPath.
    3. URL-Templates: Originally motivated by webjump.el, this provides URL templates that enable easy access to a variety of Web tasks ranging from looking up flight times to listening to your favorite NPR or BBC show. Think early cut at a "Web Command Line in the minibuffer".
  6. Customization via Custom, including additional keymaps. Comes with additional keymap files for the Linux console to enable hyper, super, and alt prefix keymaps.
  7. Module emacspeak-wizards implements a large collection of Emacs wizards that enable common tasks that you would otherwise perform at the shell, e.g., checking display status on a laptop. The additional prefix keymaps come in handy!
  8. Finally, note that all modules (except the core) are loaded on demand. All code is compiled with byte-compile-dynamic set to t and individual application-specific modules are kept completely independent of one another. Given the size of the Emacspeak codebase, this is a pre-requisite for both efficiency and developer sanity.
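
To illustrate the pronunciation dictionaries from item 2, here is a minimal sketch in Python (Emacspeak itself does this in Emacs Lisp, and its actual data structures and precedence rules may differ). Only the "->" entry for C code comes from the text above; the table layout, the function name, and the rule that buffer-level entries win over mode-level ones are assumptions made for this example.

```python
# Hypothetical per-mode table; only the "->" entry comes from the text above.
MODE_PRONUNCIATIONS = {
    "c-mode": {"->": " arrow "},
}


def pronounce(text, mode, buffer_entries=None):
    """Apply mode-level entries, then buffer-level ones, which take precedence here."""
    entries = dict(MODE_PRONUNCIATIONS.get(mode, {}))
    entries.update(buffer_entries or {})
    # Replace longer keys first so a short key never clobbers part of a longer one.
    for key in sorted(entries, key=len, reverse=True):
        text = text.replace(key, entries[key])
    return text
```

With this, `pronounce("p->x", "c-mode")` yields the string that would be sent to the TTS engine, while the same text in a mode without that entry is passed through unchanged.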

Author: TV Raman

Thursday, July 20, 2006

Emacspeak And Accessible Search Via Google

Google has released an early experiment that favours easy-to-read Web content --- check out the relevant blog post here. Emacspeak has always had a set of Google Websearch tools --- and this set has now been enhanced with a shortcut to Accessible Search. Below, I'll summarize the set of Google Websearch tools in Emacspeak.

All Emacspeak Websearch tools are reached via the key-sequence C-e ?. Specific search tools are selected by single-letter keystrokes following C-e ? --- I'll enumerate some of these below.

  • Accessible Search --- Google Web Search that favors accessible content.
  • Vanilla Google Search.
  • Google I'm Feeling Lucky --- takes you directly to the first search hit.
  • Google News Search.
  • EmapSpeak Via Google Maps.
  • Google Usenet Search.

Note that in addition, module emacspeak-url-template provides a number of Google tools as smart URLs.

Friday, June 23, 2006

ALSA And Emacspeak: Closing The Legacy Loop With ALSA-OSS

And now, with ALSA working well with software TTS and cooperating with ALSA-aware streaming applications such as mplayer, it's time to close the legacy loop for those few applications that still have the old OSS API hard-wired.

One such useful application is trplayer --- the command-line real player that has not been updated in over 4 years. For the most part, the functionality provided by trplayer is subsumed by the newer --- and actively maintained --- mplayer, but it's still useful to have trplayer for times when mplayer hits glitches with slow-responding RTSP streams.

The ALSA way of handling such legacy applications is through the ALSA OSS emulation layer; Emacspeak now contains a script etc/atrplayer that invokes trplayer via aoss. Incidentally for the more observant Emacspeak user running out of CVS, script atrplayer is not new; it has been around for about a year, but until now it used command vsound to stream the converted audio to command aplay. I needed to do this until ALSA 1.0.11 since trplayer used to fail sporadically if run through the AOSS emulation layer. But those problems now seem to be in the past with the upgrade to ALSA 1.0.11. As usual with cutting edge technology like ALSA, your mileage with all of this will vary; so let me end with the usual disclaimer --- if it breaks, you get to keep both pieces.

Monday, June 19, 2006

SpeakFreely, Software TTS And ALSA

In ALSA and ASYM I mentioned that speakfreely appeared to have stopped working. As it turns out, this had nothing to do with the switch to ASYM. I believe that in the past I had run speakfreely by first killing software TTS --- since by default speakfreely uses OSS.

Getting speakfreely working with ALSA without losing software TTS required the following steps:

  1. Retrieve the latest tarball speak_freely-7.6a.tar.gz
  2. Uncomment the ALSA specific line in its Makefile
  3. In file audio_alsa.c, change the default audio device from plughw:0,0 to default. Without this change, speakfreely will try to access the sound card directly; setting it to default on line 41:
    char *devAudioOutput = "default";
    matches things up with the pcm.default that was configured in the .asoundrc.

With this, you can now talk using speakfreely and continue to use software TTS.

Tuesday, June 06, 2006

Emacspeak, TTS, Alsa And ASYM

This is a continuation of the earlier thread about using ALSA for software TTS using DMix --- see ASoundrc And Emacspeak.

I've now updated the CVS version of ASoundRC to use the ASYM plugin for the default ALSA device. The ASYM plugin allows you to configure both the playback and capture device, which removes the annoyance of having to specify an ALSA device when calling arecord --- as used to be the case when using DMIX in the pcm.default device.

Possible Caveats: I am having trouble getting speakfreely to work reliably --- I've used it with ALSA in the past --- though I'm not sure if the ASYM plugin is the culprit.

Monday, May 22, 2006

ASoundrc Parameters For Reliably Using ALSA Powered Software TTS

Advanced Linux Sound Architecture (ALSA) is a boon for software TTS users --- you can now use your soundcard to produce spoken output while not losing audio output from other applications such as music players and streaming radio stations.

Emacspeak implements an ALSA-enabled TTS server for the IBM ViaVoice engine --- using this server effectively requires appropriately tuning the parameters in the user's asoundrc file to:

  • Enable the DMix plugin for software mixing of multiple channels of audio.
  • Configure the various parameters ALSA itself uses.

Depending on how well your sound-card is supported by ALSA, the above can be either trivially simple or a tedious process of trial and error. I'm writing this up to:

  • Collect a list of sound cards on which the asoundrc provided with Emacspeak works as expected.
  • Help the wider ALSA community discover and flesh out this material; my hope is that the ALSA community has more insight into how these settings work.

For the above, "works as expected" means the following:

  • The TTS engine speaks without perceptible stuttering or other audio artifacts.
  • The engine is responsive with respect to starting and stopping speech; especially when typing fast at high speech rates.
  • The TTS engine does not interfere with other alsa-enabled applications, e.g. mplayer.

At the end of this entry, you can find the relevant section from the asoundrc file from the Emacspeak distribution, with comments indicating which sound cards perform well. An example of a card that does not work well with these settings is the Audigy-LS from Creative; the TTS engine works on that card, but performance degrades:

  • mplayer cannot use the audio device; (aplay and mpg321 are able to share the card with the TTS engine.)
  • Speech does not stop immediately as on the soundcards enumerated in the asoundrc file.
# $Id: asoundrc,v 1.3 2006/05/23 00:22:16 raman Exp $
#these numbers work on the following:
# aplay -l | head -1
# I82801DBICH4 [Intel 82801DB-ICH4] (IBM Thinkpads)
# ICH6 [Intel ICH6],

#  default device is a mixer

pcm.!default {
    type plug
    slave.pcm "dmixer"
}

pcm.dmixer  {
    type dmix
    ipc_key 1024
    slave {
        pcm "hw:0,0"
        format s16_LE
        period_time 0
        period_size 1024
        buffer_size 4096
        rate 44100
    }
    bindings {
        0 0
        1 1
    }
}
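
For a feel of what the numbers in the dmixer section amount to, here is a small worked calculation in Python. It assumes the usual ALSA meaning of period_size and buffer_size as counts of frames at the configured rate; how these durations relate to perceived responsiveness on a given card is something I can only guess at, so treat this as arithmetic rather than tuning advice.

```python
RATE_HZ = 44100       # rate
PERIOD_FRAMES = 1024  # period_size
BUFFER_FRAMES = 4096  # buffer_size


def frames_to_ms(frames, rate_hz=RATE_HZ):
    """Duration in milliseconds of the given number of frames."""
    return 1000.0 * frames / rate_hz


period_ms = frames_to_ms(PERIOD_FRAMES)    # roughly 23.2 ms per period
buffer_ms = frames_to_ms(BUFFER_FRAMES)    # roughly 92.9 ms for a full buffer
periods_per_buffer = BUFFER_FRAMES // PERIOD_FRAMES
```

So with these settings the buffer holds four periods, a little under a tenth of a second of audio.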

Wednesday, May 03, 2006

Listening To The Web Through A Mobile Lens

The similarities between Web access issues faced by mobile users and those confronting eyes-free Web browsing are striking, and these similarities have often been used to advocate the creation of well-structured, accessible Web content. As an example of mobile-friendly content being a blessing for eyes-free spoken access to Web information, Emacspeak provides a mobile lens via the Google Mobile transcoder.

Here are a few convenient means of using the above within the Emacspeak Audio Desktop:

  • While browsing the Web using w3, press t on a link (command: emacspeak-w3-transcode-via-google) to view that link through the mobile transcoder.
  • Note that all links in the resulting mobile view automatically go through the transcoder.
  • To undo the effect of automatically viewing links in the mobile view through the transcoder, use t with an interactive prefix argument, i.e., press C-u t to follow a link and view it in its original form.
  • Additionally, I bind command emacspeak-wizards-google-transcode to a convenient key so that I can launch Web sites using the mobile view.

I use this tool on a regular basis while commuting to work to browse mainstream news sites; it provides speech-friendly content that has the added benefit of downloading fast over a wireless link --- after all, this is Mobile content.

Tuesday, May 02, 2006

Announcing Emacspeak 24.0 (LiveDog)

For Immediate Release

San Jose, CA, (May 3, 2006)
Emacspeak-Alive: --- Bringing Live Access For Enlightened Users
--Zero cost of ownership makes priceless software affordable!

Major Enhancements

  1. emacspeak-muse: Speech-enabled Muse Mode
  2. emacspeak-ruby: Speech-enabled Ruby Mode
  3. emacspeak-m-player: Updated for new MPlayer
  4. emacspeak-sudoku.el: Speech-enabled SuDoKu
  5. New Option: tts-strip-octals
  6. emacspeak-keymap.el Updated keybindings
  7. lisp/atom-blogger.el Light-weight blogging tool
  8. emacspeak-atom-blogger: Speech-enables above
  9. voice-setup.el Custom support
  10. Multispeech related patches
  11. User contributed patches

Friday, March 10, 2006

W3: Minor Patch To Handle Content-Type application/xhtml+xml

Here is a minor patch to w3.el to allow it to handle content-type application/xhtml+xml. For all practical purposes (at least as far as W3 is concerned), this can be handled by the html parser/renderer; however since that content-type did not exist at the time W3 was written, it offers to download/save documents of that type. The attached patch fixes this, and also adds a fix to a minor irritant with decoding of multimedia attachments.

Index: w3.el
RCS file: /cvsroot/w3/w3/lisp/w3.el,v
retrieving revision 1.32
diff -b -c -r1.32 w3.el
*** w3.el	12 Jan 2003 22:10:25 -0000	1.32
--- w3.el	11 Mar 2006 02:24:52 -0000
*** 34,39 ****
--- 34,40 ----
  (require 'w3-sysdp)
+ (eval-when-compile (require 'mm-decode))
  (require 'w3-cfg)
  (or (featurep 'efs)
*** 325,331 ****
  				  (mm-handle-media-type handle)))))
        ;; Fixme: can handle be null?
!        ((equal (mm-handle-media-type handle) "text/html")
  	;; Special case text/html if it comes through w3-fetch
  	(set-buffer (generate-new-buffer " *w3-html*"))
--- 326,333 ----
  				  (mm-handle-media-type handle)))))
        ;; Fixme: can handle be null?
!        ((or (equal (mm-handle-media-type handle) "application/xhtml+xml")
!          (equal (mm-handle-media-type handle) "text/html"))
  	;; Special case text/html if it comes through w3-fetch
  	(set-buffer (generate-new-buffer " *w3-html*"))

Wednesday, March 08, 2006

Blogging From Emacs: Additional Atom-Blogger Documentation

Thanks to Jason Dunsmore for writing up some additional step-by-step documentation on using atom-blogger.

Thursday, February 23, 2006

Emacspeak: Connecting Lynx And W3

Emacs/W3 is still the best Web page rendering option inside Emacspeak given the ability to apply XSL transforms, as well as obtaining aural styling via ACSS. However W3's url handling layer often breaks when faced with multiple redirects, especially when some of these happen through the Host: HTTP header. Additionally, HTTPS authentication sometimes fails mysteriously in the presence of redirects.

In many of these cases, lynx happily fetches the pages correctly; however, you're then stuck using a fairly weak auditory interface in that Emacspeak degrades to being a terminal-level screenreader.

An effective solution to this problem is to use lynx within an Emacs terminal, and after finding the content that is worth reading, handing off that content to Emacs/W3. The next few paragraphs show how.

The lynx-site.cfg File

This is where you add site-specific configurations. Here are the lines I have in my lynx-site.cfg to integrate lynx and Emacs. Before you use any of this, make sure you have executed M-x server-start in your running Emacs, and make sure that all is well by experimenting with emacsclient to ensure that external programs can hand off editing tasks to the currently running Emacs.

#site defaults
#for bookshare:
PRINTER:Edit:emacsclient %s:TRUE
KEYMAP:???:EDITTEXTAREA	# use external editor to edit a form textarea
PRINTER:W3:emacsclient -e '(w3-open-local "%s")':TRUE

Below, I'll describe what each of the above lines does:

    The above line creates an additional item in the download menu that invokes the BookShare unpacker. Script invokes the BookShare unpack tool with the appropriate options.
  • PRINTER:Edit:emacsclient %s:TRUE
    This creates an Edit item in the print menu. Invoking this menu item causes the current page to be handed off to Emacs for editing. If you want to edit the source, first switch to source view by hitting \ before invoking print.
  • KEYMAP:???:EDITTEXTAREA # use external editor to edit a form textarea
    This sets lynx up so that when editing a multiline textarea, you can hand off the editing job to Emacs. This is particularly useful for editing Wiki pages. Replace the ??? with the desired key sequence.

    The above two settings make the edit source functionality more pleasant to use.
  • PRINTER:W3:emacsclient -e '(w3-open-local "%s")':TRUE
    The above creates a W3 menu item in the print menu. Invoking this causes Emacs/W3 to display the current page --- again switch to source view before invoking this so that Emacs/W3 gets handed the HTML markup.
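
To make the hand-off concrete, here is a small Python sketch of what the W3 PRINTER entry expands to once the %s placeholder is replaced by the file holding the current page. The helper name `printer_command` and the path /tmp/page.html are invented for illustration; lynx performs this substitution itself and runs the result through the shell, so this only shows the resulting command, split into arguments.

```python
import shlex


def printer_command(template, path):
    """Substitute the file path for %s and split the result into an argv list."""
    return shlex.split(template.replace("%s", path))


# The template is the command portion of the W3 PRINTER line above;
# /tmp/page.html is a made-up path standing in for lynx's temporary file.
argv = printer_command("""emacsclient -e '(w3-open-local "%s")'""", "/tmp/page.html")
```

Running that argv (for instance via `subprocess.run(argv)`) would only work with an Emacs server already started, as described earlier.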


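#!/usr/bin/perl -w
#$Id:,v 1.1 2003/07/04 15:41:55 tvraman Exp tvraman $
#Description: Bookshare downloader for Lynx
use strict;
use File::Basename qw(basename);
use File::Copy qw(move);
use File::Path qw(mkpath);
# Root directory under which each book gets its own subdirectory.
my $location = "$ENV{HOME}/books/book-share";
my $password = 'xxxxxxx';
# First argument: the file grabbed by lynx; second: the target file name.
my $grabbed = shift;
my $target = shift;
# Name the book directory after the target, minus the .bks extension.
my $dir = basename($target, '.bks');
my $where = "$location/$dir";
# Use Perl modules rather than shelling out, so odd file names are safe.
mkpath($where);
move($grabbed, "$where/$target") or die "Cannot move $grabbed: $!";
chdir $where or die "Cannot chdir to $where: $!";
# Unpack in the background with stdout and stderr closed.
qx(echo $password | bks-unpack -q $target 1>&- 2>&- &);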

Tuesday, February 21, 2006

Emacspeak, SuDoKu And History

Here is a small enhancement to playing SuDoKu in Emacspeak. The feature is probably generally useful i.e., it's not specific to eyes-free interaction, but its presence encourages one to try different solution strategies.

Commands emacspeak-sudoku-history-push bound to m and emacspeak-sudoku-history-pop bound to M allow one to mark interesting states in the game and return to these prior states with a single keystroke. This means that when one is confronted with a choice between two alternatives, with no apparent additional information on which route to take, it becomes possible to push that state on to the history stack, try one of the alternatives and backtrack if necessary.
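
The push and pop behavior described above amounts to a stack of saved board states. Here is a minimal sketch of that idea in Python; it is a model for illustration rather than the Emacs Lisp in emacspeak-sudoku, and the class name BoardHistory and the 9x9 list-of-lists board are assumptions of the sketch.

```python
import copy


class BoardHistory:
    """Stack of saved board states (an illustrative model only)."""

    def __init__(self):
        self._stack = []

    def push(self, board):
        # Save a deep copy so that later moves do not alter the marked state.
        self._stack.append(copy.deepcopy(board))

    def pop(self):
        # Return the most recently marked state, or None if nothing is saved.
        return self._stack.pop() if self._stack else None


if __name__ == "__main__":
    board = [[0] * 9 for _ in range(9)]  # 0 stands for an empty cell
    history = BoardHistory()
    history.push(board)      # mark the state before guessing
    board[0][0] = 5          # try one alternative
    board = history.pop()    # dead end: return to the marked state
    print(board[0][0])
```

Copying the board on push is what makes backtracking safe: the saved state stays frozen even though play continues on the live board.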

Monday, February 20, 2006

Emacspeak And Voice Locking Using Aural CSS

This is slightly reformatted from what was posted to the Emacspeak mailing list as a separate message.

  1. Emacspeak defines a number of voice overlays such as voice-bolden, and voice-lighten that can be applied to a given voice to change what it sounds like.
  2. Voice overlays are defined in terms of Aural CSS (ACSS) to keep them independent of a specific TTS engine.
  3. For each such overlay there is a corresponding <overlay-name>-settings variable that can be customized via custom.
  4. As an example, here are the numbers in voice-bolden-settings:
Setting        Value
family         nil
average-pitch  1
pitch-range    6
stress         6
richness       nil
punctuation    nil
Unset values (nil) show up as "unspecified" in the customize interface.
  5. Do not directly customize voice-bolden and friends; instead customize the corresponding voice-bolden-settings, since that ensures that all voices that are defined in terms of voice-bolden get correctly updated.
  6. Discovering what to customize:

Command emacspeak-show-personality-at-point (bound by default to C-e M-v) will show you the value of properties personality and face at point. A recent update I implemented last weekend makes this more useful, so make sure you do a CVS update; earlier this command used to display the ACSS setting --- now it displays the abstract name. Describe-variable on these names should tell you what to customize; so as an example:

Put point on a comment line, and hit C-e M-v: you will hear

Personality emacspeak-voice-lock-comment-personality
Face font-lock-comment-delimiter-face

Describe-variable of emacspeak-voice-lock-comment-personality gives:

emacspeak-voice-lock-comment-personality's value is acss-p0-s0-all

Personality used for font-lock-comment-face
This personality uses voice-monotone whose effect can be changed globally by customizing voice-monotone-settings.

How It All Works

Here is a brief explanation of the connection between voice-bolden and its associated voice-bolden-settings.

  1. Voice settings are initially in voice-bolden-settings which is a list of numbers.
  2. That list of numbers needs to be translated to appropriate device-specific codes to send to the TTS engine.
  3. You do not want to do this translation each time you speak something.
  4. So when voice-bolden is defined, the definition happens in three steps:
  • The list of settings is stored away in voice-bolden-settings,
  • A corresponding voice-name is generated --- acss-a<n>-p<n>-r<n>-s<n> and the corresponding control codes to send to the device are stored away in a hash-table keyed by the above symbol.
  • Finally, voice-bolden is assigned the above symbol.

What this gives is:

  1. The ability to customize the voice via custom by editing the list of numbers in voice-bolden-settings
  2. When that list is edited, voice-bolden is arranged to be updated automatically.
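
To make the translate-once idea concrete, here is a small Python model of the scheme described above. It is a sketch under stated assumptions, not Emacspeak's code: the mapping of letters to settings in the generated name, the VOICE_TABLE dictionary, and the fake_device_codes placeholder are all invented for illustration, and real control codes depend on the TTS engine.

```python
# Letter used for each setting in the generated name (an assumption based
# on the acss-a<n>-p<n>-r<n>-s<n> pattern mentioned in the text).
NAME_LETTERS = (
    ("average-pitch", "a"),
    ("pitch-range", "p"),
    ("richness", "r"),
    ("stress", "s"),
)

# Hypothetical stand-in for the hash table of device control codes.
VOICE_TABLE = {}


def fake_device_codes(settings):
    # Placeholder translation; a real TTS engine has its own control codes.
    return " ".join(
        f"{key}={value}"
        for key, value in sorted(settings.items())
        if value is not None
    )


def define_voice(settings):
    """Return the generated voice name, translating the settings only once."""
    name = "acss" + "".join(
        f"-{letter}{settings[key]}"
        for key, letter in NAME_LETTERS
        if settings.get(key) is not None
    )
    if name not in VOICE_TABLE:
        VOICE_TABLE[name] = fake_device_codes(settings)
    return name


if __name__ == "__main__":
    bolden_settings = {"average-pitch": 1, "pitch-range": 6,
                       "stress": 6, "richness": None}
    voice_bolden = define_voice(bolden_settings)
    print(voice_bolden, "->", VOICE_TABLE[voice_bolden])
```

In this model, editing the settings and calling define_voice again yields a new name with its own table entry, which mirrors how customizing voice-bolden-settings leads to voice-bolden being updated.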

Other Useful Commands

In addition, command emacspeak-wizards-generate-voice-sampler can be useful in generating a buffer that shows what the various ACSS settings sound like. Command emacspeak-wizards-voice-sampler can be used to apply a specific voice to a region of text while experimenting with the various settings.

Saturday, February 11, 2006

Playing SuDoKu Using Auditory Feedback

Emacspeak speech-enables SuDoKu implemented by sudoku.el. Speech-enabling games is an effective means of discovering what additions one needs to make to an auditory interface for working effectively in an eyes-free environment --- this was aptly demonstrated a few years ago by identifying interesting conversational gestures by speech-enabling the game of Tetris --- see Conversational Gestures For The Audio Desktop from Assets 1998.

Advising Interactive Commands

As with speech-enabling any Emacs module, emacspeak-sudoku advises all interactive commands to produce spoken feedback. In addition to speaking the cell moved to, all navigation commands produce an auditory icon that is a function of whether the cell value is mutable --- original values cannot be changed and this is indicated with a distinctive icon.

Additional Interactive Commands

Playing SuDoKu effectively requires one to build a good mental image of the state of the board as well as the ability to effectively query the game for currently active constraints. The eye's ability to quickly move around the board and perceive row, column and sub-square constraints needs to be compensated for in an eyes-free environment. As an example, it is too difficult to build the necessary mental model by just listening to the board spoken aloud, or by listening to individual cells by navigating to them.

Here is the set of additional interactive commands that needed to be added in order to be able to play the game effectively.

Speak current row.
Speak current column.
Speak current sub-square.
Speak number of remaining cells in current row.
Speak number of remaining cells in current column.
Speak number of remaining cells in current sub-square.
Move to the sub-square below the current sub-square.
Move to the sub-square above the current sub-square.
Move to the next sub-square.
Move to the previous sub-square.
Move to the beginning of current row.
Move to the end of the current row.
Move to the top of the current column.
Move to the bottom of the current column.
Speaks information about the overall distribution of numbers on the board.
  • d --- Conveys how many instances of each digit have been filled in.
  • s --- Conveys number of remaining cells in each sub-square.
  • r --- Conveys number of remaining cells in each row.
  • c --- Conveys number of remaining cells in each column.
Speaks number of remaining cells in the current board.
Speaks value in current cell.

Notes on how information is spoken:

  • Numbers are spoken in groups of 3 to achieve effective intonation.
  • When navigating by sub-squares, point always moves to the top left corner of the sub-square.
  • Additional commands bound to M-r, M-c and M-s erase the current row, column or sub-square respectively. These commands would probably be convenient to have independent of whether one is using visual output.
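
The queries listed above all reduce to a few simple computations over the board. The following Python sketch shows one way to compute them; the function names, the 9x9 list-of-lists board, and the use of 0 for an empty cell are assumptions of the sketch, not the data structures used by sudoku.el or emacspeak-sudoku.

```python
def row(board, r):
    return list(board[r])


def column(board, c):
    return [board[r][c] for r in range(9)]


def sub_square(board, r, c):
    # Top-left corner of the 3x3 sub-square that contains cell (r, c).
    top, left = 3 * (r // 3), 3 * (c // 3)
    return [board[i][j]
            for i in range(top, top + 3)
            for j in range(left, left + 3)]


def remaining(cells):
    # Number of cells still to be filled in (0 marks an empty cell).
    return sum(1 for value in cells if value == 0)


def digit_distribution(board):
    # How many instances of each digit have been filled in.
    counts = {digit: 0 for digit in range(1, 10)}
    for line in board:
        for value in line:
            if value:
                counts[value] += 1
    return counts


def in_groups_of_three(cells):
    # Split nine values into three groups, as when a row is spoken.
    return [cells[i:i + 3] for i in range(0, len(cells), 3)]


if __name__ == "__main__":
    board = [[0] * 9 for _ in range(9)]
    board[0] = [5, 3, 0, 0, 7, 0, 0, 0, 0]
    board[1][0] = 6
    print(in_groups_of_three(row(board, 0)))
    print(remaining(sub_square(board, 1, 1)))
```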

Effectiveness Of The Resulting Interface

With the above interface in place, the simpler levels of the game are a breeze; the difficult and evil levels are sufficiently challenging to be fun.

Friday, January 27, 2006

Browsing Sourceforge Download Servers

Sourceforge is a nice service, but it can also be painful to use because of the heavy-weight Web page design, and the need to repeatedly click before you get the download you want.

The most irksome of these is the download mechanism provided by Sourceforge --- where you first need to browse a list of download servers, pick a mirror, and then download what you want. Emacspeak implements a Smart URL that enables one to download from Sourceforge in a single step.

By default, this uses a North American mirror; the behavior can be customized if outside the US. Use smart URL Sourceforge Browse Mirror and specify the name of a SF hosted project when prompted. This brings up the index page for the project's download area, sorted by date. Move to the bottom of the page and hit b to move to the latest available download.

The smart URL sets up the W3 buffer with a context-sensitive download function; when on a download link, hit C-d to start downloading. This command will prompt for the URL; rather than hitting return (which would bring you to the browse mirrors page), hit M-p to get the download URL for your SF mirror. Note that this wizard uses GNU wget to perform the download via Emacs module w3-wget.

BBC Channels On Emacspeak

Since the BBC's various channels are what I listen to the most, launching BBC channels has always been a couple of keystrokes in Emacspeak. As a first step, directory realaudio/radio contains shortcut files for launching live streams from the various BBC channels.

In addition, module emacspeak-url-template defines a number of Smart URLs for single-click access to BBC programs. The ones I use the most are:

  • Smart URL BBC Channels On Demand, and
  • Smart URL BBC Genres On Demand

These smart URLs prompt for the channel or genre respectively and bring up a Web page that lists the various shows that are available --- note that the BBC archives shows for a whole week. The resulting Web page is easy to browse in W3; the most effective way to skim the buffer is to repeatedly hit i which moves through the various items on the page. Hitting e e (that's the letter e twice) while on a hyperlink will launch the corresponding media stream by calling a context-aware command that knows about transforming the URL to one that accesses the program stream --- note that simply following the hyperlink will get you first to a page about the program, rather than to the program stream itself.

To find out what channels and genres are available, browse the BBC Web site --- channel and genre names are not hard-wired into Emacspeak since these can change over time with channels and genres being added or renamed.

Thursday, January 26, 2006

Emacspeak World Clock For Timezone Travel

Command emacspeak-speak-time bound to C-e t speaks the current time. An additional convenience offered by this keystroke is to get the time at a specified time zone using Emacs' completion facility.

To use this feature, simply precede the keystroke with an interactive prefix arg, i.e., use C-u C-e t. This will prompt for the timezone in the minibuffer. Using two prefix args, i.e., C-u C-u, will set the default timezone after speaking the time --- a useful way of avoiding jet-lag as you travel.
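
As an illustration of the behavior just described, here is a short Python model of a clock that reports the time in a requested zone and can optionally remember that zone as the new default. The class WorldClock, its method names, and the output format are assumptions of this sketch; Emacspeak's own implementation is in Emacs Lisp and is not shown here.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


class WorldClock:
    """Illustrative model of speaking the time in a chosen timezone."""

    def __init__(self, default_zone=timezone.utc):
        self.default_zone = default_zone

    def speak_time(self, moment, zone=None, set_default=False):
        # With no zone given, fall back to the remembered default.
        if zone is None:
            zone = self.default_zone
        text = moment.astimezone(zone).strftime("%H:%M on %A, %B %d, %Y")
        # Mirrors the double prefix arg: remember the zone after speaking.
        if set_default:
            self.default_zone = zone
        return text


if __name__ == "__main__":
    clock = WorldClock()
    now = datetime.now(timezone.utc)
    # Requires the system timezone database to be installed.
    print(clock.speak_time(now, ZoneInfo("Asia/Kolkata"), set_default=True))
    print(clock.speak_time(now))  # now uses the remembered zone
```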

Sunday, January 22, 2006

Emacspeak Web Wizards: Obtaining Context From The Calendar

Emacspeak implements a number of smart URLs in module emacspeak-url-template.el --- see earlier post on Web Command Line. Many of these smart URLs prompt the user for the date, e.g. you can use smart URL NPR On Demand to play archived NPR shows.

The most intuitive means of specifying a date is of course using a calendar that functions as a date-picker, and Emacs has a very powerful built-in calendar. Emacspeak ties these two together by arranging for commands that prompt for a date to use the current date in the Emacs Calendar as the default. So the easiest way to play NPR Morning Edition for Monday, January 2, 2006 is to do the following:

  • Switch to the Emacs Calendar and move to the desired date Monday January 2, 2006 by pressing gd.
  • Invoke the NPR On Demand smart URL by pressing C-e u RET NPR RET
  • Specify the program code for Morning Edition by pressing me RET
  • Hit enter to pick the default date that is offered in the minibuffer.
  • Sit back and listen ...

Tuesday, January 17, 2006

Viewing Atom Feeds Within Emacspeak

The most effective way of viewing Atom Feeds in Emacspeak is to use command emacspeak-atom-display and specify the URL of the feed when prompted. Thus, M-x emacspeak-atom-display RET displays a Web page generated from the Emacspeak Blog.

Notice the following in the generated Web page:

  • It starts with a navigable table of contents.
  • Each Blog entry has a link labeled edit next to it.
  • Each Blog entry ends with a link labeled Bookmark.
  • There is a link labeled Post at the top of the page.

The above links help you easily create and edit posts to the Blog if you have write access using commands provided by module atom-blogger. Eventually, I may add commands to these hyperlinks to automatically invoke the appropriate command from atom-blogger; for now, I find it sufficiently convenient to copy the URL under point to the kill-ring and later yank it back into the minibuffer when prompted by atom-blogger.

Finally, note that this and subsequent posts to this Blog will show up automatically on the Emacspeak Mailing List at Vassar.

Viewing Formatted Source Code In Emacs/W3

While reading online texts on programming in Python and Ruby, I noticed that Emacspeak was not announcing indented lines in preformatted source-code examples, even with audio indentation turned on. The reason is that many of these texts use an HTML non-breaking space for indentation, and though W3 was rendering these correctly, the default syntax table in W3 had not defined the resulting octal 240 to be of class white-space. Consequently, Emacspeak's audio indentation code was not treating the non-breaking space as white space.

I've checked in a patch to emacspeak-w3.el that modifies the syntax table in w3-mode by adding the appropriate lines to w3-mode-hook.
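
The effect of the fix can be modeled in a few lines of Python. This is only a sketch of the idea, not the patch itself: the function indentation and its extra_whitespace parameter are hypothetical, and they stand in for a syntax table that does or does not class the non-breaking space as white space.

```python
NBSP = "\u00a0"  # non-breaking space, octal 240


def indentation(line, extra_whitespace=NBSP):
    """Count leading white-space characters on a line.

    Characters in extra_whitespace are treated as white space, which
    models adding the non-breaking space to the white-space class.
    """
    count = 0
    for ch in line:
        if ch in " \t" or ch in extra_whitespace:
            count += 1
        else:
            break
    return count


if __name__ == "__main__":
    line = NBSP * 4 + "print('hello')"
    print(indentation(line, extra_whitespace=""))  # before the fix
    print(indentation(line))                       # after the fix
```

With an empty extra_whitespace the indented line appears to start in column zero, which is why no indentation was being announced.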

Saturday, January 14, 2006

Speech-Enabled ATOM-Blogger

Module atom-blogger is a light-weight Emacs client for creating or editing blogger posts using ATOM. Emacspeak bundles atom-blogger and speech-enables it via module emacspeak-atom-blogger.

Module emacspeak-setup.el has been updated to set up Emacs' load-path to locate package atom-blogger, so if correctly installed, Emacspeak users should be able to launch and use atom-blogger with no further configuration.

Thursday, January 12, 2006

Emacspeak And Ruby

Emacspeak now speech-enables ruby-mode to support developing Web applications using Ruby On Rails. I presently use nxml-mode for editing the .rhtml files, but am looking for an alternative to using multi-mode or its variants when editing the embedded Ruby code. Sadly, one has to turn off nxml-mode's validity checking while editing .rhtml files --- otherwise it complains about the <% directives.

Monday, January 02, 2006

Emacspeak Wizard: Recording Audio Streams For Later Playback

Emacspeak includes a large collection of wizards implemented in module emacspeak-wizards.el. One of these --- emacspeak-wizards-rivo --- works hand-in-hand with script etc/ to provide a simple record-for-later-playback facility that can be used to record live realaudio streams for future playback. This is useful for listening to live broadcasts at a more convenient time.

Wizard emacspeak-wizards-rivo prompts for the time at which to record, the length of the recording, the stream to record, and the location in which the recording is to be stored. It then uses command trplayer (text-mode RealPlayer) with command vsound to capture the audio stream, and converts the result to MP3 using command lame. ToDo: With mplayer now able to play RealAudio streams, the etc/ script should be updated to use mplayer since this will:

  • Remove the vsound dependency.
  • Enable us to record more than just RealAudio streams.