Monday, May 22, 2006

ASoundrc Parameters For Reliably Using ALSA Powered Software TTS

Advanced Linux Sound Architecture (ALSA) is a boon for software TTS users --- you can now use your soundcard to produce spoken output without losing audio output from other applications such as music players and streaming radio stations.

Emacspeak implements an ALSA-enabled TTS server for the IBM ViaVoice engine --- using this server effectively requires appropriately tuning the parameters in the user's asoundrc file to:

  • Enable the DMix plugin for software mixing of multiple audio streams.
  • Configure the various parameters ALSA itself uses.

Depending on how well your sound-card is supported by ALSA, the above can be either trivially simple or a tedious process of trial and error. I'm writing this up to:

  • Collect a list of sound cards on which the asoundrc provided with Emacspeak works as expected.
  • Encourage the wider ALSA community to discover and help flesh out this material; my hope is that the ALSA community has more insight into how these settings work.

For the above, "works as expected" means the following:

  • The TTS engine speaks without perceptible stuttering or other audio artifacts.
  • The engine is responsive with respect to starting and stopping speech; especially when typing fast at high speech rates.
  • The TTS engine does not interfere with other alsa-enabled applications, e.g. mplayer.

At the end of this entry, you can find the relevant section from the asoundrc file from the Emacspeak distribution, with comments indicating which sound cards perform well. An example of a card that does not work well with these settings is the Audigy-LS from Creative; the TTS engine works on that card, but performance degrades:

  • mplayer cannot use the audio device; (aplay and mpg321 are able to share the card with the TTS engine.)
  • Speech does not stop immediately as on the soundcards enumerated in the asoundrc file.
# $Id: asoundrc,v 1.3 2006/05/23 00:22:16 raman Exp $
#these numbers work on the following:
# aplay -l | head -n 1
# I82801DBICH4 [Intel 82801DB-ICH4] (IBM Thinkpads)
# ICH6 [Intel ICH6],

#  default device is a mixer

pcm.!default {
    type plug
    slave.pcm "dmixer"
}

pcm.dmixer  {
    type dmix
    ipc_key 1024
    slave {
        pcm "hw:0,0"
        format s16_LE
        period_time 0
        period_size 1024
        buffer_size 4096
        rate 44100
    }
    bindings {
        0 0
        1 1
    }
}
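
As a rough sanity check on the slave settings above: ALSA counts period_size and buffer_size in frames, so dividing each by rate gives the amount of audio time it represents. The short Python sketch below does that arithmetic for the values in this asoundrc. The function and variable names are mine, and the connection to responsiveness is my own reading of the numbers, not something measured on the cards listed.

```python
# Sketch: convert the dmix slave settings from the asoundrc above into
# time units. Values are copied from the listing; names are illustrative.

RATE_HZ = 44100        # slave "rate"
PERIOD_FRAMES = 1024   # slave "period_size" (frames)
BUFFER_FRAMES = 4096   # slave "buffer_size" (frames)


def frames_to_ms(frames: int, rate_hz: int) -> float:
    """Return the duration, in milliseconds, of `frames` frames at `rate_hz`."""
    return 1000.0 * frames / rate_hz


period_ms = frames_to_ms(PERIOD_FRAMES, RATE_HZ)
buffer_ms = frames_to_ms(BUFFER_FRAMES, RATE_HZ)
periods_per_buffer = BUFFER_FRAMES // PERIOD_FRAMES

print(f"one period  : {period_ms:.1f} ms")
print(f"full buffer : {buffer_ms:.1f} ms")
print(f"periods per buffer: {periods_per_buffer}")
```

With these values one period is roughly 23 ms and the full buffer roughly 93 ms, split into 4 periods. If you experiment with other sizes on an unlisted card, the same arithmetic shows the trade-off you are making: a larger buffer leaves more slack against stuttering, while audio already queued in the buffer is a plausible source of delay when stopping speech.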

Wednesday, May 03, 2006

Listening To The Web Through A Mobile Lens

The similarities between Web access issues faced by mobile users and those confronting eyes-free Web browsing are striking, and these similarities have often been used to advocate the creation of well-structured, accessible Web content. As an example of mobile-friendly content being a blessing for eyes-free spoken access to WebFormation, Emacspeak provides a mobile lens via the Google Mobile transcoder.

Here are a few convenient means of using the above within the Emacspeak Audio Desktop:

  • While browsing the Web using w3, press t on a link (command: emacspeak-w3-transcode-via-google) to view that link through the mobile transcoder.
  • Note that all links in the resulting mobile view automatically go through the transcoder.
  • To undo the effect of automatically viewing links in the mobile view through the transcoder, use t with an interactive prefix argument, i.e., press C-u t to follow a link and view it in its original form.
  • Additionally, I bind command emacspeak-wizards-google-transcode to a convenient key so that I can launch Web sites using the mobile view.

I use this tool on a regular basis while commuting to work to browse mainstream news sites; it provides speech-friendly content that has the added benefit of downloading fast over a wireless link --- after all, this is Mobile content.

Tuesday, May 02, 2006

Announcing Emacspeak 24.0 (LiveDog)

For Immediate Release

San Jose, CA, (May 3, 2006)
Emacspeak-Alive: --- Bringing Live Access For Enlightened Users
--Zero cost of ownership makes priceless software affordable!

Major Enhancements

  1. emacspeak-muse: Speech-enabled Muse Mode
  2. emacspeak-ruby: Speech-enabled Ruby Mode
  3. emacspeak-m-player: Updated for new MPlayer
  4. emacspeak-sudoku.el: Speech-enabled SuDoKu
  5. New Option: tts-strip-octals
  6. emacspeak-keymap.el: Updated keybindings
  7. lisp/atom-blogger.el: Light-weight blogging tool
  8. emacspeak-atom-blogger: Speech-enables above
  9. voice-setup.el: Custom support
  10. Multispeech related patches
  11. User contributed patches