Monday, May 22, 2006

ASoundrc Parameters For Reliably Using ALSA-Powered Software TTS

Advanced Linux Sound Architecture (ALSA) is a boon for software TTS users --- you can now use your sound card to produce spoken output without losing audio output from other applications such as music players and streaming radio stations.

Emacspeak implements an ALSA-enabled TTS server for the IBM ViaVoice engine --- using this server effectively requires appropriately tuning the parameters in the user's asoundrc file to:

  • Enable the DMix plugin, which provides software mixing of multiple audio streams.
  • Configure the various parameters that ALSA itself uses.

Depending on how well your sound card is supported by ALSA, the above can be either trivially simple or a tedious process of trial and error. I'm writing this up to:

  • Collect a list of sound cards on which the asoundrc provided with Emacspeak works as expected.
  • Invite the wider ALSA community, which likely has more insight into how these settings work, to discover and help flesh out this material.

For the above, "works as expected" means the following:

  • The TTS engine speaks without perceptible stuttering or other audio artifacts.
  • The engine starts and stops speech responsively, especially when typing fast at high speech rates.
  • The TTS engine does not interfere with other ALSA-enabled applications, e.g. mplayer.

At the end of this entry, you can find the relevant section of the asoundrc file from the Emacspeak distribution, with comments indicating which sound cards perform well. An example of a card that does not work well with these settings is the Audigy-LS from Creative; the TTS engine works on that card, but performance degrades:

  • mplayer cannot use the audio device (aplay and mpg321 are able to share the card with the TTS engine).
  • Speech does not stop immediately, as it does on the sound cards listed in the asoundrc file.

# $Id: asoundrc,v 1.3 2006/05/23 00:22:16 raman Exp $
# These numbers work on the following cards:
# aplay -l | head -n 1
# I82801DBICH4 [Intel 82801DB-ICH4] (IBM Thinkpads)
# ICH6 [Intel ICH6],

#  default device is a mixer

pcm.!default {
    type plug
    slave.pcm "dmixer"
}

pcm.dmixer  {
    type dmix
    ipc_key 1024
    slave {
        pcm "hw:0,0"
        format s16_LE
        period_time 0
        period_size 1024
        buffer_size 4096
        rate 44100
    }
    bindings {
        0 0
        1 1
    }
}
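
The period_size, buffer_size, and rate values above bear on how quickly speech can start and stop, since they determine how much audio is queued at a time. As a rough sketch, assuming latency is approximated simply as frames divided by the sample rate (actual behavior also depends on the driver and the application), the numbers work out as follows. The function name frames_to_ms is made up for this illustration.

```python
# Rough latency estimate for the dmix slave settings in the asoundrc above.
# Assumption: latency is approximated as frames / sample rate; the real
# figure also depends on the driver, the card, and the application.

def frames_to_ms(frames: int, rate_hz: int) -> float:
    """Convert a frame count to milliseconds of audio at the given rate."""
    return 1000.0 * frames / rate_hz

RATE = 44100         # "rate" in the slave block, in Hz
PERIOD_SIZE = 1024   # "period_size", in frames
BUFFER_SIZE = 4096   # "buffer_size", in frames

period_ms = frames_to_ms(PERIOD_SIZE, RATE)
buffer_ms = frames_to_ms(BUFFER_SIZE, RATE)
periods_per_buffer = BUFFER_SIZE // PERIOD_SIZE

print(f"period: {period_ms:.1f} ms")
print(f"buffer: {buffer_ms:.1f} ms ({periods_per_buffer} periods)")
```

With these settings a period holds about 23 ms of audio and the full buffer about 93 ms. On a card where speech is slow to stop, trying smaller values may be worthwhile, though whether a given card accepts them is a matter of trial and error.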