Wednesday, July 31, 2024

Emacspeak --- A Speech Odyssey

1. Dedication: To My Guiding Eyes Aster, Hubbell and Tilden

Aster Labrador
(2/15/1987—12/05/1999)
Hubbell Labrador
(12/21/1997—4/11/2011)
Tilden Labrador
(8/4/2009—9/3/2022)

2. Key Insights

  1. User interface is a means to an end.
  2. Open Source is essential for discovering new interaction paradigms.
  3. This is not mere idealism. Openness is a key enabler for creating user journeys that were not envisioned by a system’s designers.
  4. Emacs and TeX are good exemplars. They permit maximal freedom when seen from the viewpoint of user extensibility and creativity. TeX enabled Audio System For Technical Readings (AsTeR); Emacs enabled Emacspeak.
  5. Rapid, reliable task completion is the most important metric and trumps secondary items such as eye-candy — the latter only leads to bloat as evinced by the HTML Web.
  6. Having a well-identified problem when designing a system is paramount.
  7. Usability is important, but to matter, the system needs to be useful first.
  8. Ease of use by itself is often marketing hype.
  9. Useful systems are fun to learn and give back more than what you put in with respect to time and effort.
  10. A steep learning curve in and of itself is not to be feared — it can be fun to learn and gets you farther faster.
  11. True empowerment: Ensure that the user grows continuously.

3. Emacspeak — The Complete Audio Desktop

  1. Emacspeak, started in September 1994, was released as Open Source in April 1995.
  2. The goal was to create a system for daily use that doubled as a research work-bench for developing an auditory interface.
  3. Speech and auditory output would be treated as first-class citizens.
  4. The time felt right with respect to building a system that enabled eyes-free access to the emerging Web.
  5. Emacspeak At Twenty was published in September 2014 and traced the evolution of the project.
  6. Now, this article gives a bird's-eye overview of the last 10 years by loosely following the logical structure of the Turning Twenty paper.
  7. In the process, we identify the dreams that have come to pass as well as the expectations that have failed to materialize — both attributable to developments in the larger Internet eco-system.
  8. But never fear, though some of these may be superficially disappointing, they likely herald the nature of bigger and better things to come!
  9. As a proof-point, in 1994, I could not have imagined the impact that the world of Internet-centered computing and the accompanying information revolution would have on the state of information access.
  10. Conversely, I boldly (and incorrectly) predicted that the arrival of mobile devices and mainstream speech interfaces would herald the move to a Web of information where there would be a clean separation between application back-ends and various client-specific front-ends. See Specialized Browsers and The Web, The Way You Want. Distinguished Lecture Series, UW Oct 2007.
  11. The above still makes sense from the view of scalable software architecture. However, the rapid growth of the Web economy has also resulted in an even faster race to the bottom where applications continue to be built and re-built every few years for the next best thing: welcome to the write once, debug everywhere world all over again!
  12. Case in point: today we have smart phones, smart watches and smart speakers, but each of these requires targeted front-ends if one wishes to bring the riches of the Internet to them.
  13. So the larger the Web gets, the fewer devices it becomes available on — a classic downward spiral.

Share And Enjoy — The Best Is Yet To Come!

3.1. How To Read This Document

  1. I recommend reading the Turning Twenty paper to get a full overview.
  2. Then, read this paper a section at a time, while referring back to the parallel section in the Turning Twenty paper to understand how things have evolved.
  3. Make sure to skim or deep-dive into the references in both papers.

4. Using UNIX With Speech Output — 2024

  1. In 2024 UNIX equates mostly to various Linux distributions, and from the Emacspeak perspective, they are all made mostly equal.
  2. Variations do exist and running bleeding-edge distributions can come with issues, e.g., unstable versions of the underlying audio infrastructure.
  3. Yes, 30 years and counting, Linux Audio is still a work in progress though I hope Pipewire will be the last of these tidal shifts.
  4. Linux is moving to Wayland, and I expect that transition to be choppy.
  5. Native applications are mostly gone bar the shouting. In this context, where most users access things through a mainstream Web browser, Emacspeak users access everything through Emacs.
  6. The above when done right is hugely empowering; when done badly, it’s extremely limiting — see later sections of this paper on the continuing evolution of the Web.

5. Key Enabler — Emacs And Lisp Advice

  1. Advice in Emacs as implemented in advice is rock-solid.
  2. There is a newer nadvice that is part of Emacs that Emacspeak does not use.
  3. There are no plans to migrate to nadvice since that is a lot of busy work in my view and any such migration would be difficult to test for correctness.
  4. The classic advice package may be removed from Emacs at some point in the future, but never fear; it’ll be bundled with Emacspeak if that becomes necessary. This is a feature of Free Software and is a great example of what that Freedom entails.
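
For readers unfamiliar with advice, the core idea is that an existing command is wrapped so that extra behavior, such as spoken feedback, runs after it without editing the command's source. What follows is a minimal, language-neutral sketch in Python rather than Emacs Lisp. The names `advise_after`, `next_line` and `speak` are hypothetical and exist only for this illustration; they are not part of Emacs, the advice package, or Emacspeak.

```python
import functools

# Stand-in for a TTS queue; a real system would send text to a speech server.
spoken = []


def speak(text):
    """Hypothetical speech function: record what would be spoken."""
    spoken.append(text)


def advise_after(func, advice):
    """Return a wrapper that calls func, then hands its result to advice.

    The original function body is left untouched, which is the essence
    of the advice mechanism.
    """
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        advice(result)
        return result
    return wrapper


# A hypothetical editor command that knows nothing about speech.
lines = ["first line", "second line", "third line"]
cursor = {"row": 0}


def next_line():
    """Move the cursor down one line and return the text now under it."""
    cursor["row"] = min(cursor["row"] + 1, len(lines) - 1)
    return lines[cursor["row"]]


# Speech-enable the command by wrapping it; its source is not modified.
next_line = advise_after(next_line, speak)
```

After the last line, every call to `next_line()` both moves the cursor and queues the new line for speech, while the command itself remains unaware of the speech layer.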

6. Key Component — Text To Speech (TTS)

  1. Speech output — especially unencumbered text-to-speech — is just as much a challenge as it was 30 years ago.
  2. In the bigger picture, early instances of using TTS for voice assistants have driven the industry toward natural sounding voices.
  3. The above sounds attractive on the surface, but a price we have paid is the loss of fine-grained control over voice parameters, emotion, stress and other supra-linguistic features.
  4. I believe these to be essential for delivering good auditory interfaces and remain optimistic that these will indeed arrive in a future iteration of speech interaction.
  5. Things appear to be coming full circle: Emacspeak started with the hardware Dectalk; now, the Software Dectalk is increasingly becoming the primary choice on Linux (see this Readme for setup instructions).
  6. Viavoice Outloud from Voxin is still supported. However, you can no longer buy new licenses. If you have already purchased a license, it’ll continue to work.
  7. The Vocalizer voices that Voxin now sells do not work with Emacspeak.
  8. The other choice on Linux is ESpeak which will hopefully continue to be free — albeit of much lower quality.
  9. The future as ever is unpredictable and new voices may well show up — especially those powered by on-device Large Language Models (LLMs).
  10. On non-free platforms, there is usable TTS on the Mac, now supported by the new SwiftMac server for Emacspeak.

7. Emacspeak And Software Development

  1. Magit as a Git porcelain is perhaps the biggest leap forward with respect to software development.
  2. New completion frameworks such as company and consult come a close second in enhancing productivity.
  3. Completion strategies such as fuzzy and flex provide enhanced completion.
  4. Effective Suggest And Complete In An Eyes-free Environment explains the higher-level concept involved in defining such strategies.
  5. The ability to introspect code via eglot turns Emacs into a powerful and meaningful IDE — I say meaningful because this brings the best features of an integrated development environment while leaving behind the eye-candy that tends to bloat commercial IDEs.
  6. Packages like transient enable discoverable, rapid keyboard access to complex nested-menu driven interfaces.
  7. Ergonomic keybindings under X using xcape to minimize chording have been a significant win in the last two years.
  8. Jupyter is the generalization of IPython notebooks to Julia, Python and R. The news here isn’t all good; IPython notebooks are well-designed with respect to not getting locked into any given implementation. However, in practice, front-ends depend on Javascript in the browser.
  9. Consequently, Emacs packages for IPython Notebooks e.g., package ein, are no longer maintained.
  10. Developing in higher-level languages continues to be very well supported in Emacspeak.
  11. The re-emergence of Common Lisp in the last 20 years, thanks to asdf as a build tool and quicklisp as a network-aware package manager, has once again made Lisp development using Emacs Slime a productive experience.
  12. In 2022, I updated Audio System For Technical Readings (AsTeR), my PhD project from 1993, to run under SBCL with a freshly implemented Emacs front-end.
  13. So now I can listen to Math content just as well as I could 30 years ago!

8. Emacspeak And Authoring Documents

  1. Package org is to authoring as magit is to software development with respect to productivity gains.
  2. Org has existed since circa 2006 in my Emacs setup; but it continues to give and give plentifully.
  3. Where I once authored technical papers in LaTeX using auctex, used nxml for HTML, etc., I now mostly write everything in org-mode and export to the relevant target format.
  4. Integrating various search engines in Emacs makes authoring content extremely productive.
  5. Integrated access to spell-checking (flyspell), dictionaries, translation engines, and other language tools makes for a powerful authoring work-bench.
  6. Extending org-mode with custom link types enables smart note taking with hyperlinks to relevant portions of an audio stream — see article Learn Smarter By Taking Rich Hypertext Notes.
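
To make the idea of a link into an audio stream concrete, here is a small sketch in Python of how such a link might be decoded into a file and a playback offset. The `audio:PATH#MM:SS` syntax and the function name `parse_audio_link` are assumptions made up for this illustration; they are not the link format that org-mode or Emacspeak actually uses.

```python
def parse_audio_link(link):
    """Split a hypothetical 'audio:PATH#H:MM:SS' link into (path, seconds).

    The timestamp is optional; a link without one points at the start.
    """
    scheme, sep, rest = link.partition(":")
    if scheme != "audio" or not sep:
        raise ValueError("not an audio link: " + link)
    if "#" not in rest:
        return rest, 0
    path, _, stamp = rest.rpartition("#")
    seconds = 0
    # Fold 'H:MM:SS', 'MM:SS' or 'SS' into a single count of seconds.
    for field in stamp.split(":"):
        seconds = seconds * 60 + int(field)
    return path, seconds


# A note could then carry a link such as this one, which a media player
# front-end would open at 12 minutes 30 seconds into the recording.
path, offset = parse_audio_link("audio:lectures/talk.mp3#12:30")
```

Keeping the position inside the link itself means the note stays plain text while still pointing at an exact moment in the recording.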

9. Emacspeak And The Web In 2024

  1. Packages shr and eww arrived around 2014. But in 2024, they can be said to have truly landed.
  2. 2014 also marked the explicit take-over of the stewardship of the HTML Web by the browser vendors from the W3C — I say explicit — because the W3C had already thrown in the towel in the preceding decade.
  3. This has led to a Web of content created using the assembly language of divs, spans and Javascript under the flag of HTML5 — the result is a tangled web of spaghetti that everyone loves to hate.
  4. In this context, see Tag Soup, Scripts And Obfuscation: How The Web Was Broken for a good overview of HTML’s obesity problem.
  5. For better or worse, the investment in XML and display-independent content is now a complete write-off at least on the surface.
  6. So what next: wait for the spaghetti monster to show up for lunch? Humor aside, that monster may well be called AI, though whether today’s Web gives that monster life, indigestion, constipation, dysentery or hallucinations is a story to be written in the coming years.
  7. I say on the surface above because the welcome re-emergence of ATOM and RSS feeds is perhaps a silent acknowledgement that bloated Web pages are now unusable even for users who can see.
  8. Package elfeed has emerged as a powerful feed-manager for Emacs.
  9. Emacspeak implements RSS and ATOM support using XSLT; those features now shine brighter with mainstream news sites reviving their support for content feeds.
  10. Browsers like Mozilla now implement content filters, a euphemism for scraping off visual eye-candy and related cruft to reveal the underlying content. These are now available as plugins (see RDRView for an example). Emacspeak leverages this to make the Web more readable.
  11. Packages url-template and emacspeak-websearch continue to give in plenty, though they do require continuous updating.
  12. Web APIs come and go, so that space is in a state of constant change.
  13. The state of web applications is perhaps the most concerning from an Emacspeak perspective, and I do not see that changing in the short-term. There are no incentives for Web providers to free their applications from the tangled Web of spaghetti they have woven around themselves.
  14. But as with everything else in our industry, it is precisely when something feels completely entrenched that users rebel and innovations emerge to move us to the next phase — so fingers crossed.
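
The appeal of content feeds is easy to show: the headlines and links sit in a small, predictable XML structure, so no scraping is needed. The sketch below uses Python's standard `xml.etree.ElementTree` rather than the XSLT that Emacspeak uses, purely to illustrate the idea. The sample feed and the function name `feed_headlines` are made up for this example.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Atom syndication format.
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# A tiny, invented feed standing in for a real news site.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Feed</title>
  <entry>
    <title>First headline</title>
    <link href="http://example.com/1"/>
  </entry>
  <entry>
    <title>Second headline</title>
    <link href="http://example.com/2"/>
  </entry>
</feed>
"""


def feed_headlines(atom_xml):
    """Return a list of (title, link) pairs, one per entry in an Atom feed."""
    root = ET.fromstring(atom_xml)
    headlines = []
    for entry in root.findall("atom:entry", ATOM_NS):
        title = entry.findtext("atom:title", default="", namespaces=ATOM_NS)
        link = entry.find("atom:link", ATOM_NS)
        href = link.get("href", "") if link is not None else ""
        headlines.append((title.strip(), href))
    return headlines


headlines = feed_headlines(SAMPLE_FEED)
```

Each pair can be spoken as a headline and followed as a link, with none of the surrounding page furniture that an HTML front page would carry.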

10. Audio Formatting — Generalizing Aural CSS

  1. Audio formatting with Aural CSS support is stable, with new enhancements supporting more TTS engines.
  2. Support for parallel streams of TTS using separate outputs to left/right channels is a big win and enables more efficient interaction.
  3. Support for various Digital Signal Processing (DSP) filters enables rich auditory effects like binaural audio and spatial audio.
  4. Soundscapes implemented via package boodler make for a pleasant and relaxing auditory environment.
  5. Enabling virtual sound devices via Pipewire for 5.1 and 7.1 spatial audio significantly enhances the auditory experience.

11. Conversational Gestures For The Audio Desktop

  1. Parallel streams of audio, combined with more ergonomic keybindings, are the primary enhancement in this area.
  2. Parallel streams of speech, e.g., a separate notification stream on the left or right ear, help increase the band-width of communication.
  3. Notifications can thus be delivered without having to stop the primary speech output.

12. Accessing Media Streams

  1. Emacspeak support for rich multimedia is now much more robust.
  2. Emacs package empv is a powerful tool for locating, organizing and playing local and remote media streams ranging from music and audio books to radio stations and Podcasts.
  3. This makes media streams from a large number of providers ranging from the BBC to Youtube available via a consistent keyboard interface.
  4. This experience is augmented by a collection of smart content locators on the Emacspeak desktop; see the relevant blog article titled smart media selectors.

13. Electronic Books— Ubiquitous Access To Books

  1. Emacspeak modules for Epub and Bookshare continue to provide good books integration.
  2. There are smart book locators analogous to the locators for media content.
  3. Emacspeak speech-enables Calibre for working with local electronic libraries.

14. Leveraging Computational Tools — From SQL And R To IPython Notebooks

  1. This area continues to provide a rich collection of packages.
  2. Newer highlights include sage interaction for symbolic computation.
  3. Emacspeak speech-enables packages gptel and ellama for working with local and network LLMs.

15. Social Web — Mail, Messaging And Blogging

  1. This is a space that is definitely regressing.
  2. The previous decade was marked by open APIs to many social Web platforms.
  3. Over time these first regressed with respect to privacy.
  4. Then they turned into walled gardens in their own right.
  5. Finally, the Web APIs, other than the kind embedded in Javascript, have started disappearing.
  6. Looking back, the only social platform I now use is Blogger for hosting my Emacspeak Blog. It has a somewhat usable API, albeit guarded by a difficult to use OAuth interface that requires signing in via a mainstream browser.
  7. IMap continues to survive as an open email protocol, though its days may well be numbered.
  8. The die is already cast with respect to mere mortals being able to set up and host their email; witness the complexity in setting up the Emacspeak mailing list in 2023 vs 1993!
  9. This is an area that is likely to get worse before it gets better, thanks to the spammers — more’s the pity, since Internet Email is perhaps the single-most impactful technology with respect to leveling the communications playing field.
  10. The disappearance of APIs mentioned above also means that today the only usable chat service on an open platform like Emacspeak is the venerable Internet Relay Chat (IRC).

16. The RESTful Web — Web Wizards And URL Templates For Faster Access

  1. This area continues to thrive — either because of — or despite — the best and worst efforts of application providers on the Web.
  2. Twenty years on (this feature originally landed in 2000) Emacspeak has a far richer collection of filters, preprocessors and post-processors that enable ever-more powerful Web wizards. See the relevant chapter in the Emacspeak manual for the automatically updated list of URL Templates.
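
For readers who have not met URL templates, the underlying mechanism is simple: a RESTful URL with blank slots is filled in from the user's answers to a few prompts, and the result is fetched directly. The following Python sketch shows only that expansion step, under stated assumptions. The `%s` slot syntax, the example.com address and the function name `expand_url_template` are illustrative and are not taken from the Emacspeak sources.

```python
from urllib.parse import quote_plus


def expand_url_template(template, *answers):
    """Fill each '%s' slot in template with a URL-encoded answer, in order."""
    parts = template.split("%s")
    if len(parts) - 1 != len(answers):
        raise ValueError(
            "template has %d slots but %d answers were given"
            % (len(parts) - 1, len(answers))
        )
    url = parts[0]
    # Interleave the encoded answers with the fixed pieces of the template.
    for answer, rest in zip(answers, parts[1:]):
        url += quote_plus(answer) + rest
    return url


# A hypothetical search template with two slots: the query and the language.
url = expand_url_template(
    "https://www.example.com/search?q=%s&lang=%s", "audio desktop", "en"
)
```

Splitting on the slot marker, instead of using the `%` operator, keeps the sketch safe for templates that already contain percent-encoded characters.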

17. Mashing It Up — Leveraging AI And The Web

  1. Developing solutions by combining various API-based services on the Web has all but disappeared, unless one is willing to commit fully to the Javascript-powered Web hosted in a Web browser, something I hope I never have to accept.
  2. So for now, I’ll keep well away and count my blessings.
  3. The next chapter of the mash-up story may well be based around Generative AI using LLMs. In effect, LLMs trained on Web content define a platform for generating content mash-ups. The issue at present is that they are just as likely to produce meaningless mush — something that may get better as the field gets a handle on cleaning up Web content.
  4. Notice that we are now back to the previously unsolved problem of cleaning up the HTML Web; with LLMs, we’ll just have an order of magnitude more documents than the 2^W postulated by Beyond Web 2.0, Communications Of The ACM, 2009.

18. The Final Word — Donald E Knuth (DEK)

  • The best theory is inspired by practice. The best practice is inspired by theory.
  • The enjoyment of one’s tools is an essential ingredient of successful work.
  • Easy things are often amusing and relaxing, but their value soon fades. Greater pleasure, deeper satisfaction, and higher wages are associated with genuine accomplishments, with the successful fulfillment of a challenging task.
  • Computer Programming Is An Art.

The best example of the above is of course Knuth’s TeX — work that was motivated by his own dissatisfaction with the tools available to him at the time for typesetting his magnum opus — The Art Of Computer Programming (TAOCP). It is something I’ve looked up to ever since my time as a graduate student at Cornell.

The Emacspeak Speech Odyssey outlined in this paper is, in some small measure, my own personal experience of the sentiments he expresses.

–T. V. Raman, San Jose, CA, August 1, 2024.

Friday, May 03, 2024

Emacspeak 60.0 (DreamDog) Unleashed!

1. For Immediate Release:

San Jose, CA, (May 4, 2024)

1.1. Emacspeak 60.0 (DreamDog) Unleashed! 🦮

— Innate Intelligence (II)™ Makes Accessible Computing a dream!

Advancing Accessibility In The Age Of User-Aware Interfaces — Zero cost of Ownership makes priceless software Universally affordable!

Emacspeak Inc (NASDOG: ESPK), http://github.com/tvraman/emacspeak, announces immediate world-wide availability of Emacspeak 60.0 (DreamDog) 🦮, a non-LLM powered and innately intelligent audio desktop that leverages today's evolving Data, Social and Assistant-Oriented Internet cloud to enable working efficiently and effectively from anywhere!

2. Investors Note:

With several prominent tweeters (and mythical elephants) expanding coverage of #emacspeak, NASDOG: ESPK has now been consistently trading over the social net at levels close to that once attained by DogCom high-fliers—and is trading at levels close to that achieved by once better known stocks in the tech sector.

3. What Is It?

Emacspeak is a fully functional audio desktop that provides complete eyes-free access to all major 32 and 64 bit operating environments. By seamlessly blending live access to all aspects of the Internet such as ubiquitous assistance, Web-surfing, blogging, remote software development, streaming media, social computing, AI-Tools and electronic messaging into the audio desktop, Emacspeak enables spoken access to local and remote information with a consistent and well-integrated user interface. A rich suite of task-oriented tools provides efficient speech-enabled access to the evolving assistant-oriented social Internet cloud.

3.1. Major Enhancements:

  1. EMPV: Emacspeak And MPV 📺🔈
  2. Updated Keybindings ⌨
  3. BBC Sounds: Search, Play and Download 📻
  4. Supports HTML5 Audio/Video Tag In EWW 🕷🔈
  5. Extract readable views in EWW using RDRView if available 🦊
  6. Piper: Neural-Net TTS 🎺
  7. Speech-Enabled LLM Front-Ends, ellama and gptel. 🧙
  8. Pipewire Support 🔊
  9. New Mac Speech Server: SwiftMac
  10. Tree-Sitter Support 🎄
  11. Smart Media Selector 🗄
  12. Smart EBook Selector 📚
  13. Speech-Enable EBuku 🔖

And a lot more than will fit this margin. … 🗞

Note: This version requires emacs-29.1 or later.

Announcing Emacspeak 60.0—DreamDog!

Tilden

To express oneself well is impactful, but only when one has something impactful to express! (TVR on Conversational Interfaces)

4. Establishing Liberty, Equality And Freedom:

Never a toy system, Emacspeak is voluntarily bundled with all major Linux distributions. Though designed to be modular, distributors have freely chosen to bundle the fully integrated system without any undue pressure—a documented success for the integrated innovation embodied by Emacspeak. As the system evolves, both upgrades and downgrades continue to be available at the same zero-cost to all users. The integrity of the Emacspeak codebase is ensured by the reliable and secure Linux platform and the underlying GIT versioning software used to develop and distribute the system.

Extensive studies have shown that thanks to these features, users consider Emacspeak to be absolutely priceless. Thanks to this wide-spread user demand, the present version remains free of cost as ever—it is being made available at the same zero-cost as previous releases.

At the same time, Emacspeak continues to innovate in the area of eyes-free Assistance and social interaction and carries forward the well-established Open Source tradition of introducing user interface features that eventually show up in luser environments.

On this theme, when once challenged by a proponent of a crash-prone but well-marketed mousetrap with the assertion "Emacs is a system from the 70's", the creator of Emacspeak evinced surprise at the unusual candor manifest in the assertion that it would take popular idiot-proven interfaces until the year 2070 to catch up to where the Emacspeak audio desktop is today. Industry experts welcomed this refreshing breath of Courage Certainty and Clarity (CCC) at a time when users are reeling from the Fear Uncertainty and Doubt (FUD) unleashed by complex software systems backed by even more convoluted press releases.

5. Independent Test Results:

Independent test results have proven that unlike some modern (and not so modern) software, Emacspeak can be safely uninstalled without adversely affecting the continued performance of the computer. These same tests also revealed that once uninstalled, the user stopped functioning altogether. Speaking with Aster Labrador, the creator of Emacspeak once pointed out that these results re-emphasize the user-centric design of Emacspeak: “It is the user, and not the computer, that stops functioning when Emacspeak is uninstalled!”

5.1. Note from Aster, Bubbles and Tilden:

UnDoctored Videos Inc. is looking for volunteers to star in a video demonstrating such complete user failure.

6. Obtaining Emacspeak:

Emacspeak can be downloaded from GitHub; see https://github.com/tvraman/emacspeak. You can visit Emacspeak on the WWW at http://emacspeak.sf.net. You can subscribe to the emacspeak mailing list at emacspeak@emacspeak.net. The Emacspeak Blog is a good source for news about recent enhancements and how to use them.

The latest development snapshot of Emacspeak is always available at GitHub.

7. History:

  • Innately intelligent, Emacspeak 60 delivers a more idealized auditory environment.
  • Emacspeak 59 delivers better ergonomics by minimizing the need for chording, but sadly, with no dog to guide its way.
  • Emacspeak 58 delivers better ergonomics by minimizing the need for chording.
  • Emacspeak 57.0 is named in honor of Tilden Labrador.
  • Emacspeak 56.0 (AgileDog) belies its age to be as agile as Tilden.
  • Emacspeak 55.0 (CalmDog) attempts to be as calm as Tilden.
  • Emacspeak 54.0 (EZDog) learns to take it easy from Tilden.
  • Emacspeak 53.0 (EfficientDog) focuses on efficiency.
  • Emacspeak 52.0 (WorkAtHomeDog) makes working remotely a pleasurable experience.
  • Bigger and more powerful than any smart assistAnt, AssistDog provides instant access to the most relevant information at all times.
  • Emacspeak 50.0 (SageDog) embraces the wisdom of stability as opposed to rapid change and the concomitant creation of bugs. 🚭: Naturally Intelligent (NI)™ at how information is spoken, Emacspeak is entirely free of Artificial Ingredients (AI)™.
  • Emacspeak 49.0 (WiseDog) leverages the wisdom gleaned from earlier releases to provide an enhanced auditory experience.
  • Emacspeak 48.0 (ServiceDog) builds on earlier releases to provide continued end-user value.
  • Emacspeak 47.0 (GentleDog) goes the next step in being helpful while letting users learn and grow.
  • Emacspeak 46.0 (HelpfulDog) heralds the coming of Smart Assistants.
  • Emacspeak 45.0 (IdealDog) is named in recognition of Emacs' excellent integration with various programming language environments — thanks to this, Emacspeak is the IDE of choice for eyes-free software engineering.
  • Emacspeak 44.0 continues the steady pace of innovation on the audio desktop.
  • Emacspeak 43.0 brings even more end-user efficiency by leveraging the ability to spatially place multiple audio streams to provide timely auditory feedback.
  • Emacspeak 42.0 while moving to GitHub from Google Code continues to innovate in the areas of auditory user interfaces and efficient, light-weight Internet access.
  • Emacspeak 41.0 continues to improve on the desire to provide not just equal, but superior access — technology when correctly implemented can significantly enhance the human ability.
  • Emacspeak 40.0 goes back to Web basics by enabling efficient access to large amounts of readable Web content.
  • Emacspeak 39.0 continues the Emacspeak tradition of increasing the breadth of user tasks that are covered without introducing unnecessary bloatware.
  • Emacspeak 38.0 is the latest in a series of award-winning releases from Emacspeak Inc.
  • Emacspeak 37.0 continues the tradition of delivering robust software as reflected by its code-name.
  • Emacspeak 36.0 enhances the audio desktop with many new tools including full EPub support — hence the name EPubDog.
  • Emacspeak 35.0 is all about teaching a new dog old tricks, and is aptly code-named HeadDog in honor of our new Press/Analyst contact.
  • Emacspeak-34.0 (AKA Bubbles) established a new beach-head with respect to rapid task completion in an eyes-free environment.
  • Emacspeak-33.0 AKA StarDog brings unparalleled cloud access to the audio desktop.
  • Emacspeak 32.0 AKA LuckyDog continues to innovate via open technologies for better access.
  • Emacspeak 31.0 AKA TweetDog — adds tweeting to the Emacspeak desktop.
  • Emacspeak 30.0 AKA SocialDog brings the Social Web to the audio desktop—you can't but be social if you speak!
  • Emacspeak 29.0 (AKA AbleDog) is a testament to the resilience and innovation embodied by Open Source software; it would not exist without the thriving Emacs community that continues to ensure that Emacs remains one of the premier user environments despite perhaps also being one of the oldest.
  • Emacspeak 28.0—AKA PuppyDog—exemplifies the rapid pace of development evinced by Open Source software.
  • Emacspeak 27.0—AKA FastDog—is the latest in a sequence of upgrades that make previous releases obsolete and downgrades unnecessary.
  • Emacspeak 26—AKA LeadDog—continues the tradition of introducing innovative access solutions that are unfettered by the constraints inherent in traditional adaptive technologies.
  • Emacspeak 25 —AKA ActiveDog —re-activates open, unfettered access to online information.
  • Emacspeak-Alive —AKA LiveDog —enlivens open, unfettered information access with a series of live updates that once again demonstrate the power and agility of open source software development.
  • Emacspeak 23.0 — AKA Retriever—went the extra mile in fetching full access.
  • Emacspeak 22.0 —AKA GuideDog —helps users navigate the Web more effectively than ever before.
  • Emacspeak 21.0 —AKA PlayDog —continued the Emacspeak tradition of relying on enhanced productivity to liberate users.
  • Emacspeak-20.0 —AKA LeapDog —continues the long established GNU/Emacs tradition of integrated innovation to create a pleasurable computing environment for eyes-free interaction.
  • emacspeak-19.0 –AKA WorkDog– is designed to enhance user productivity at work and leisure.
  • Emacspeak-18.0 –code named GoodDog– continued the Emacspeak tradition of enhancing user productivity and thereby reducing total cost of ownership.
  • Emacspeak-17.0 –code named HappyDog– enhances user productivity by exploiting today's evolving WWW standards.
  • Emacspeak-16.0 –code named CleverDog– the follow-up to SmartDog– continued the tradition of working better, faster, smarter.
  • Emacspeak-15.0 –code named SmartDog–followed up on TopDog as the next in a continuing series of award-winning audio desktop releases from Emacspeak Inc.
  • Emacspeak-14.0 –code named TopDog– was the first release of this millennium.
  • Emacspeak-13.0 –codenamed YellowLab– was the closing release of the 20th. century.
  • Emacspeak-12.0 –code named GoldenDog– began leveraging the evolving semantic WWW to provide task-oriented speech access to Webformation.
  • Emacspeak-11.0 –code named Aster– went the final step in making Linux a zero-cost Internet access solution for blind and visually impaired users.
  • Emacspeak-10.0 –(AKA Emacspeak-2000) code named WonderDog– continued the tradition of award-winning software releases designed to make eyes-free computing a productive and pleasurable experience.
  • Emacspeak-9.0 –(AKA Emacspeak 99) code named BlackLab– continued to innovate in the areas of speech interaction and interactive accessibility.
  • Emacspeak-8.0 –(AKA Emacspeak-98++) code named BlackDog– was a major upgrade to the speech output extension to Emacs.
  • Emacspeak-95 (code named Illinois) was released as OpenSource on the Internet in May 1995 as the first complete speech interface to UNIX workstations. The subsequent release, Emacspeak-96 (code named Egypt) made available in May 1996 provided significant enhancements to the interface. Emacspeak-97 (Tennessee) went further in providing a true audio desktop. Emacspeak-98 integrated Internetworking into all aspects of the audio desktop to provide the first fully interactive speech-enabled WebTop.

8. About Emacspeak:

Originally based at Cornell (NY) — http://www.cs.cornell.edu/home/raman —home to Auditory User Interfaces (AUI) on the WWW, Emacspeak is now maintained on GitHub —https://github.com/tvraman/emacspeak. The system is mirrored world-wide by an international network of software archives and bundled voluntarily with all major Linux distributions. On Monday, April 12, 1999, Emacspeak became part of the Smithsonian's Permanent Research Collection on Information Technology at the Smithsonian's National Museum of American History.

The Emacspeak mailing list is archived at Emacspeak Mail Archive –the home of the Emacspeak mailing list– thanks to Greg Priest-Dorman, and provides a valuable knowledge base for new users.

9. Press/Analyst Contact: Tilden Labrador

Going forward, Aster, Hubbell and Tilden acknowledge their exclusive monopoly on setting the direction of the Emacspeak Audio Desktop (🦮) and promise to exercise their freedom to innovate and her resulting power responsibly (as before) in the interest of all dogs.

*About This Release:


Windows-Free (WF) is a favorite battle-cry of The League Against Forced Fenestration (LAFF). –see http://www.usdoj.gov/atr/cases/f3800/msjudgex.htm for details on the ill-effects of Forced Fenestration.

CopyWrite )C( Aster, Hubbell and Tilden Labrador. All Writes Reserved. HeadDog (DM), LiveDog (DM), GoldenDog (DM), BlackDog (DM) etc., are Registered Dogmarks of Aster, Hubbell and Tilden Labrador. All other dogs belong to their respective owners.

Announcing Emacspeak 60.0 (DreamDog)

Tuesday, March 19, 2024

Updated: Smart Media Selector For The Audio Desktop

Smart Media Selector For The Audio Desktop

1. Overview

I have over 60GB of audio content on my laptop, spread across 755 subdirectories in over 9,100 files. I also have many Internet stream shortcuts that I listen to on a regular basis.

This blog article outlines the media selector implementation in Emacspeak and shows how a small amount of Lisp code built atop Emacs' built-in affordances of completion provides a light-weight yet efficient interface. Notice that the implementation does not involve fancy things like SQL databases, MP3 tags that one needs to update etc.; the solution relies on the speed of today's laptops, especially given the speed of disk access.

2. User Experience

As I type this up, the set of requirements as expressed in English is far more verbose (and likely more complicated) than its expression in Lisp!

2.1. Pre-requisites for content selection and playback

  1. Launch either MPV (via package empv.el) or mplayer via Emacspeak's emacspeak-mplayer with a few keystrokes.
  2. Media selection uses ido with fuzzy matching.
  3. Choices are filtered incrementally for efficient eyes-free interaction; see the relevant blog article on Search, Input, Filter, Target for additional background.
  4. Content can be filtered using the directory structure, where directories conceptually equate to music albums, audio books or other logical content groups. Once selected, a directory and its contents are played as a conceptual play-list.
  5. Searching and filtering can also occur across the list of all 9,100+ media files spread across 700+ directories.
  6. Starting point of the SIFT process should be influenced by one's current context, e.g., default-directory.
  7. Each step of this process should have reasonable fallbacks.
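
As a rough illustration of the "ido with fuzzy matching" requirement, here is a minimal sketch that uses only Emacs' bundled ido library. The function name my-media-choose is hypothetical and is not part of Emacspeak; how Emacspeak itself wires up ido may well differ.

```elisp
(require 'ido)

;; With flex matching on, input that matches no item as a substring can
;; still match any item containing the typed characters in the same order,
;; e.g. "bth" can select "beethoven".
(setq ido-enable-flex-matching t)

(defun my-media-choose (choices)
  "Pick one of CHOICES, a list of strings, using ido.
Hypothetical helper used only for this sketch."
  (ido-completing-read "Media: " choices nil t))
```

Calling (my-media-choose '("beethoven" "bach" "brahms")) prompts in the minibuffer and narrows the candidates incrementally as you type.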

3. Mapping Design To Implementation

  1. The starting directory (the context) is selected by function emacspeak-media-guess-directory.
    1. If default directory matches emacspeak-media-directory-regexp, use it.
    2. If default directory contains media files, then use it.
    3. If default directory contains directory emacspeak-media, then use it.
    4. Otherwise use emacspeak-media-shortcuts as the fallback.
  2. Once we have selected the context, function emacspeak-media-read-resource uses ido-style interaction with fuzzy matching to pick the file to play.
  3. That function uses Emacs' built-in directory-files-recursively to build the collection to hand off to completing-read; when called with a prefix argument, it instead uses the Emacspeak-provided function ems--subdirs-recursively to build the list of 755+ sub-directories that live under $XDG_MUSIC_DIR.

4. Resulting Experience

  1. I can pick the media to play with a few keystrokes.
  2. I use Emacs' repeat-mode to advantage: once content is playing, I can quickly change volume etc. before going back to work.
  3. There's no media-player UI to get in my way while working, but I can stop playing media with a single keystroke.
  4. Most importantly, I don't have to tag media, maintain databases or do other busy work to be able to launch the media that I want!
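
To make the repeat-mode point concrete, here is a minimal sketch of how a pair of volume commands can be made repeatable. The commands, the variable my-media-volume and the key choices are all hypothetical stand-ins; the actual Emacspeak bindings are not shown in this article. repeat-mode requires Emacs 28 or later.

```elisp
(defvar my-media-volume 50
  "Hypothetical volume level between 0 and 100.")

(defun my-media-volume-up ()
  "Raise the hypothetical volume by 5, capped at 100."
  (interactive)
  (setq my-media-volume (min 100 (+ my-media-volume 5)))
  (message "Volume %d" my-media-volume))

(defun my-media-volume-down ()
  "Lower the hypothetical volume by 5, floored at 0."
  (interactive)
  (setq my-media-volume (max 0 (- my-media-volume 5)))
  (message "Volume %d" my-media-volume))

(defvar my-media-volume-repeat-map
  (let ((map (make-sparse-keymap)))
    (define-key map (kbd "+") #'my-media-volume-up)
    (define-key map (kbd "-") #'my-media-volume-down)
    map)
  "Keys that remain active right after a volume command.")

;; repeat-mode looks up the `repeat-map' property of the command just run
;; and keeps that keymap active for the next keystroke.
(put 'my-media-volume-up 'repeat-map 'my-media-volume-repeat-map)
(put 'my-media-volume-down 'repeat-map 'my-media-volume-repeat-map)

(repeat-mode 1)
```

After invoking either command once by whatever key it is bound to, pressing + or - repeatedly keeps adjusting the volume; any other key ends the repeat sequence and returns you to normal editing.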

5. The Lisp Code

The hyperlinks to the Emacspeak code-base are the source of truth. I'll include a snapshot of the functions mentioned above for completeness.

5.1. Guess Context

(defun emacspeak-media-guess-directory ()
  "Guess media directory.
1. If default directory matches emacspeak-media-directory-regexp, use it.
2. If default directory contains media files, then use it.
3. If default directory contains directory emacspeak-media, then use it.
4. Otherwise use emacspeak-media-shortcuts as the fallback."
  (cl-declare (special emacspeak-media-directory-regexp
                       emacspeak-media emacspeak-m-player-hotkey-p))
  (let ((case-fold-search t))
    (cond
     ((or (eq major-mode 'dired-mode) (eq major-mode 'locate-mode)) nil)
     (emacspeak-m-player-hotkey-p   emacspeak-media-shortcuts)
     ((or                               ;  dir  contains media:
       (string-match emacspeak-media-directory-regexp default-directory)
       (directory-files default-directory   nil emacspeak-media-extensions))
      default-directory)
     ((file-in-directory-p emacspeak-media default-directory) emacspeak-media)
     (t   emacspeak-media-shortcuts))))
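
The fallback order is easier to try out when the inputs are explicit. The following is a simplified, standalone sketch of the same chain, not the Emacspeak function itself: the name my-media-guess, its three arguments and the two regexps are hypothetical stand-ins, and the real values of the Emacspeak variables may differ.

```elisp
;; Hypothetical stand-ins for emacspeak-media-directory-regexp and
;; emacspeak-media-extensions; the real values may differ.
(defvar my-media-directory-regexp (regexp-opt '("mp3" "audio" "music") t)
  "Regexp matching directory names that look like media directories.")

(defvar my-media-extensions "\\.\\(mp3\\|ogg\\|flac\\|m4a\\)\\'"
  "Regexp matching media file names.")

(defun my-media-guess (dir media-root shortcuts)
  "Return the directory from which media selection should start.
DIR plays the role of `default-directory'.  MEDIA-ROOT and SHORTCUTS
stand in for the media library and the stream-shortcuts directory."
  (let ((case-fold-search t))
    (cond
     ;; 1. The directory name itself looks like a media directory.
     ((string-match-p my-media-directory-regexp dir) dir)
     ;; 2. The directory directly contains at least one media file.
     ((directory-files dir nil my-media-extensions) dir)
     ;; 3. The media library lives somewhere below DIR.
     ((file-in-directory-p media-root dir) media-root)
     ;; 4. Nothing matched: fall back to the shortcuts directory.
     (t shortcuts))))
```

For example, (my-media-guess "/home/me/Music/" "/lib" "/shortcuts") stops at the first clause and returns "/home/me/Music/" without touching the file system.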

5.2. Read Resource

(defun emacspeak-media-read-resource (&optional prefix)
  "Read resource from minibuffer.
If a dynamic playlist exists, just use it."
  (cl-declare (special emacspeak-media-dynamic-playlist
                       emacspeak-m-player-hotkey-p))
  (cond
   (emacspeak-media-dynamic-playlist nil) ; do nothing if dynamic playlist
   (emacspeak-m-player-hotkey-p (emacspeak-media-local-resource prefix))
   (t                               ; not hotkey, not dynamic playlist
    (let* ((completion-ignore-case t)
           (read-file-name-completion-ignore-case t)
           (filename
            (when (memq major-mode '(dired-mode locate-mode))
              (dired-get-filename 'local 'no-error)))
           (dir (emacspeak-media-guess-directory))
           (collection
            (or
             filename                   ; short-circuit expensive call
             (if prefix
                 (ems--subdirs-recursively  dir) ;list dirs
               (directory-files-recursively dir emacspeak-media-extensions)))))
      (or filename (completing-read "Media: "  collection))))))

5.3. Helper: Recursive List Of Sub-directories

;;; Helpers: subdirs

(defconst ems--subdirs-filter
  (eval-when-compile
    (concat (regexp-opt '("/.." "/." "/.git")) "$"))
  "Pattern to filter out dirs during traversal.")

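(defsubst ems--subdirs (d)
  "Return list of subdirs in directory D."
  ;; Exclude "." and ".." by name; they are not guaranteed to sort first.
  (cl-remove-if-not
   #'file-directory-p
   (directory-files d 'full directory-files-no-dot-files-regexp)))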

(defun ems--subdirs-recursively (d)
  "Recursive list of  subdirs"
  (cl-declare (special ems--subdirs-filter))
  (let ((result (list d))
        (subdirs (ems--subdirs d)))
    (cond
     ((string-match ems--subdirs-filter d) nil)                              ; pass
     (t
      (cl-loop
       for dir in subdirs
       if (not (string-match ems--subdirs-filter dir)) do
       (setq result  (nconc result (ems--subdirs-recursively dir))))))
    result))
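
For comparison, here is a sketch of a similar directory listing built only from Emacs built-ins. It is an alternative for illustration, not the Emacspeak implementation: the name my-subdirs-recursively is hypothetical, the PREDICATE argument of directory-files-recursively needs Emacs 27 or later, and the order of the returned list differs from that of ems--subdirs-recursively.

```elisp
(require 'seq)

(defun my-subdirs-recursively (dir)
  "Return DIR followed by every sub-directory beneath it.
Directories named .git are neither listed nor descended into."
  (let ((keep (lambda (d)
                (not (string= ".git"
                              (file-name-nondirectory
                               (directory-file-name d)))))))
    (cons dir
          (seq-filter
           ;; The traversal returns files too, so keep directories only,
           ;; and drop .git itself, which the traversal still lists.
           (lambda (f) (and (file-directory-p f) (funcall keep f)))
           ;; The empty regexp matches every name; the third argument asks
           ;; for directories to be included; KEEP decides which
           ;; sub-directories are descended into.
           (directory-files-recursively dir "" t keep)))))
```

The result can be handed straight to completing-read as the collection, in the same way the prefix-argument branch of emacspeak-media-read-resource uses its list of sub-directories.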


Monday, March 18, 2024

Smart Media Selector For The Emacspeak Audio Desktop

Smart Media Selector For The Audio Desktop

1. Overview

I have over 60GB of audio content on my laptop, spread across 755 subdirectories in over 9,100 files. I also have many Internet stream shortcuts that I listen to on a regular basis.

This blog article outlines the media selector implementation in Emacspeak and shows how a small amount of Lisp code built atop Emacs' built-in affordances of completion provides a light-weight yet efficient interface. Notice that the implementation does not involve fancy things like SQL databases, MP3 tags that one needs to update etc.; the solution relies on the speed of today's laptops, especially given the speed of disk access.

2. User Experience

As I type this up, the set of requirements as expressed in English is far more verbose (and likely more complicated) than its expression in Lisp!

2.1. Pre-requisites for content selection and playback

  1. Launch either MPV (via package empv.el) or mplayer via Emacspeak's emacspeak-mplayer with a few keystrokes.
  2. Media selection uses ido with fuzzy matching.
  3. Choices are filtered incrementally for efficient eyes-free interaction; see the relevant blog article on Search, Input, Filter, Target for additional background.
  4. Content can be filtered using the directory structure, where directories conceptually equate to music albums, audio books or other logical content groups. Once selected, a directory and its contents are played as a conceptual play-list.
  5. Searching and filtering can also occur across the list of all 9,100+ media files spread across 700+ directories.
  6. Starting point of the SIFT process should be influenced by one's current context, e.g., default-directory.
  7. Each step of this process should have reasonable fallbacks.

3. Mapping Design To Implementation

  1. The starting directory (the context) is selected by function emacspeak-media-guess-directory.
    1. If default directory matches emacspeak-media-directory-regexp, use it.
    2. If default directory contains media files, then use it.
    3. If default directory contains directory emacspeak-media, then use it.
    4. Otherwise use emacspeak-media-shortcuts as the fallback.
  2. Once we have selected the context, function emacspeak-media-read-resource uses ido-style interaction with fuzzy matching to pick the file to play.
  3. That function uses Emacs' built-in directory-files-recursively to build the collection to hand off to completing-read; when called with a prefix argument, it instead uses the Emacspeak-provided function ems--subdirs-recursively to build the list of 755+ sub-directories that live under $XDG_MUSIC_DIR.

4. Resulting Experience

  1. I can pick the media to play with a few keystrokes.
  2. I use Emacs' repeat-mode to advantage: once content is playing, I can quickly change volume etc. before going back to work.
  3. There's no media-player UI to get in my way while working, but I can stop playing media with a single keystroke.
  4. Most importantly, I don't have to tag media, maintain databases or do other busy work to be able to launch the media that I want!

5. The Lisp Code

The hyperlinks to the Emacspeak code-base are the source of truth. I'll include a snapshot of the functions mentioned above for completeness.

5.1. Guess Context

(defun emacspeak-media-guess-directory ()
  "Guess media directory.
1. If default directory matches emacspeak-media-directory-regexp, use it.
2. If default directory contains media files, then use it.
3. If default directory contains directory emacspeak-media, then use it.
4. Otherwise use emacspeak-media-shortcuts as the fallback."
  (cl-declare (special emacspeak-media-directory-regexp
                       emacspeak-media emacspeak-m-player-hotkey-p))
  (let ((case-fold-search t))
    (cond
     ((or (eq major-mode 'dired-mode) (eq major-mode 'locate-mode)) nil)
     (emacspeak-m-player-hotkey-p   emacspeak-media-shortcuts)
     ((or                               ;  dir  contains media:
       (string-match emacspeak-media-directory-regexp default-directory)
       (directory-files default-directory   nil emacspeak-media-extensions))
      default-directory)
     ((file-in-directory-p emacspeak-media default-directory) emacspeak-media)
     (t   emacspeak-media-shortcuts))))

5.2. Read Resource

(defun emacspeak-media-read-resource (&optional prefix)
  "Read resource from minibuffer.
If a dynamic playlist exists, just use it."
  (cl-declare (special emacspeak-media-dynamic-playlist
                       emacspeak-m-player-hotkey-p))
  (cond
   (emacspeak-media-dynamic-playlist nil) ; do nothing if dynamic playlist
   (emacspeak-m-player-hotkey-p (emacspeak-media-local-resource prefix))
   (t                               ; not hotkey, not dynamic playlist
    (let* ((completion-ignore-case t)
           (read-file-name-completion-ignore-case t)
           (filename
            (when (memq major-mode '(dired-mode locate-mode))
              (dired-get-filename 'local 'no-error)))
           (dir (emacspeak-media-guess-directory))
           (collection
            (or
             filename                   ; short-circuit expensive call
             (if prefix
                 (ems--subdirs-recursively  dir) ;list dirs
               (directory-files-recursively dir emacspeak-media-extensions)))))
      (or filename (completing-read "Media: "  collection))))))

5.3. Helper: Recursive List Of Sub-directories

;;; Helpers: subdirs

(defconst ems--subdirs-filter
  (eval-when-compile
    (concat (regexp-opt '("/.." "/." "/.git")) "$"))
  "Pattern to filter out dirs during traversal.")

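(defsubst ems--subdirs (d)
  "Return list of subdirs in directory D."
  ;; Exclude "." and ".." by name; they are not guaranteed to sort first.
  (cl-remove-if-not
   #'file-directory-p
   (directory-files d 'full directory-files-no-dot-files-regexp)))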

(defun ems--subdirs-recursively (d)
  "Recursive list of  subdirs"
  (cl-declare (special ems--subdirs-filter))
  (let ((result (list d))
        (subdirs (ems--subdirs d)))
    (cond
     ((string-match ems--subdirs-filter d) nil)                              ; pass
     (t
      (cl-loop
       for dir in subdirs
       if (not (string-match ems--subdirs-filter dir)) do
       (setq result  (nconc result (ems--subdirs-recursively dir))))))
    result))