Extracting A Light-Weight Key-Sequence Reader By Progressive Simplification
1 Background
In the previous article entitled On Defining Keys In Emacs I covered
the issue of declaring key-sequences when defining keyboard
short-cuts. During the last week, I took the underlying Emacs Lisp
function edmacro-parse-keys and, via a process of progressive
simplification, derived a new function, new-kbd, that is much simpler
and consequently easier to understand. You can see the step-by-step
simplification via the Git history for file new-kbd.el (note: used to
be called ems-kbd.el). That file contains the final version of the
simplified function, along with a test-suite that verifies that its
behavior is consistent with the solution built into Emacs. The updated
function is now part of Emacspeak and is named ems-kbd in that
package.
The next section gives a high-level overview of the steps that led to the final version.
2 Steps Toward Simplification
2.1 Separate Tokenization From Processing
Function edmacro-parse-keys
interleaves tokenizing its input string with processing the
resulting tokens, all within a single while loop.
The first step in simplification was to separate these two concerns by
using function split-string to split the input string on
whitespace, yielding a list of words.
A simple cl-loop then turns each word into a key that
is accumulated into a result vector.
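The separation described above can be sketched roughly as follows. This is a minimal illustration rather than the code from new-kbd.el: the function name my-key-words->vector and its convert argument are hypothetical, standing in for whatever per-word processing is plugged in.

```elisp
(require 'cl-lib)

(defun my-key-words->vector (string convert)
  "Split STRING on whitespace and apply CONVERT to each word.
CONVERT is a hypothetical helper that returns a list of events for
one word.  The events are accumulated, in order, into one vector."
  (let ((res []))
    (cl-loop
     for word in (split-string string)   ; tokenization happens here, once
     do (setq res (vconcat res (funcall convert word))))
    res))

;; Usage: treat every word as a plain sequence of characters.
;; Runs of whitespace between words are discarded by split-string.
(my-key-words->vector "ab  c" #'string-to-list)
```

With tokenization out of the way, the per-word logic no longer needs to track a position within the input string.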
2.2 Refactoring Case Analysis
Once tokenization is factored out,
the remainder of function edmacro-parse-keys
converts each key-specification into either the corresponding string
or vector representation.
The original requirement of parsing the serialization of keyboard-macros brought along additional logic that I first eliminated, since my goal was to create a function to be used in defining keyboard-shortcuts.
- I eliminated code that handled invocation of M-x execute-extended-command during a keyboard-macro.
- I eliminated processing of comments within the keyboard-macro serialization.
2.3 Rearranging Conditionals
Next, I rearranged conditionals and in that process eliminated cond
clauses that were now effectively dead code.
Along the way, I also eliminated test-predicates that had side-effects, which should make the code less fragile.
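One way to read this step is sketched below. The function my-strip-caret is hypothetical and is not taken from new-kbd.el; it only illustrates keeping a cond test free of side effects, so that any change to the word happens in the clause body.

```elisp
(defun my-strip-caret (word)
  "Return (CONTROLP . REST) for WORD, where ^x stands for control-x.
The cond test is a pure predicate; the word is only changed in the
body of the clause that matched."
  (cond
   ((string-match-p "^\\^.$" word)   ; test only: match data is left alone
    (cons t (substring word 1)))     ; the effect lives in the clause body
   (t (cons nil word))))

(my-strip-caret "^a")   ; control flag set, rest is "a"
(my-strip-caret "ab")   ; no control flag, word unchanged
```

Because no test modifies the word, the clauses can be reordered or removed without changing what later clauses see.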
2.4 Lexically Bind Regex Patterns
To improve readability, I created let-bindings for some of the regex patterns used to identify key-sequence patterns. In the process, I also made these more readable by using the [:space:] character class for white-space tests.
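As a rough sketch of the idea, the patterns can be named once and then used by name in the conditional. The function my-classify-key-word, the blank-reg pattern, and the category symbols are hypothetical; the other two patterns are the ones that appear in new-kbd. Note that in an Emacs regex the character class is written inside a bracket expression, as [[:space:]].

```elisp
(defun my-classify-key-word (word)
  "Return a symbol describing the shape of WORD, a single key token.
The category names are hypothetical and serve only as illustration."
  (let ((case-fold-search nil)          ; RET and ret must not be confused
        (special-char-reg "^\\(NUL\\|RET\\|LFD\\|ESC\\|SPC\\|DEL\\)$")
        (modifier+angle-reg "^\\(\\([ACHMsS]-\\)*\\)<\\(.+\\)>$")
        (blank-reg "^[[:space:]]*$"))
    (cond
     ((string-match-p blank-reg word) 'blank)
     ((string-match-p special-char-reg word) 'special)
     ((string-match-p modifier+angle-reg word) 'function-key)
     (t 'plain))))

(my-classify-key-word "RET")      ; special
(my-classify-key-word "C-<f1>")   ; function-key
(my-classify-key-word "C-x")      ; plain
```

Naming the patterns keeps the cond clauses short enough to read as a table of cases.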
2.5 Always Return A Vector
Finally, I set up the new function to always return a vector; function
edmacro-parse-keys returns either a string or a vector based on how
it is called.
Since Emacs now takes a vector in every context where a key-sequence
is expected, this simplification does not break anything when our
simplified function is used to define keys.
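The interchangeability of the two representations can be checked with a small sketch. The function name my-vector-binding-demo is hypothetical; define-key, lookup-key, and make-sparse-keymap are the standard keymap functions.

```elisp
(defun my-vector-binding-demo ()
  "Bind a key given as a vector, then look it up using a string."
  (let ((map (make-sparse-keymap)))
    ;; Bind C-c f using a vector of events rather than a string.
    (define-key map [?\C-c ?f] #'forward-word)
    ;; The equivalent string names the same key-sequence.
    (lookup-key map "\C-cf")))

(my-vector-binding-demo)   ; returns the symbol forward-word
```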
The final version of the function follows.

(require 'cl-lib)                       ; cl-loop, cl-incf, cl-callf
(require 'subr-x)                       ; when-let on older Emacs releases

(defun new-kbd (string)
  "Simplified and hopefully more robust kbd function.
Always returns a vector i.e. like passing need-vector to edmacro-parse-keys."
  (let ((res [])
        ;; Match case-sensitively, as edmacro-parse-keys does.
        (case-fold-search nil)
        (special-char-reg "^\\(NUL\\|RET\\|LFD\\|ESC\\|SPC\\|DEL\\)$")
        (modifier+angle-reg "^\\(\\([ACHMsS]-\\)*\\)<\\(.+\\)>$"))
    (cl-loop
     for word in (split-string string)
     do
     (let* ((key nil))
       (cond
        ((and ;;; modifier+-<key> without DEL etc
          (not (string-match special-char-reg word))
          (string-match modifier+angle-reg word))
         (setq key
               (list
                (intern
                 (concat ;;; strip < and >
                  (substring word (match-beginning 1) (match-end 1))
                  (substring word (match-beginning 3) (match-end 3)))))))
        (t
         (let ((prefix 0)
               (bits 0))
           (while ;;; calculate modifier bits
               (string-match "^[ACHMsS]-." word)
             (cl-incf bits
                      (cdr
                       (assq (aref word 0)
                             '((?A . ?\A-\^@)
                               (?C . ?\C-\^@)
                               (?H . ?\H-\^@)
                               (?M . ?\M-\^@)
                               (?s . ?\s-\^@)
                               (?S . ?\S-\^@)))))
             (cl-incf prefix 2)
             (cl-callf substring word 2))
           (when (string-match "^\\^.$" word)
             (cl-incf bits ?\C-\^@)
             (cl-incf prefix)
             (cl-callf substring word 1))
           (when-let
               (found
                (assoc word
                       '(("NUL" . "\0")
                         ("RET" . "\r")
                         ("LFD" . "\n")
                         ("TAB" . "\t")
                         ("ESC" . "\e")
                         ("SPC" . " ")
                         ("DEL" . "\177"))))
             (setq word (cdr found)))
           (cond ;;; apply modifiers
            ((= bits 0) (setq key word))
            ((/= (length word) 1)
             (error "%s: Prefix must precede a single character, not %s"
                    string word))
            ((and
              (/= (logand bits ?\C-\^@) 0)
              (string-match "[@-_a-z]" word))
             (setq key
                   (list (+ bits (- ?\C-\^@)
                            (logand (aref word 0) 31)))))
            (t (setq key (list (+ bits (aref word 0)))))))))
       ;;; push key on to the result vector
       (when key (cl-callf vconcat res key))))
    res))
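The arithmetic in the final cond of new-kbd can be isolated in a small sketch. The helper my-apply-modifier-bits is hypothetical; it takes a character rather than a one-character string, but otherwise follows the same two cases: control plus an ASCII letter (or one of @ through _) folds to a control code, and every other combination simply adds the modifier bits to the character.

```elisp
(defun my-apply-modifier-bits (bits char)
  "Combine modifier BITS with CHAR, mirroring the final cond in new-kbd.
BITS is a sum of values such as ?\\C-\\^@ and ?\\M-\\^@."
  (if (and (/= (logand bits ?\C-\^@) 0)
           (let ((case-fold-search nil))
             (string-match-p "[@-_a-z]" (string char))))
      ;; Drop the control bit and keep the low five bits of the character.
      (+ bits (- ?\C-\^@) (logand char 31))
    ;; Otherwise the modifier bits are added to the character as-is.
    (+ bits char)))

(my-apply-modifier-bits ?\C-\^@ ?x)               ; same as ?\C-x
(my-apply-modifier-bits ?\M-\^@ ?x)               ; same as ?\M-x
(my-apply-modifier-bits (+ ?\C-\^@ ?\M-\^@) ?x)   ; same as ?\C-\M-x
(my-apply-modifier-bits ?\C-\^@ ?%)               ; same as ?\C-%
```

Each modifier occupies its own high bit, which is why the bits for several modifiers can be accumulated by plain addition in the while loop.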
You can verify the code above by running the tests found at the end of
file new-kbd.el. The tests were derived from the various
patterns described in the Elisp Reference, as well as from reading the
code in edmacro-parse-keys.
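A comparison harness of this kind might look like the sketch below. It is not the test-suite from new-kbd.el: the function my-compare-with-edmacro and the sample key strings are hypothetical, chosen only to show the shape of the check.

```elisp
(require 'cl-lib)
(require 'edmacro)

(defun my-compare-with-edmacro (candidate tests)
  "Return the strings in TESTS where CANDIDATE and edmacro-parse-keys differ.
CANDIDATE is a function of one string argument that returns a vector."
  (cl-loop
   for spec in tests
   unless (equal (funcall candidate spec)
                 (edmacro-parse-keys spec t)) ; t requests a vector
   collect spec))

;; Usage: an empty list means the two functions agree on every test.
;; Here the candidate is the reference itself, so nothing is reported.
(let ((tests '("C-x f" "M-x" "<f1>" "C-c C-<return>" "RET" "^a")))
  (my-compare-with-edmacro (lambda (s) (edmacro-parse-keys s t)) tests))
```

Passing #'new-kbd as the candidate would run the same comparison against the simplified function.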
2.6 Closing Thoughts
The above simplification exercise was done by:
- Starting with the original edmacro-parse-keys copied over to a new file and renamed to function new-kbd.
- Adding a set of tests at the end of file; essentially, this is a let that binds a set of tests, then compares the result of calling our new function on each value with that returned by the original.
- Modifying and simplifying our new function and running eval-buffer after each step.
It was a fun exercise to see order emerge from chaos at each step!