Thursday, December 31, 2009

Chomping in Emacs

While working on Ezbl, I came to the terrifying realization that Emacs doesn't have a chomp-like function to strip leading and trailing whitespace from a string. After some searching, I found a solution, but it was kind of ugly (specifying the whitespace characters exactly rather than using a character class), so I modified it a bit. Here is my result:
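(defun chomp (str)
  "Chomp leading and trailing whitespace from STR.

Why doesn't Emacs have this built in?"
  (let ((s (if (symbolp str) (symbol-name str) str)))
    ;; Keep the caller's match data intact.
    (save-match-data
      ;; Make the [:space:] class match newline.
      (with-syntax-table (copy-syntax-table)
        (modify-syntax-entry ?\n " ")
        ;; \\` and \\' anchor to the whole string (^ and $ match at every
        ;; line), and "." never matches a newline, hence the alternation.
        (string-match "\\`[[:space:]]*\\(\\(?:.\\|\n\\)*?\\)[[:space:]]*\\'" s)
        (match-string 1 s)))))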
The magic is all in the regular expression, which eats up as much whitespace as possible from the beginning and end and returns whatever is left in between (because of the non-greedy "*?" operator). By default (or in the mode I was using), the newline character is not considered part of the whitespace class, so I add it to a temporary syntax table. Any other characters which should be considered whitespace could be added in the same way. Maybe this can be included in a future version of Emacs, since it is useful and not too complex.
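To see what the non-greedy operator buys, here is a small sketch comparing the pattern above with a greedy variant. The helper name my-first-group is made up for this illustration, and the results assume a buffer whose syntax table treats the space character as whitespace, which is the usual case.

```elisp
;; Hypothetical helper (not part of Emacs): return group 1 of REGEXP
;; matched against S, or nil if there is no match.
(defun my-first-group (regexp s)
  (save-match-data
    (when (string-match regexp s)
      (match-string 1 s))))

;; Non-greedy: the middle group stops as early as it can, so the
;; trailing [[:space:]]* gets to consume the trailing spaces.
(my-first-group "^[[:space:]]*\\(.*?\\)[[:space:]]*$" "  hello world  ")
;; => "hello world"

;; Greedy: the middle group runs to the end of the line and keeps
;; the trailing spaces for itself.
(my-first-group "^[[:space:]]*\\(.*\\)[[:space:]]*$" "  hello world  ")
;; => "hello world  "
```

The leading whitespace is stripped either way, because the first [[:space:]]* is greedy and runs before the group gets a chance to match.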

Update: So about 3 minutes after feeling all smart and cool for posting this, I made a comment on the #emacs IRC channel and immediately got a response back pointing me to replace-regexp-in-string. That whole big (ish) function collapses down to

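;; \n inside [...] must be a real newline: "\\n" there would match a
;; literal backslash and the letter n. \\` and \\' anchor to the whole string.
(replace-regexp-in-string "\\(\\`[[:space:]\n]*\\|[[:space:]\n]*\\'\\)" "" str)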

So yeah, quite possibly too short to warrant its own function. Serves me right for being so high and mighty with my fancy syntax-table. Silly mortal, Emacs always knows better.
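If you would rather have a name for it anyway, here is a minimal sketch of a wrapper with a few sample calls. The name my-chomp is invented for this example, and the pattern is my own variant anchored with \\` and \\' so that only the two ends of the string are touched; it is not necessarily what #emacs had in mind.

```elisp
;; Hypothetical wrapper; the name my-chomp is an assumption made for
;; this example and is not part of Emacs.
(defun my-chomp (str)
  "Return STR with leading and trailing whitespace removed."
  (replace-regexp-in-string
   ;; \\` matches only at the start of the string, \\' only at the end.
   ;; The \n is a literal newline inside the bracket expression.
   "\\`[[:space:]\n]*\\|[[:space:]\n]*\\'" "" str))

(my-chomp "  hello world \n")   ; => "hello world"
(my-chomp "\n one\n  two \n")   ; => "one\n  two"
(my-chomp "nan")                ; => "nan"
```

Interior whitespace, including the newline and indentation in the second call, is left alone. Like any use of [[:space:]], this depends on the current buffer's syntax table treating spaces and tabs as whitespace.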