hygiene-versus-gensym


Gensym does not provide all the protection that hygiene does. This brief explanation shows why.

Take the example of a swap! macro, which should expand as follows:

 (SWAP! x y) 
 ; --expands--to--> 
 (LET ((VALUE x)) 
   (SET! x y) 
   (SET! y VALUE)) 

Suppose we wrote a very naive definition, using a Common Lisp-style defmacro:

 (defmacro naive-swap! (var1 var2) 
   `(LET ((VALUE ,var1)) 
      (SET! ,var1 ,var2) 
      (SET! ,var2 VALUE))) 

Any Common Lisp programmer will tell you that this is a very vulnerable macro: if you were to try either (NAIVE-SWAP! VALUE something) or (NAIVE-SWAP! something VALUE), the user's VALUE would be inadvertently captured by the binding of VALUE that the macro introduces, and the macro would break. That same Common Lisp programmer would probably tell you to rewrite it as:

 (defmacro cl-swap! (var1 var2) 
   (let ((value-symbol (gensym))) 
     `(LET ((,value-symbol ,var1)) 
        (SET! ,var1 ,var2) 
        (SET! ,var2 ,value-symbol)))) 

This new definition is greatly improved: value is not captured at all, and the examples that broke the old definition now work perfectly fine. Indeed, to many it would seem that cl-swap! is good enough as it is and that there are no more problems with it. Unfortunately, that is wrong. Suppose someone wrote something like this:

 (let ((set! display)) 
   (cl-swap! a b)) 

Cl-swap! would gleefully cause this to expand to the following, where g42 stands for whatever symbol gensym generated:

 (let ((set! display))
   (let ((g42 a))
     (set! a b)
     (set! b g42)))

Performing a simple beta-substitution, we get:

 (let ((g42 a))
   (display a b)
   (display b g42))

Oops. That's not quite what swap!, in any variation, should do at all.
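
The difference between the two defmacro-style definitions can be inspected by building their expansions as plain lists. The following is a minimal sketch in portable Scheme, not code from the definitions above: naive-swap-expansion and cl-swap-expansion are hypothetical procedures that mirror the templates of naive-swap! and cl-swap!, and fresh-symbol is an assumed stand-in for gensym (it returns interned symbols, so unlike a real gensym it is unique only by convention).

```scheme
;; Assumed stand-in for GENSYM: returns g1, g2, ... as interned symbols.
(define fresh-symbol
  (let ((counter 0))
    (lambda ()
      (set! counter (+ counter 1))
      (string->symbol
       (string-append "g" (number->string counter))))))

;; Mirrors the template of NAIVE-SWAP!: the temporary is literally VALUE.
(define (naive-swap-expansion var1 var2)
  `(let ((value ,var1))
     (set! ,var1 ,var2)
     (set! ,var2 value)))

;; Mirrors the template of CL-SWAP!: the temporary is a fresh symbol,
;; but LET and SET! are still inserted as ordinary symbols.
(define (cl-swap-expansion var1 var2)
  (let ((tmp (fresh-symbol)))
    `(let ((,tmp ,var1))
       (set! ,var1 ,var2)
       (set! ,var2 ,tmp))))

;; (naive-swap-expansion 'value 'other)
;; => (let ((value value)) (set! value other) (set! other value))
;;
;; (cl-swap-expansion 'a 'b)   ; first call in a fresh session
;; => (let ((g1 a)) (set! a b) (set! b g1))
```

In both expansions let and set! are bare symbols, so their meaning is decided by whatever bindings surround the use of the macro. The fresh symbol protects only the temporary.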

Now, this may not seem particularly relevant, because, really, how often do you locally shadow set!? However, the same sort of problem shows up in other situations, most notably with dynamic scoping. Indeed, imagine that you wrote a code snippet like this, in a language that looks like Scheme but is dynamically scoped:

 ;;; Module A: 
 (define (f ...) 
   ...(let ((v ...)) 
        ...(g ...(lambda ...v...)...) 
        ...) 
   ...) 
  
 ;;; Module B: 
  
 (define (g ...h...) 
   ...(let ((v <some local value>)) 
        ...(h ...)...)) 

Whoops. The procedure that f passes to g as h, the (lambda ...v...), gets g's local v instead of f's. The author of module B didn't even know about f's v, as it's a hidden part of the implementation of module A. Code breaks unexpectedly, just as it does with an unhygienic macro; this is exactly the same sort of problem. Now, you could solve it by using unnecessarily verbose identifiers, which gets to be a pain. Hygiene already provides you with the solution, though: let & set! from the original swap! example just get renamed by the macro system, so that they refer exactly to the let & set! that the built-in Scheme environment provides. No matter what else is called let or set! in the context of the use of swap!, those three uses of the two special forms (one let and two set!s) will always refer to the right binding.
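
The dynamic-scoping analogy can be made concrete in ordinary, lexically scoped Scheme by simulating a dynamic variable with a parameter object. This is only a sketch under stated assumptions: it relies on make-parameter and parameterize from R7RS, and the names f-lexical, g-lexical, f-dynamic, g-dynamic and dynamic-v are hypothetical stand-ins for the elided f, g and v above.

```scheme
;; Lexical scoping: the lambda closes over F's V, whatever G does.
(define (g-lexical h)
  (let ((v 'g-local))   ; G's private V, invisible to H
    (h)))

(define (f-lexical)
  (let ((v 'f-local))
    (g-lexical (lambda () v))))

;; Simulated dynamic scoping: a parameter object is looked up in the
;; dynamic extent of the call, so G's rebinding wins.
(define dynamic-v (make-parameter 'unbound))

(define (g-dynamic h)
  (parameterize ((dynamic-v 'g-local))
    (h)))

(define (f-dynamic)
  (parameterize ((dynamic-v 'f-local))
    (g-dynamic (lambda () (dynamic-v)))))

;; (f-lexical) => f-local
;; (f-dynamic) => g-local
```

The second result is the surprise described above: the author of g-dynamic changed what the caller's procedure sees without ever mentioning the caller.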

Here is the definition of hygienic-swap!, which hygienically renames every identifier that it introduces. It is written in the explicit renaming macro system described in [1], which we use here because it makes explicit where hygiene 'happens':

 (define-syntax hygienic-swap! 
   (transformer ; Drop this for Scheme48. 
     (lambda (form rename compare) 
       (destructure ; Open the DESTRUCTURING structure first, of course. 
           (((swap! var1 var2) form)) 
         `(,(rename 'LET) ; Be sure to rename LET so we get SCHEME's LET. 
              ((,(rename 'VALUE) ; Rename VALUE so we don't inadvertently 
                ,var1))          ; and unhygienically capture it. 
            (,(rename 'SET!) ; Rename SET!, too, to get SCHEME's SET!. 
             ,var1 
             ,var2) 
            (,(rename 'SET!) ,var2 
              ,(rename 'VALUE) ; It's safe to apply RENAME to the symbol 
              ))))))           ; VALUE more than once, because RENAME is 
                               ; always referentially transparent. 

Now, let us observe a sample output. Here name~colour denotes the identifier name tagged with a unique colour:

 (hygienic-swap! value something) 
 ; --expands--to--> 
 (let ((value~0 value~1)) 
   (set! value~1 something~0) 
   (set! something~0 value~0)) 

It looks like we fixed the problem naive-swap! had with inadvertently capturing value; let's try it with the problem cl-swap! has. In this output, rather than using colour, the environment that a variable reference (rather than an ordinary symbol) refers to is made explicit; environment##name here refers to the binding of name in environment:

 (let ((set! display)) 
   (hygienic-swap! a b)) 
 ; --expands--to--> 
 (scheme##let ((set! scheme##display)) 
   (scheme##let ((value a)) 
     (scheme##set! a b) 
     (scheme##set! b value))) 

Now the user of hygienic-swap! locally binds set!, not scheme##set!; thus our macro hygienic-swap! doesn't inadvertently use the set! that the user of the macro introduced.
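
Both failure cases can be tried in a portable way. The sketch below assumes an R7RS-style implementation and uses syntax-rules rather than the explicit renaming system shown above, because transformer and destructure are not available everywhere. The names sr-swap! and swap-under-shadowing are hypothetical.

```scheme
;; A SYNTAX-RULES version of the swap macro; VALUE, LET and SET! in the
;; template are renamed automatically by the macro system.
(define-syntax sr-swap!
  (syntax-rules ()
    ((_ var1 var2)
     (let ((value var1))
       (set! var1 var2)
       (set! var2 value)))))

(define (swap-under-shadowing)
  (let ((a 1) (b 2) (value 10) (other 20))
    (let ((set! list))          ; the user shadows SET! locally
      (sr-swap! a b)            ; would break CL-SWAP!
      (sr-swap! value other))   ; would break NAIVE-SWAP!
    (list a b value other)))

;; (swap-under-shadowing) => (2 1 20 10)
```

Both pairs of variables are swapped even though the user's code binds set! to an ordinary procedure and has a variable of its own called value.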

There is also another problem with inserting symbols directly; this example implementation of force & delay demonstrates it quite well:

 (defmacro cl-delay (expression) 
   `(MAKE-PROMISE (LAMBDA () ,expression))) 
  
 (define (force promise) (promise)) 
  
 (define (make-promise thunk) ...) 

The problem is this: make-promise is not exported from any module; the only interface to lazy evaluation is force & delay. So when you use cl-delay, it expands to a use of make-promise, but make-promise is unbound. Again, hygiene saves you here:

 (define-syntax hygienic-delay 
   (transformer 
     (lambda (form rename compare) 
       `(,(rename 'MAKE-PROMISE) ; Rename MAKE-PROMISE so we get a binding 
                                 ; to the MAKE-PROMISE that our module for 
                                 ; lazy evaluation does not export. 
         (,(rename 'LAMBDA) ; Rename LAMBDA, too, just in case. (You never 
                            ; know what those whackos might do to LAMBDA!) 
             () 
           ,(cadr form)))))) 
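
The same effect can be observed without a module system, as a rough analogy: instead of leaving the helper unexported, the user shadows it at the point of use. This sketch assumes an R7RS-style implementation and uses syntax-rules; my-make-promise, my-force, my-delay and delay-under-shadowing are hypothetical names chosen to avoid clashing with any built-in promise procedures.

```scheme
;; Plays the role of the unexported MAKE-PROMISE: a memoizing thunk wrapper.
(define (my-make-promise thunk)
  (let ((done? #f) (result #f))
    (lambda ()
      (if (not done?)
          (begin
            (set! result (thunk))
            (set! done? #t)))
      result)))

(define (my-force promise) (promise))

;; MY-MAKE-PROMISE and LAMBDA in the template refer to the bindings
;; visible where the macro is defined, not where it is used.
(define-syntax my-delay
  (syntax-rules ()
    ((_ expression)
     (my-make-promise (lambda () expression)))))

(define (delay-under-shadowing)
  (let ((my-make-promise (lambda (thunk) 'hijacked)))
    (my-force (my-delay (+ 1 2)))))

;; (delay-under-shadowing) => 3
```

The local my-make-promise is never called, because the expansion of my-delay still refers to the definition that was in scope when the macro was written.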

Conclusion: gensym is not a suitable replacement for real hygiene, at least in a Lisp-1 (a Lisp-2 combined with a package system makes gensym a suitable replacement). Common Lisp's package system solves the problem of cl-delay, though the merit of that solution is disputable, and I don't want to get into debates about module systems between Common Lisp and Scheme (it's bad enough when only Scheme is involved!).

-> While that is true to some extent, and the problem of hygiene in Common Lisp is not nonexistent, it is at worst a minor irritant in the experience of most Common Lisp programmers. Some have argued that this is due to the fact that Common Lisp is a Lisp-2, and thus doesn't have as many problems with function and variable name clashes. This line of reasoning has led many to conclude that hygiene is just not as important in a Lisp-2 as in a Lisp-1 (let alone the hypothetical dynamically-scoped Lisp-1 above).

Comments

[I don't believe this is correct. This is where CL's dual namespace comes into play. The CL equivalent of set! is setf which is a macro that expands to the function setq. Since setq is a function, which is in the function namespace, all you will get is an undefined function error. We're not talking about CL, though. We're talking about Scheme. See the comments at bottom of the page about Lisp-1 vs Lisp-2. Also, in the future, to preserve the flow of the article for casual readers, please place comments at the bottom of the page if you wish to place them at all. This will be deleted or relocated shortly.]

In my opinion this wiki page is highly misleading. A "Common Lisp style defmacro" includes the second namespace and (according to CLtL2 and ANSI CL) the fact that you are not allowed to re-bind functions, macros, or special forms specified by CL. These two properties make all the examples presented on this wiki page irrelevant for "Common Lisp style defmacros". In my opinion it should be made clear at the top of the wiki page that the entire article is about working around the problems posed to macro writers by lisp-1s like Scheme. Properly written Common Lisp code uses gensyms to provide all the protection provided by "hygienic" macro systems. I do, however, concede to "hygienic" macro fans that it is a lot harder to shoot yourself in the foot when your hands are tied behind your back. -Doug Hoyte, http://hcsw.org

Doug, exactly! In CL your hands are tied behind your back because you are not allowed to re-bind functions, macros or special forms specified by CL. In Scheme you are free to rebind anything you want, and thanks to hygiene can compose any such macros without fear of things blowing up on you. - ashinn

Ashinn - Actually no. CL does allow you to rebind anything within your own package and the package system/macro handling will make sure that cross package macro expansions work without any name collisions. syntax-case and friends are a wonderful system, there is no need to prop them up by using a Lisp-2 strawman. -- Chris Dean

The fact that CL has to go out of the way to forbid rebinding the standard functions/macros/special forms is actually a big kludge that at first seems to support the case for hygienic macros. The real kludge here lies a little deeper.

There is nothing a priori wrong with rebinding "lambda" (see my blog post about one plausible use case: http://carcaddar.blogspot.com/2009/04/closure-oriented-metaprogramming-via.html). If you want to maintain referential transparency for the *expanded code*, rebinding "lambda" in the expansion of your macros is exactly what you want. Hygiene is a synonym for maintaining referential transparency for the *definition of the expanded code*.

The kludge is that Common Lisp disallows you from doing that rebinding for the standard functions/macros/special forms, which forces the language to be earlier-bound than it needs to be (the early-binding sanity and efficiency could be gained using the liberties granted to compiler macros and inline declaration optimizations by the standard). Lisp-1 vs Lisp-2 is completely irrelevant to this issue.

What does this imply for hygienic macros? Hygiene effectively forces early binding on all code produced by macros. This eliminates a powerful class of metaprogramming techniques. Doug Hoyte has written an entire book (Let Over Lambda - http://letoverlambda.com/) on the subject, which he calls "duality of syntax."

--Vladimir Sedach

After having thought about the above for a while, I realize that I'm really making a case for dynamically-scoped functions, and not against hygienic macros. It seems that there are good use cases for dynamic capture, but I haven't been able to think of a good use case for lexical capture. Even in Hoyte's book, most of the intentional use of lexical capture centers around macros having free variables in their expansions (with the expectation that those macros would be expanded inside other macros that would provide bindings for those free variables).

--Vladimir Sedach

As someone who (happily!) made the switch from CL to Scheme six months ago, I hope I can make the following comments without raising suspicions that I am a "Common Lisp apologist." I really liked the first half of this article. Macros were my original motivation for learning CL a few years ago, so I was happy to find a cogent explanation of the rationale behind Scheme's macro facilities.

That said, I have never felt more dubious about my decision to switch to Scheme than I did when I got my first look at the "improved" hygienic-swap! macro. It's three times as long as its CL equivalent, at least three times as hard to read, and justified as the solution to a problem caused by incomprehensibly stupid user behavior. Don't get me wrong -- I'm perfectly willing to believe that define-macro has all kinds of problems associated with it. But the complete lack of acknowledgment that the hygienic macro LOOKS LIKE an enormous lose, and the absence of any suggestion that it could be made shorter or easier to read makes me remember all the people who have suggested to me that Scheme is a language for hair-splitting pedants who have no regard for getting anything done in the real world. I know that's not true, but I don't know enough about defining syntax to fix this wiki entry, so I'm whining about it here instead. --Incredulous.

The hygienic-swap! macro could be rewritten in a form that is only somewhat longer than the define-macro version:

  (define-syntax hygienic-swap! 
    (syntax-rules () 
     ((_ a b) 
      (let ((temp a)) 
         (set! a b) 
         (set! b temp))))) 

The "syntax-rules" macro is the most portable syntax-expansion form (and it's also less powerful than CL macros, as it leaves some things impossible). There are some implementations (like MIT Scheme) that don't even support other transformers, and there are implementations like PLT Scheme where unhygienic macros are there, but they don't work correctly.

Citations

[1] William D. Clinger. Hygienic macros through explicit renaming. Lisp Pointers IV(4), 25-28, December 1991. http://citeseer.ist.psu.edu/cache/papers/cs/1860/ftp:zSzzSzftp.cs.indiana.eduzSzpubzSzscheme-repositoryzSzdoczSzpropzSzexrename.pdf/clinger91hygienic.pdf

