Macros offer a way to extend the Scheme language by introducing new forms of syntax that are transformed into the simpler base language. Since Scheme's syntax is defined in terms of Scheme objects -- lists & symbols, mainly --, one would imagine that procedures that perform such transformations, known as macro transformers, should be easy: they just operate on lists & symbols, especially in the presence of quasiquote. However, this is not quite the case: macros need to also be aware of the underlying lexical scoping of the programs they transform. This article will not go into the need for that; see my article hygiene-versus-gensym for discussion of that matter. Macros need to work with more than S-expressions: they need S-expressions annotated with lexical scoping information.
The code examples in this article will run in MIT Scheme, in the version 7.7 series. Note, though, that non-hygienic-macro-transformer is now deprecated and may cease to exist in future releases, but it is useful for the sake of demonstration here. Also note that the procedure passed to it is given as arguments all of the subforms, while the procedure passed to sc-macro-transformer, which constructs a macro transformer that uses syntactic closures, will be given the entire form, including the keyword itself at the head, as the first argument.
Scheme closures are a well-understood concept. Lambda expressions, when evaluated, close over or save the current lexical environment and store it in the procedures they evaluate to. This concept can in fact be easily extended to the syntactic level: expressions can be closed over a syntactic environment. Syntactic environments map names to their denotations. A name's denotation can be a special form (for core syntax, like lambda or if), a macro binding, or a variable binding.
For instance, there is a syntactic environment for standard Scheme. It maps names like lambda, if, &c., to special forms; cond, delay, &c., to macro transformers; and +, cons, &c., to simple variable denotations. These are all top-level bindings; lambda & let-syntax create local bindings. For instance, in the expression (lambda (x) x) as considered with the Scheme syntactic environment, where the name lambda is mapped to the standard special form, there is a variable denotation for the name x local to the expression. Within the lambda, the name x will be mapped by the extended syntactic environment that lambda creates to the new variable binding.
Note that each denotation is exact. In the code snippet (lambda (x) (f x (lambda (x) (g x y)))), there are two bindings each named x. The outer lambda binds one variable; the inner, another. Depending on context, the name x may mean either of the two variable bindings: two different lexical environments will map the name x to different denotations. Syntactic environments work similarly: in syntactic environment established inside the outer lambda, x is mapped to a variable denotation for the outer lambda's binding. In the inner lambda, the syntactic environment maps the name x to a variable denotation as bound by the inner lambda. Note, of course, that syntactic environments map names to abstract denotations, not values. This is because they are syntactic attributes; values are at run-time.
Alan Bawden & Jonathan Rees in 1988 proposed a macro system for Scheme called syntactic closures. Quite simply, syntactic closures are expressions annotated with syntactic environments. The meanings of names within the expression of a syntactic closure are determined by its syntactic environment. Consider a macro (swap! a b), for example, that swaps the values of the variables a & b. It should expand to something like
(let ((value <a>)) (set! <a> <b>) (set! <b> value))
Clearly, a simplistic macro transformer such as
(define-syntax swap! (non-hygienic-macro-transformer (lambda form (let ((a (car form)) (b (cadr form))) `(LET ((VALUE ,a)) (SET! ,a ,b) (SET! ,b VALUE))))))
is fragile & susceptible to name conflicts usual to non-hygienic macro transformers. The solution is simple: we want the expressions from the input form to be considered in their own environment and let, value, & set!, in the environment of the transformer. (The input expression may be in a context in which let & set! are bound to something other than the denotations the swap! transformer expects.) So we write a transformer procedure that carefully closes all the input subforms in their own environments, but everything else in the environment of the transformer. Since the output of a macro is automatically considered in the environment in which it was defined, we need explicitly close only the input forms, which we do with the make-syntactic-closure procedure. Ignore its second parameter for now; we shall discuss it later.
(define-syntax swap! (sc-macro-transformer (lambda (form environment) (let ((a (make-syntactic-closure environment '() (cadr form))) (b (make-syntactic-closure environment '() (caddr form)))) `(LET ((VALUE ,a)) (SET! ,a ,b) (SET! ,b VALUE))))))
Here, in the expression b, value is not bound to the same denotation as was created with the let that swap! expanded to, because it is closed in a completely different syntactic environment, so the expression b in the first assignment cannot inadvertently refer to the value we bound with let. Likewise, even if the name of a is value, its denotation is entirely different from what we we bound with let above, so the first set! can never inadvertently assign the value we bound with let.
There are also times, however, when we want to avoid hygiene -- that is, when the input expressions should be able to refer to bindings created implicitly by a macro's expansion. For instance, consider a (loop body) macro that repeats the body endlessly, but binds a variable named exit to a procedure that aborts the loop and returns values from the loop expression. With an unhygienic transformer, like above, it might be written thus:
(define-syntax loop (non-hygienic-macro-transformer (lambda (form) `(CALL-WITH-CURRENT-CONTINUATION (LAMBDA (EXIT) ; Note this binding of EXIT. (LET LOOP () ,form (LOOP)))))))
There are two problems with this. The first is that the context of the invocation of loop may bind call-with-current-continuation, lambda, or let, to something unexpected by the macro transformer, so the output may have strange semantics. The other is that the body is evaluated in an environment in which loop is bound to a procedure that repeats the loop, not the loop macro, so nested loops are disallowed! These are both very basic abstraction problems.
We should like to write a hygienic transformer using syntactic closures that solves both of these problems. Here is how one might initially attempt to tackle the problem:
(define-syntax loop (sc-macro-transformer (lambda (form environment) `(CALL-WITH-CURRENT-CONTINUATION (LAMBDA (EXIT) (LET LOOP () ,(make-syntactic-closure environment '() (cadr form)) (LOOP)))))))
This, however, does not work as expected: exit is not bound in the body to the denotation we wanted, namely the binding created by the lambda that was an argument to call-with-current-continuation. (Indeed, it may be bound to a procedure that exits the whole Scheme system!) What we want to do here is close the body in the environment of its usage, but also allow one identifier, exit, to be interpreted in the enclosing environment. That is, we want to let one free name into the closure, and this is what make-syntactic-closure's second parameter is for: it is a list of the free names in the syntactic closure.
Procedural closures not only save their lexical environment, but they also accept parameters; they are not completely closed: they accept inputs from their sites of usage. Syntactic closures work similarly. They are closed over a syntactic environment, but they may also accept bindings from the context in which they are used. The only difference, really, is the exact mechanism by which the two different kinds of closures accept parameters: procedural closures accept them by being called with a sequence of arguments, whereas syntactic closures accept them from the surrounding context.
The loop macro transformer can be fixed by simply adding exit to the list of names that should be free in the body. The binding for exit will then be sought in the surrounding environment, where exit was bound by the lambda that was the second argument in the call to call-with-current-continuation. The fixed macro transformer is:
(define-syntax loop (sc-macro-transformer (lambda (form environment) `(CALL-WITH-CURRENT-CONTINUATION (LAMBDA (EXIT) (LET LOOP () ,(make-syntactic-closure environment '(EXIT) (cadr form)) (LOOP)))))))
Note that the only change was to allow exit to be free in the body so that the syntactic environment of the body would map it to the denotation as created by our lambda. Loop is still hidden from the body, for example, as desired, so this macro nests properly as well.
Another, more sophisticated example of syntactic closures' free names is an 'anaphoric if' macro. This anaphoric if, or aif, is just like if, except that, if the condition evaluates to a true value, its value is saved and the variable it is bound in the consequent to its value. The name it is unbound in the alternative clause, however. Here is an example of aif:
(aif (assv 'a '((1 . 2) (a . b))) (cdr it) 'nothing) ; => B
The macro is easy to write. It expands to an expression of the form (let ((it condition)) (if it consequent alternate)), all closed in the transformer environment, as usual, except for the input forms. The condition needs to close over the simple usage environment, and likewise with the alternate. The consequent needs to close over the usage environment as well, and it also needs it as a free name, but note that it should not be bound in the alternative clause. Thus:
(define-syntax aif (sc-macro-transformer (lambda (form environment) (let ((condition (make-syntactic-closure environment '() (cadr form))) (consequent (make-syntactic-closure environment '(IT) (caddr form))) (alternative (make-syntactic-closure environment '() (cadddr form)))) `(LET ((IT ,condition)) (IF IT ,consequent ,alternative))))))
Notice how easy it is to control the exact scoping of the identifier it. Only in the consequent clause is it free; it is not allowed to escape into the alternative clause. With a naive non-hygienic expansion, it would be inadvertently captured in the alternative, but we wish to avoid that, so a non-hygienic expansion would need go to the trouble of wrapping the alternative clause in a thunk in order to preserve its scope. Syntactic closures are a very simple way to have exact control over syntactic preservation of lexical scoping in Scheme macros.