Chapter 8. Syntactic Extension

Syntactic extensions, or macros, are used to simplify and regularize repeated patterns in a program, to introduce syntactic forms with new evaluation rules, and to perform transformations that help make programs more efficient.

A syntactic extension typically takes the form (keyword subform ...), where keyword is the identifier that names the syntactic extension. The syntax of each subform varies from one syntactic extension to another. Syntactic extensions can also take the form of improper lists (or even singleton identifiers; see Section 8.3), although this is less common.

New syntactic extensions are defined by associating keywords with transformation procedures, or transformers. Syntactic extensions are defined globally using top-level define-syntax forms or within the scope of particular expressions using let-syntax, letrec-syntax, and internal define-syntax. Transformers are created using syntax-rules or syntax-case, which allow transformations to be specified via pattern matching.

Syntactic extensions are expanded into core forms at the start of evaluation (before compilation or interpretation) by a syntax expander. The expander is invoked once for each top-level form in a program. If the expander encounters a syntactic extension, it invokes the associated transformer to expand the syntactic extension, then repeats the expansion process for the form returned by the transformer. If the expander encounters a core syntactic form, it recursively processes the subforms, if any, and reconstructs the form from the expanded subforms. Information about identifier bindings is maintained during expansion to enforce lexical scoping for variables and keywords.

The syntactic extension mechanisms described in this chapter are part of the "syntax-case" system that has become a de facto standard in the absence of a standard full-featured syntactic extension system. A portable implementation of the syntax-case system is available at http://www.scheme.com/syntax-case/. The syntax-case system also supports modules and several other features that are described in the Chez Scheme User's Guide [5]. A description of the motivations behind and implementation of the system can be found in the articles "Syntactic Abstraction in Scheme" [7] and "Extending the Scope of Syntactic Abstraction" [22].

The Revised⁵ Report includes only let-syntax, letrec-syntax, top-level define-syntax, and syntax-rules. The pattern language supported by the Revised⁵ Report version of syntax-rules is also more limited, with pattern ellipses allowed only at the end of list- or vector-structured patterns. (See page 187.) Furthermore, the bodies of let-syntax and letrec-syntax are treated like lambda bodies, i.e., they open up new scopes, which prevents them from being used in contexts where definitions are required. (See page 185.) Programmers desiring to write programs that are guaranteed to run in all Revised⁵ Report implementations should stick with the Revised⁵ Report subset and use let-syntax and letrec-syntax in a manner consistent with either interpretation. The ANSI/IEEE standard does not include any syntactic extension mechanism.

Section 8.1. Keyword Bindings

This section describes forms that establish bindings between keywords and transformers. Keyword bindings may be established at top level, using define-syntax, or locally, using let-syntax, letrec-syntax, or internal define-syntax. Existing keyword bindings may be rebound temporarily with fluid-let-syntax.

syntax: (define-syntax keyword exp)
returns: unspecified

exp must evaluate to a transformer.

The following example defines let* as a syntactic extension, specifying the transformer with syntax-rules (see Section 8.2).

(define-syntax let*
  (syntax-rules ()
    ((_ () e1 e2 ...) (let () e1 e2 ...))
    ((_ ((i1 v1) (i2 v2) ...) e1 e2 ...)
     (let ((i1 v1))
       (let* ((i2 v2) ...) e1 e2 ...)))))
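Because the second clause expands into another use of the keyword, the expander invokes the transformer repeatedly until the empty-binding clause applies. The sketch below restates the same rules under the hypothetical name my-let* (chosen here only to avoid shadowing the built-in let*) and traces the successive outputs in comments. The trace is illustrative and leaves out identifier renaming and the expansion of let itself.

```scheme
;; my-let* is a hypothetical name; the rules are those of let* above.
(define-syntax my-let*
  (syntax-rules ()
    ((_ () e1 e2 ...) (let () e1 e2 ...))
    ((_ ((i1 v1) (i2 v2) ...) e1 e2 ...)
     (let ((i1 v1))
       (my-let* ((i2 v2) ...) e1 e2 ...)))))

;; Successive transformer outputs for the use below:
;;   (let ((x 1))
;;     (my-let* ((y (+ x 1)) (z (* y 2))) (list x y z)))
;;   (let ((x 1))
;;     (let ((y (+ x 1)))
;;       (my-let* ((z (* y 2))) (list x y z))))
;;   (let ((x 1))
;;     (let ((y (+ x 1)))
;;       (let ((z (* y 2)))
;;         (my-let* () (list x y z)))))
;;   (let ((x 1))
;;     (let ((y (+ x 1)))
;;       (let ((z (* y 2)))
;;         (let () (list x y z)))))
(define chained
  (my-let* ((x 1) (y (+ x 1)) (z (* y 2)))
    (list x y z)))
;; chained is (1 2 4)
```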

define-syntax forms appearing at top level behave similarly to top-level variable definitions, and define-syntax forms appearing at the front of a lambda or other body behave similarly to internal variable definitions. That is, a binding established by a top-level define-syntax form is visible globally, whereas one established by an internal define-syntax form is visible only within the body in which the define-syntax form appears.
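As a small illustration of the scoping just described, the following sketch places a keyword definition at the front of a lambda body. The names sum-of-squares and square are hypothetical and serve only this example. The keyword square is usable within the body but has no binding outside of it.

```scheme
(define sum-of-squares
  (lambda (a b)
    ;; square is a keyword only within this lambda body
    (define-syntax square
      (syntax-rules ()
        ((_ e) (let ((v e)) (* v v)))))
    (+ (square a) (square b))))

(sum-of-squares 3 4) ; => 25
```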

All bindings established by a set of internal definitions, whether keyword or variable definitions, are visible within the definitions themselves. For example, the expression

(let ()
  (define even?
    (lambda (x)
      (or (= x 0) (odd? (- x 1)))))
  (define-syntax odd?
    (syntax-rules ()
      ((_ x) (not (even? x)))))
  (even? 10))

is valid and should return #t.

The expander processes the initial forms in a lambda or other body from left to right. If it encounters a variable definition, it records the fact that the defined identifier is a variable but defers expansion of the right-hand-side expression until after all of the definitions have been processed. If it encounters a keyword definition, it expands and evaluates the right-hand-side expression and binds the keyword to the resulting transformer. If it encounters an expression, it fully expands all deferred right-hand-side expressions along with the current and remaining body expressions.

An implication of the left-to-right processing order is that one internal definition can affect whether a subsequent form is also a definition. For example, the expression

(let ()
  (define-syntax bind-to-zero
    (syntax-rules ()
      ((_ id) (define id 0))))
  (bind-to-zero x)
  x)

evaluates to 0, regardless of any binding for bind-to-zero that might appear outside of the let expression.

A top-level syntactic definition must be established before its first use in order for that use to be recognized.

syntax: (let-syntax ((keyword exp) ...) form₁ form₂ ...)
syntax: (letrec-syntax ((keyword exp) ...) form₁ form₂ ...)
returns: see explanation

Each exp must evaluate to a transformer. For both let-syntax and letrec-syntax, each keyword is bound within the forms form₁ form₂ .... For letrec-syntax the binding scope also includes each exp.

A let-syntax or letrec-syntax form may expand into one or more expressions anywhere expressions are permitted, in which case the resulting expressions are treated as if enclosed in a begin expression. This allows a let-syntax or letrec-syntax form to expand into a definition or sequence of definitions anywhere definitions are permitted, in which case the definitions are treated as if they appeared in place of the let-syntax or letrec-syntax form. (This differs from the Revised⁵ Report treatment of these forms; see page 183.)

The following example highlights how let-syntax and letrec-syntax differ.

(let ((f (lambda (x) (+ x 1))))
  (let-syntax ((f (syntax-rules ()
                    ((_ x) x)))
               (g (syntax-rules ()
                    ((_ x) (f x)))))
    (list (f 1) (g 1)))) ⇒ (1 2)

(let ((f (lambda (x) (+ x 1))))
  (letrec-syntax ((f (syntax-rules ()
                       ((_ x) x)))
                  (g (syntax-rules ()
                       ((_ x) (f x)))))
    (list (f 1) (g 1)))) ⇒ (1 1)

The two expressions are identical except that the let-syntax form in the first expression is a letrec-syntax form in the second. In the first expression, the f occurring in g refers to the let-bound variable f, whereas in the second it refers to the keyword f whose binding is established by the letrec-syntax form.

syntax: (fluid-let-syntax ((keyword exp) ...) form₁ form₂ ...)
returns: see explanation

Each exp must evaluate to a transformer. fluid-let-syntax is similar to let-syntax, except that instead of introducing new bindings for the keywords keyword ..., fluid-let-syntax temporarily alters the existing bindings for the keywords during the expansion of its body. That is, during the expansion of form₁ form₂ ..., the visible lexical (or top-level) binding for each keyword is temporarily replaced by a new association between the keyword and the corresponding transformer. This affects any references to the keyword that resolve to the same lexical (or top-level) binding whether the references occur in the text of the body or are introduced during its expansion. In contrast, let-syntax captures only those references that occur within the text of its body.

The following example shows how fluid-let-syntax differs from let-syntax.

(let ((f (lambda (x) (+ x 1))))
  (let-syntax ((g (syntax-rules ()
                    ((_ x) (f x)))))
    (let-syntax ((f (syntax-rules ()
                      ((_ x) x))))
      (g 1)))) ⇒ 2

(let ((f (lambda (x) (+ x 1))))
  (let-syntax ((g (syntax-rules ()
                    ((_ x) (f x)))))
    (fluid-let-syntax ((f (syntax-rules ()
                            ((_ x) x))))
      (g 1)))) ⇒ 1

The two expressions are identical except that the inner let-syntax form in the first expression is a fluid-let-syntax form in the second. In the first expression, the f occurring in the expansion of (g 1) refers to the let-bound variable f, whereas in the second it refers to the keyword f by virtue of the fluid syntax binding for f.

Section 8.2. Syntax-Rules Transformers

The syntax-rules form described in this section permits simple transformers to be specified in a convenient manner. These transformers may be bound to keywords using the mechanisms described in Section 8.1. While it is much less expressive than the mechanism described in Section 8.3, it is sufficient for defining many common syntactic extensions.

syntax: (syntax-rules (literal ...) clause ...)
returns: a transformer

Each literal must be an identifier. Each clause takes the form:

(pattern template)

Each pattern specifies one possible syntax that the input form might take, and the corresponding template specifies how the output should appear in each case.

Patterns consist of list structure, vector structure, identifiers, and constants. Each identifier within a pattern is either a literal, a pattern variable, or an ellipsis. The identifier ... is an ellipsis. Any identifier other than ... is a literal if it appears in the list of literals (literal ...); otherwise, it is a pattern variable. Literals serve as auxiliary keywords, such as else in case and cond expressions. List and vector structure within a pattern specifies the basic structure required of the input, pattern variables specify arbitrary substructure, and literals and constants specify atomic pieces that must match exactly. Ellipses specify repeated occurrences of the subpatterns they follow.

An input form F matches a pattern P if and only if

P is a pattern variable,
P is a literal identifier and F is an identifier with the same binding (see free-identifier=? in Section 8.3),
P is of the form (P₁ ... P_n) and F is a list of n elements that match P₁ through P_n,
P is of the form (P₁ ... P_n . P_x) and F is a list or improper list of n or more elements whose first n elements match P₁ through P_n and whose nth cdr matches P_x,
P is of the form (P₁ ... P_k P_e ellipsis P_m+1 ... P_n), where ellipsis is the identifier ... and F is a proper list of n elements whose first k elements match P₁ through P_k, whose next m - k elements each match P_e, and whose remaining n - m elements match P_m+1 through P_n,
P is of the form (P₁ ... P_k P_e ellipsis P_m+1 ... P_n . P_x), where ellipsis is the identifier ... and F is a list or improper list of n elements whose first k elements match P₁ through P_k, whose next m - k elements each match P_e, whose next n - m elements match P_m+1 through P_n, and whose nth and final cdr matches P_x,
P is of the form #(P₁ ... P_n) and F is a vector of n elements that match P₁ through P_n,
P is of the form #(P₁ ... P_k P_e ellipsis P_m+1 ... P_n), where ellipsis is the identifier ... and F is a vector of n or more elements whose first k elements match P₁ through P_k, whose next m - k elements each match P_e, and whose remaining n - m elements match P_m+1 through P_n, or
P is a pattern datum (any nonlist, nonvector, nonsymbol object) and F is equal to P in the sense of the equal? procedure.
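A few of the pattern forms listed above can be exercised directly. The sketch below uses three hypothetical keywords, rev-call, head-and-tail, and zero-or-other, which are not part of the text. The first relies on an ellipsis followed by a trailing subpattern, so it assumes the extended pattern language described in this chapter rather than the Revised⁵ Report subset.

```scheme
;; Ellipsis followed by a trailing subpattern:
;; (P_1 ... P_k P_e ellipsis P_m+1 ... P_n)
(define-syntax rev-call
  (syntax-rules ()
    ((_ arg ... proc) (proc arg ...))))

;; Dotted pattern (P_1 ... P_n . P_x): rest matches the remaining cdr
(define-syntax head-and-tail
  (syntax-rules ()
    ((_ first . rest) (list 'first 'rest))))

;; Pattern datum: matched against the input form with equal?,
;; before any evaluation takes place
(define-syntax zero-or-other
  (syntax-rules ()
    ((_ 0) 'zero)
    ((_ e) 'other)))

(rev-call 1 2 3 +)       ; => 6
(head-and-tail a b c)    ; => (a (b c))
(zero-or-other 0)        ; => zero
(zero-or-other (- 1 1))  ; => other
```

The last line yields other because matching is performed on the unevaluated form (- 1 1), which is a list and therefore not equal to the datum 0.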

The outermost structure of a syntax-rules pattern must actually be in one of the list-structured forms above, although subpatterns of the pattern may be in any of the above forms. Furthermore, the first element of the outermost pattern is ignored, since it is always assumed to be the keyword naming the syntactic form. (These statements do not apply to syntax-case; see Section 8.3.)

If an input form passed to a syntax-rules transformer matches the pattern for a given clause, the clause is accepted and the form is transformed as specified by the associated template. As this transformation takes place, pattern variables appearing in the pattern are bound to the corresponding input subforms. Pattern variables appearing within a subpattern followed by one or more ellipses may be bound to a set or sets of zero or more input subforms.

A template is a pattern variable, an identifier that is not a pattern variable, a pattern datum, a list of subtemplates (S₁ ... S_n), an improper list of subtemplates (S₁ S₂ ... S_n . T), or a vector of subtemplates #(S₁ ... S_n). Each subtemplate S_i is either a template or a template followed by one or more ellipses. The final element T of an improper subtemplate list is a template.

Pattern variables appearing within a template are replaced in the output by the input subforms to which they are bound. Pattern data and identifiers that are not pattern variables are inserted directly into the output. List and vector structure within the template remains list and vector structure in the output. A subtemplate followed by an ellipsis expands into zero or more occurrences of the subtemplate. The subtemplate must contain at least one pattern variable from a subpattern followed by an ellipsis. (Otherwise, the expander could not determine how many times the subform should be repeated in the output.) Pattern variables that occur in subpatterns followed by one or more ellipses may occur only in subtemplates that are followed by (at least) as many ellipses. These pattern variables are replaced in the output by the input subforms to which they are bound, distributed as specified. If a pattern variable is followed by more ellipses in the template than in the associated pattern, the input form is replicated as necessary.
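The ellipsis-depth rule can be seen in a small sketch. The keyword rows below is hypothetical. In its pattern, k sits under one ellipsis and v under two, so in the template k must be followed by at least one ellipsis and v by at least two.

```scheme
(define-syntax rows
  (syntax-rules ()
    ((_ (k v ...) ...)
     ;; v is inside (list v ...) and inside the outer "...",
     ;; so it is followed by two ellipses; k by one.
     (list (cons 'k (list v ...)) ...))))

(rows (a 1 2) (b) (c 3)) ; => ((a 1 2) (b) (c 3))
```

The clause for b shows that an ellipsis may match zero subforms, in which case the corresponding subtemplate (list v ...) expands to (list).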

A template of the form (... template) is identical to template, except that ellipses within the template have no special meaning. That is, any ellipses contained within template are treated as ordinary identifiers. In particular, the template (... ...) produces a single ellipsis, .... This allows syntactic extensions to expand into forms containing ellipses.
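The escape is needed most often when one syntactic extension expands into the definition of another. In the sketch below, the hypothetical keyword define-list-maker defines a second keyword whose own pattern and template contain an ellipsis. Writing (... ...) in the outer template causes a plain ... to appear in the generated syntax-rules form.

```scheme
(define-syntax define-list-maker
  (syntax-rules ()
    ((_ name)
     (define-syntax name
       (syntax-rules ()
         ;; each (... ...) becomes a single ... in the output
         ((_ x (... ...)) (list x (... ...))))))))

(define-list-maker mk)

(mk 1 2 3) ; => (1 2 3)
(mk)       ; => ()
```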

The definition of or below demonstrates the use of syntax-rules.

(define-syntax or
  (syntax-rules ()
    ((_) #f)
    ((_ e) e)
    ((_ e1 e2 e3 ...)
     (let ((t e1)) (if t t (or e2 e3 ...))))))

The input patterns specify that the input must consist of the keyword and zero or more subexpressions. An underscore ( _ ), which is an ordinary pattern variable, is used by convention for the keyword position to remind the programmer and anyone reading the definition that the keyword position never fails to contain the expected keyword and need not be matched. (In fact, as mentioned above, syntax-rules ignores what appears in the keyword position.) If more than one subexpression is present (third clause), the expanded code both tests the value of the first subexpression and returns the value if it is not false. To avoid evaluating the expression twice, the transformer introduces a binding for the temporary variable t.

The expansion algorithm maintains lexical scoping automatically by renaming local identifiers as necessary. Thus, the binding for t introduced by the transformer is visible only within code introduced by the transformer and not within subforms of the input. Similarly, the references to the identifiers let and if are unaffected by any bindings present in the context of the input.

(let ((if #f))
  (let ((t 'okay))
    (or if t))) ⇒ okay

This expression is transformed during expansion to the equivalent of the expression below.

((lambda (if1)
   ((lambda (t1)
      ((lambda (t2)
         (if t2 t2 t1))
       if1))
    'okay))
 #f) ⇒ okay

In this sample expansion, if1, t1, and t2 represent identifiers to which if and t in the original expression and t in the expansion of or have been renamed.

The definition of a simplified version of cond below (simplified because it requires at least one output expression per clause and does not support the auxiliary keyword =>) demonstrates how auxiliary keywords such as else are recognized in the input to a transformer, via inclusion in the list of literals.

(define-syntax cond
  (syntax-rules (else)
    ((_ (else e1 e2 ...)) (begin e1 e2 ...))
    ((_ (e0 e1 e2 ...)) (if e0 (begin e1 e2 ...)))
    ((_ (e0 e1 e2 ...) c1 c2 ...)
     (if e0 (begin e1 e2 ...) (cond c1 c2 ...)))))

Section 8.3. Syntax-Case Transformers

This section describes a more expressive mechanism for creating transformers, based on syntax-case, a generalized version of syntax-rules. This mechanism permits more complex transformations to be specified, including transformations that "bend" lexical scoping in a controlled manner, allowing a much broader class of syntactic extensions to be defined. Any transformer that may be defined using syntax-rules may be rewritten easily to use syntax-case instead; in fact, syntax-rules itself may be defined as a syntactic extension in terms of syntax-case, as demonstrated within the description of syntax below.

With this mechanism, transformers are procedures of one argument. The argument is a syntax object representing the form to be processed. The return value is a syntax object representing the output form. A syntax object contains contextual information about a form in addition to its structure. This contextual information is used by the expander to maintain lexical scoping.

A syntax object representing an identifier is itself referred to as an identifier; thus, the term identifier may refer either to the syntactic entity (symbol, variable, or keyword) or to the concrete representation of the syntactic entity as a syntax object. It is rarely necessary to distinguish the two uses.

Transformers destructure their input with syntax-case and rebuild their output with syntax. These two forms alone are sufficient for defining many syntactic extensions, including any that can be defined using syntax-rules. They are described below along with a set of additional forms and procedures that provide added functionality.

syntax: (syntax-case exp (literal ...) clause ...)
returns: see below

Each literal must be an identifier. Each clause must take one of the following two forms.

(pattern output-expression)
(pattern fender output-expression)

syntax-case patterns may be in any of the forms described in Section 8.2.

syntax-case first evaluates exp, then attempts to match the resulting value against the pattern from the first clause. This value is usually a syntax object, but it may be any Scheme object. If the value matches the pattern and no fender is present, output-expression is evaluated and its value returned as the value of the syntax-case expression. If the value does not match the pattern, the value is compared against the next clause, and so on. An error is signaled if the value does not match any of the patterns.

If the optional fender is present, it serves as an additional constraint on acceptance of a clause. If the value of the syntax-case exp matches the pattern for a given clause, the corresponding fender is evaluated. If fender evaluates to a true value, the clause is accepted; otherwise, the clause is rejected as if the input had failed to match the pattern. Fenders are logically a part of the matching process, i.e., they specify additional matching constraints beyond the basic structure of an expression.

Pattern variables contained within a clause's pattern are bound to the corresponding pieces of the input value within the clause's fender (if present) and output-expression. Pattern variables occupy the same name space as program variables and keywords; pattern variable bindings created by syntax-case can shadow (and be shadowed by) program variable and keyword bindings as well as other pattern variable bindings. Pattern variables, however, can be referenced only within syntax expressions.

See the examples following the description of syntax.

syntax: (syntax template)
returns: see below

A syntax expression is like a quote expression except that the values of pattern variables appearing within template are inserted into template, and contextual information associated both with the input and with the template is retained in the output to support lexical scoping. List and vector structures within the template become true lists or vectors (suitable for direct application of list or vector operations, like map or vector-ref) to the extent that the list or vector structures must be copied to insert the values of pattern variables. A syntax template is identical to a syntax-rules template and is treated similarly.

The definition of or below is equivalent to the one given in Section 8.2 except that it employs syntax-case and syntax in place of syntax-rules.

(define-syntax or
  (lambda (x)
    (syntax-case x ()
      ((_) (syntax #f))
      ((_ e) (syntax e))
      ((_ e1 e2 e3 ...)
       (syntax (let ((t e1)) (if t t (or e2 e3 ...))))))))

In this version, the lambda expression that produces the transformer is explicit, as are the syntax forms in the output part of each clause. Any syntax-rules form can be expressed with syntax-case by making the lambda expression and syntax expressions explicit. This observation leads to the following definition of syntax-rules in terms of syntax-case.

(define-syntax syntax-rules
  (lambda (x)
    (syntax-case x ()
      ((_ (i ...) ((keyword . pattern) template) ...)
       (syntax (lambda (x)
                 (syntax-case x (i ...)
                   ((dummy . pattern) (syntax template))
                   ...)))))))

The unreferenced pattern variable dummy is used in place of each keyword since the first position of each syntax-rules pattern is always ignored.

Since the lambda and syntax expressions are implicit in a syntax-rules form, definitions expressed with syntax-rules are often shorter than the equivalent definitions expressed with syntax-case. The choice of which to use when either suffices is a matter of taste, but many transformers that can be written easily with syntax-case cannot be written easily or at all with syntax-rules (see Section 8.4).

procedure: (identifier? obj)
returns: #t if obj is an identifier, #f otherwise

identifier? is often used within fenders to verify that certain subforms of an input form are identifiers, as in the definition of unnamed let below.

(define-syntax let
  (lambda (x)
    (define ids?
      (lambda (ls)
        (or (null? ls)
            (and (identifier? (car ls))
                 (ids? (cdr ls))))))
    (syntax-case x ()
      ((_ ((i v) ...) e1 e2 ...)
       (ids? (syntax (i ...)))
       (syntax ((lambda (i ...) e1 e2 ...) v ...))))))

Syntactic extensions ordinarily take the form (keyword subform ...), but the syntax-case system permits them to take the form of singleton identifiers as well. For example, the keyword pcar in the expression below may be used either as an identifier (in which case it expands into a call to car) or as a structured form (in which case it expands into a call to set-car!).

(let ((p (cons 0 #f)))
  (define-syntax pcar
    (lambda (x)
      (syntax-case x ()
        (_ (identifier? x) (syntax (car p)))
        ((_ v) (syntax (set-car! p v))))))
  (let ((a pcar))
    (pcar 1)
    (list a pcar))) ⇒ (0 1)

The fender (identifier? x) is used to recognize the singleton identifier case.

procedure: (free-identifier=? identifier₁ identifier₂)
procedure: (bound-identifier=? identifier₁ identifier₂)
returns: see below

Symbolic names alone do not distinguish identifiers unless the identifiers are to be used only as symbolic data. The predicates free-identifier=? and bound-identifier=? are used to compare identifiers according to their intended use as free references or bound identifiers in a given context.

free-identifier=? is used to determine whether two identifiers would be equivalent if they were to appear as free identifiers in the output of a transformer. Because identifier references are lexically scoped, this means that (free-identifier=? id₁ id₂) is true if and only if the identifiers id₁ and id₂ refer to the same lexical or top-level binding. (For this comparison, all variables are assumed to have top-level bindings, whether defined yet or not.) Literal identifiers (auxiliary keywords) appearing in syntax-case patterns (such as else in case and cond) are matched with free-identifier=?.

Similarly, bound-identifier=? is used to determine if two identifiers would be equivalent if they were to appear as bound identifiers in the output of a transformer. In other words, if bound-identifier=? returns true for two identifiers, a binding for one will capture references to the other within its scope. In general, two identifiers are bound-identifier=? only if both are present in the original program or both are introduced by the same transformer application (perhaps implicitly; see datum->syntax-object). bound-identifier=? can be used for detecting duplicate identifiers in a binding construct or for other preprocessing of a binding construct that requires detecting instances of the bound identifiers.

The definition below is equivalent to the earlier definition of a simplified version of cond with syntax-rules, except that else is recognized via an explicit call to free-identifier=? within a fender rather than via inclusion in the literals list.

(define-syntax cond
  (lambda (x)
    (syntax-case x ()
      ((_ (e0 e1 e2 ...))
       (and (identifier? (syntax e0))
            (free-identifier=? (syntax e0) (syntax else)))
       (syntax (begin e1 e2 ...)))
      ((_ (e0 e1 e2 ...)) (syntax (if e0 (begin e1 e2 ...))))
      ((_ (e0 e1 e2 ...) c1 c2 ...)
       (syntax (if e0 (begin e1 e2 ...) (cond c1 c2 ...)))))))

With either definition of cond, else is not recognized as an auxiliary keyword if an enclosing lexical binding for else exists. For example,

(let ((else #f)) (cond (else (write "oops"))))

does not write "oops", since else is bound lexically and is therefore not the same else that appears in the definition of cond.

The following definition of unnamed let uses bound-identifier=? to detect duplicate identifiers.

(define-syntax let
  (lambda (x)
    (define ids?
      (lambda (ls)
        (or (null? ls)
            (and (identifier? (car ls))
                 (ids? (cdr ls))))))
    (define unique-ids?
      (lambda (ls)
        (or (null? ls)
            (and (let notmem? ((x (car ls)) (ls (cdr ls)))
                   (or (null? ls)
                       (and (not (bound-identifier=? x (car ls)))
                            (notmem? x (cdr ls)))))
                 (unique-ids? (cdr ls))))))
    (syntax-case x ()
      ((_ ((i v) ...) e1 e2 ...)
       (and (ids? (syntax (i ...)))
            (unique-ids? (syntax (i ...))))
       (syntax ((lambda (i ...) e1 e2 ...) v ...))))))

With the definition of let above, the expression

(let ((a 3) (a 4)) (+ a a))

results in a syntax error, whereas

(let-syntax ((dolet (lambda (x)
                      (syntax-case x ()
                        ((_ b)
                         (syntax (let ((a 3) (b 4)) (+ a b))))))))
  (dolet a))

evaluates to 7 since the identifier a introduced by dolet and the identifier a extracted from the input form are not bound-identifier=?. Since both occurrences of a, however, if left as free references, would refer to the same (top-level) binding for a, free-identifier=? would not distinguish them.

Two identifiers that are free-identifier=? may not be bound-identifier=?. An identifier introduced by a transformer may refer to the same enclosing binding as an identifier not introduced by the transformer, but an introduced binding for one will not capture references to the other. On the other hand, identifiers that are bound-identifier=? are free-identifier=?, as long as the identifiers have valid bindings in the context where they are compared.
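The relationship between the two predicates can be observed with a small transformer that compares its argument against an identifier a appearing in its own text. The keyword classify-against-a and the result symbols are hypothetical, and the sketch assumes that no top-level keyword binding for a exists. Fenders select the clause, so only forms already described are used.

```scheme
(define-syntax classify-against-a
  (lambda (x)
    (syntax-case x ()
      ;; a binding for one identifier would capture the other
      ((_ id)
       (and (identifier? (syntax id))
            (bound-identifier=? (syntax id) (syntax a)))
       (syntax 'bound-equal))
      ;; same binding as free references, but no capture
      ((_ id)
       (and (identifier? (syntax id))
            (free-identifier=? (syntax id) (syntax a)))
       (syntax 'free-equal-only))
      ((_ id) (syntax 'different)))))

(classify-against-a a)              ; => free-equal-only
(classify-against-a b)              ; => different
(let ((a 1)) (classify-against-a a)) ; => different
```

In the first use, the a from the input and the a in the transformer both refer to the top-level a, but only one of them was introduced by the transformer, so they are not bound-identifier=?. In the last use, the input a refers to the let-bound variable, so the two no longer share a binding.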

syntax: (with-syntax ((pattern val) ...) exp₁ exp₂ ...)
returns: the value of the last exp_i

It is sometimes useful to construct a transformer's output in separate pieces, then put the pieces together. with-syntax facilitates this by allowing the creation of local pattern bindings.

pattern is identical in form to a syntax-case pattern. The value of each val is computed and destructured according to the corresponding pattern, and pattern variables within the pattern are bound as with syntax-case to appropriate portions of the value within exp₁ exp₂ ....

with-syntax may be defined as a syntactic extension in terms of syntax-case.

(define-syntax with-syntax
  (lambda (x)
    (syntax-case x ()
      ((_ ((p e0) ...) e1 e2 ...)
       (syntax (syntax-case (list e0 ...) ()
                 ((p ...) (begin e1 e2 ...))))))))

The following definitions of full cond and case demonstrate the use of with-syntax to support transformers that employ recursion internally to construct their output.

(define-syntax cond
  (lambda (x)
    (syntax-case x ()
      ((_ c1 c2 ...)
       (let f ((c1 (syntax c1)) (cmore (syntax (c2 ...))))
         (if (null? cmore)
             (syntax-case c1 (else =>)
               ((else e1 e2 ...) (syntax (begin e1 e2 ...)))
               ((e0) (syntax (let ((t e0)) (if t t))))
               ((e0 => e1) (syntax (let ((t e0)) (if t (e1 t)))))
               ((e0 e1 e2 ...) (syntax (if e0 (begin e1 e2 ...)))))
             (with-syntax ((rest (f (car cmore) (cdr cmore))))
               (syntax-case c1 (=>)
                 ((e0) (syntax (let ((t e0)) (if t t rest))))
                 ((e0 => e1) (syntax (let ((t e0)) (if t (e1 t) rest))))
                 ((e0 e1 e2 ...)
                  (syntax (if e0 (begin e1 e2 ...) rest)))))))))))

(define-syntax case
  (lambda (x)
    (syntax-case x ()
      ((_ e c1 c2 ...)
       (with-syntax ((body
                      (let f ((c1 (syntax c1)) (cmore (syntax (c2 ...))))
                        (if (null? cmore)
                            (syntax-case c1 (else)
                              ((else e1 e2 ...) (syntax (begin e1 e2 ...)))
                              (((k ...) e1 e2 ...)
                               (syntax (if (memv t '(k ...))
                                           (begin e1 e2 ...)))))
                            (with-syntax ((rest (f (car cmore) (cdr cmore))))
                              (syntax-case c1 ()
                                (((k ...) e1 e2 ...)
                                 (syntax (if (memv t '(k ...))
                                             (begin e1 e2 ...)
                                             rest)))))))))
         (syntax (let ((t e)) body)))))))

procedure: (syntax-object->datum obj)
returns: obj stripped of syntactic information

The procedure syntax-object->datum strips all syntactic information from a syntax object and returns the corresponding Scheme "datum." Identifiers stripped in this manner are converted to their symbolic names, which can then be compared with eq?. Thus, a predicate symbolic-identifier=? might be defined as follows.

(define symbolic-identifier=?
  (lambda (x y)
    (eq? (syntax-object->datum x)
         (syntax-object->datum y))))

Two identifiers that are free-identifier=? are symbolic-identifier=?; in order to refer to the same binding, two identifiers must have the same name. The converse is not always true, since two identifiers may have the same name but different bindings.
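Stripping is also useful when a transformer needs to inspect or compute with a constant that appears in its input. The following sketch, built around the hypothetical keyword static-double, uses syntax-object->datum in a fender to detect a literal number and doubles it at expansion time. It assumes, as in the syntax-case system described here, that with-syntax accepts an ordinary Scheme number as a value. Any other operand falls through to the second clause and is doubled at run time.

```scheme
(define-syntax static-double
  (lambda (x)
    (syntax-case x ()
      ;; literal number: compute the result during expansion
      ((_ n)
       (number? (syntax-object->datum (syntax n)))
       (with-syntax ((result (* 2 (syntax-object->datum (syntax n)))))
         (syntax result)))
      ;; anything else: leave the multiplication in the output
      ((_ e) (syntax (* 2 e))))))

(static-double 21)      ; => 42
(static-double (+ 1 2)) ; => 6
```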

procedure: (datum->syntax-object template-identifier obj)
returns: a syntax object

datum->syntax-object constructs a syntax object from obj that contains the same contextual information as template-identifier, with the effect that the syntax object behaves as if it were introduced into the code when template-identifier was introduced. The template identifier is often the keyword of an input form, extracted from the form, and the object is often a symbol naming an identifier to be constructed.

datum->syntax-object allows a transformer to "bend" lexical scoping rules by creating implicit identifiers that behave as if they were present in the input form, thus permitting the definition of syntactic extensions that introduce visible bindings for or references to identifiers that do not appear explicitly in the input form. For example, we can define a loop expression that binds the variable break to an escape procedure within the loop body.

(define-syntax loop
  (lambda (x)
    (syntax-case x ()
      ((k e ...)
       (with-syntax ((break (datum->syntax-object (syntax k) 'break)))
         (syntax (call-with-current-continuation
                   (lambda (break)
                     (let f () e ... (f))))))))))

(let ((n 3) (ls '()))
  (loop
    (if (= n 0) (break ls))
    (set! ls (cons 'a ls))
    (set! n (- n 1)))) ⇒ (a a a)

Were we to define loop as

(define-syntax loop
  (lambda (x)
    (syntax-case x ()
      ((_ e ...)
       (syntax (call-with-current-continuation
                 (lambda (break)
                   (let f () e ... (f)))))))))

the variable break would not be visible in e ....
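The same technique applies to any implicitly bound identifier. As a second sketch, the hypothetical keyword aif below binds the variable it to the value of its test, making the value available in both branches. The names aif, it, conseq, and altern are choices made for this example and do not appear elsewhere in the text.

```scheme
(define-syntax aif
  (lambda (x)
    (syntax-case x ()
      ((k test conseq altern)
       ;; build "it" with the context of the keyword k so that
       ;; references to it in the input see the binding below
       (with-syntax ((it (datum->syntax-object (syntax k) 'it)))
         (syntax (let ((it test))
                   (if it conseq altern))))))))

(aif (assv 2 '((1 . one) (2 . two)))
     (cdr it)
     'none) ; => two

(aif (assv 3 '((1 . one) (2 . two)))
     (cdr it)
     'none) ; => none
```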

It is also useful for obj to represent an arbitrary Scheme form, as demonstrated by the following definition of include, an expand-time version of load.

(define-syntax include
  (lambda (x)
    (define read-file
      (lambda (fn k)
        (let ((p (open-input-file fn)))
          (let f ((x (read p)))
            (if (eof-object? x)
                (begin (close-input-port p) '())
                (cons (datum->syntax-object k x)
                      (f (read p))))))))
    (syntax-case x ()
      ((k filename)
       (let ((fn (syntax-object->datum (syntax filename))))
         (with-syntax (((exp ...) (read-file fn (syntax k))))
           (syntax (begin exp ...))))))))

(include "filename") expands into a begin expression containing the forms found in the file named by "filename". For example, if the file f-def.ss contains the expression (define f (lambda () x)), the expression

(let ((x "okay")) (include "f-def.ss") (f))

evaluates to "okay".

The definition of include uses datum->syntax-object to convert the objects read from the file into syntax objects in the proper lexical context, so that identifier references and definitions within those expressions are scoped where the include form appears.

procedure: (generate-temporaries list)
returns: a list of distinct generated identifiers

Transformers can introduce a fixed number of identifiers into their output by naming each identifier. In some cases, however, the number of identifiers to be introduced depends upon some characteristic of the input expression. A straightforward definition of letrec, for example, requires as many temporary identifiers as there are binding pairs in the input expression. The procedure generate-temporaries is used to construct lists of temporary identifiers.

list may be any list; its contents are not important. The number of temporaries generated is the number of elements in list. Each temporary is guaranteed to be different from all other identifiers.

A definition of letrec that uses generate-temporaries is shown below.

(define-syntax letrec
  (lambda (x)
    (syntax-case x ()
      ((_ ((i v) ...) e1 e2 ...)
       (with-syntax (((t ...) (generate-temporaries (syntax (i ...)))))
         (syntax (let ((i #f) ...)
                   (let ((t v) ...)
                     (set! i t) ...
                     (let () e1 e2 ...)))))))))
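As a further, hypothetical illustration of the same pattern, the sketch below defines a parallel assignment form, here called pset!, which is not part of the original text. It needs one temporary per assignment so that every right-hand side is evaluated before any variable is changed. The sketch assumes that at least one (var val) pair is supplied.

```scheme
;; pset! is a hypothetical name; it assigns all variables "in parallel".
;; One temporary is generated per (var val) pair.
(define-syntax pset!
  (lambda (x)
    (syntax-case x ()
      ((_ (var val) ...)
       (with-syntax (((t ...) (generate-temporaries (syntax (var ...)))))
         (syntax (let ((t val) ...)
                   (set! var t) ...)))))))

;; Swap two variables without naming a temporary explicitly.
(let ((a 1) (b 2))
  (pset! (a b) (b a))
  (list a b))   ; ⇒ (2 1)
```

Because the temporaries are guaranteed to differ from all other identifiers, they cannot capture a, b, or any variable mentioned in the right-hand sides.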

Any transformer that uses generate-temporaries in this fashion can be rewritten to avoid using it, albeit with a loss of clarity. The trick is to use a recursively defined intermediate form that generates one temporary per expansion step and completes the expansion after enough temporaries have been generated. Here is a definition of let-values (see page 115) that uses this technique to support multiple sets of bindings.

(define-syntax let-values
  (syntax-rules ()
    ((_ () f1 f2 ...) (let () f1 f2 ...))
    ((_ ((fmls1 expr1) (fmls2 expr2) ...) f1 f2 ...)
     (lvhelp fmls1 () () expr1 ((fmls2 expr2) ...) (f1 f2 ...)))))

(define-syntax lvhelp
  (syntax-rules ()
    ((_ (x1 . fmls) (x ...) (t ...) e m b)
     (lvhelp fmls (x ... x1) (t ... tmp) e m b))
    ((_ () (x ...) (t ...) e m b)
     (call-with-values
       (lambda () e)
       (lambda (t ...)
         (let-values m (let ((x t) ...) . b)))))
    ((_ xr (x ...) (t ...) e m b)
     (call-with-values
       (lambda () e)
       (lambda (t ... . tmpr)
         (let-values m (let ((x t) ... (xr tmpr)) . b)))))))

The implementation of lvhelp is complicated by the need to evaluate all of the right-hand-side expressions before creating any of the bindings and by the need to support improper formals lists.

A definition of letrec that does not use generate-temporaries is left as an exercise for the reader.

Section 8.4. Examples

This section presents a series of illustrative syntactic extensions defined with either syntax-rules or syntax-case, starting with a few simple but useful syntactic extensions and ending with a fairly complex mechanism for defining structures with automatically generated constructors, predicates, field accessors, and field setters.

The simplest example in this section is the following definition of rec. rec is a syntactic extension that permits internally recursive anonymous (not externally named) procedures to be created with minimal effort.

(define-syntax rec
  (syntax-rules ()
    ((_ x e) (letrec ((x e)) x))))

(map (rec sum
       (lambda (x)
         (if (= x 0)
             0
             (+ x (sum (- x 1))))))
     '(0 1 2 3 4 5)) ⇒ (0 1 3 6 10 15)

Using rec, we can define the full let (both unnamed and named) as follows.

(define-syntax let
  (syntax-rules ()
    ((_ ((x v) ...) e1 e2 ...)
     ((lambda (x ...) e1 e2 ...) v ...))
    ((_ f ((x v) ...) e1 e2 ...)
     ((rec f (lambda (x ...) e1 e2 ...)) v ...))))

Of course, we can also define let directly in terms of letrec, although the definition is a bit less clear.

(define-syntax let
  (syntax-rules ()
    ((_ ((x v) ...) e1 e2 ...)
     ((lambda (x ...) e1 e2 ...) v ...))
    ((_ f ((x v) ...) e1 e2 ...)
     ((letrec ((f (lambda (x ...) e1 e2 ...))) f) v ...))))

These definitions rely upon the fact that the first pattern cannot match a named let, since the first subform of a named let must be an identifier, not a list of bindings. The following definition uses a fender to make this check more robust.

(define-syntax let
  (lambda (x)
    (syntax-case x ()
      ((_ ((x v) ...) e1 e2 ...)
       (syntax ((lambda (x ...) e1 e2 ...) v ...)))
      ((_ f ((x v) ...) e1 e2 ...)
       (identifier? (syntax f))
       (syntax ((rec f (lambda (x ...) e1 e2 ...)) v ...))))))

Of course, to be completely robust, the ids? and all-ids? checks employed in the definition of unnamed let in Section 8.3 should be employed here as well.
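A minimal sketch of such a checked definition follows. Since the Section 8.3 helpers are not repeated here, the two helpers below are written from scratch and may differ from the ones the text refers to; distinct-ids? in particular is an assumed name for a check that no variable is bound twice. The form is called my-let (also a hypothetical name) to avoid redefining let.

```scheme
;; my-let is a hypothetical name, chosen to avoid redefining let.
(define-syntax my-let
  (lambda (x)
    ;; #t if every element of ls is an identifier
    (define ids?
      (lambda (ls)
        (or (null? ls)
            (and (identifier? (car ls))
                 (ids? (cdr ls))))))
    ;; #t if no two identifiers in ls would conflict as bound variables
    (define distinct-ids?
      (lambda (ls)
        (or (null? ls)
            (and (let check ((rest (cdr ls)))
                   (or (null? rest)
                       (and (not (bound-identifier=? (car ls) (car rest)))
                            (check (cdr rest)))))
                 (distinct-ids? (cdr ls))))))
    (syntax-case x ()
      ;; unnamed let
      ((_ ((v e) ...) b1 b2 ...)
       (and (ids? (syntax (v ...)))
            (distinct-ids? (syntax (v ...))))
       (syntax ((lambda (v ...) b1 b2 ...) e ...)))
      ;; named let
      ((_ f ((v e) ...) b1 b2 ...)
       (and (identifier? (syntax f))
            (ids? (syntax (v ...)))
            (distinct-ids? (syntax (v ...))))
       (syntax ((letrec ((f (lambda (v ...) b1 b2 ...))) f) e ...))))))

(my-let ((a 1) (b 2)) (+ a b))   ; ⇒ 3

(my-let loop ((i 0) (acc '()))
  (if (= i 3)
      acc
      (loop (+ i 1) (cons i acc))))   ; ⇒ (2 1 0)
```

With these fenders in place, a malformed use such as (my-let ((a 1) (a 2)) a) matches neither clause and is reported as a syntax error by the expander rather than being passed on to lambda.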

Both variants of let are easily described by simple one-line patterns, but do requires a bit more work. The precise syntax of do cannot be expressed directly with a single pattern because some of the bindings in a do expression's binding list may take the form (var val) while others take the form (var val update). The following definition of do uses syntax-case internally to parse the bindings separately from the overall form.

(define-syntax do
  (lambda (x)
    (syntax-case x ()
      ((_ (binding ...) (test res ...) exp ...)
       (with-syntax ((((var val update) ...)
                      (map (lambda (b)
                             (syntax-case b ()
                               ((var val)
                                (syntax (var val var)))
                               ((var val update)
                                (syntax (var val update)))))
                           (syntax (binding ...)))))
         (syntax (let doloop ((var val) ...)
                   (if test
                       (begin (if #f #f) res ...)
                       (begin exp ... (doloop update ...))))))))))

The odd looking expression (if #f #f) is inserted before the result expressions res ... in case no result expressions are provided, since begin requires at least one subexpression. The value of (if #f #f) is unspecified, which is what we want since the value of do is unspecified if no result expressions are provided. At the expense of a bit more code, we could use syntax-case to determine whether any result expressions are provided and to produce a loop with either a one- or two-armed if as appropriate. The resulting expansion would be cleaner but semantically equivalent.
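One way the alternative might look is sketched below. The form is named my-do, a hypothetical name chosen to avoid redefining do, and the test for missing result expressions uses null? on the list produced by (syntax (res ...)), in the same way other examples in this chapter apply list operations to such lists.

```scheme
;; my-do is a hypothetical name. It behaves like the definition of do
;; above but chooses between a one-armed and a two-armed if.
(define-syntax my-do
  (lambda (x)
    (syntax-case x ()
      ((_ (binding ...) (test res ...) exp ...)
       (with-syntax ((((var val update) ...)
                      (map (lambda (b)
                             (syntax-case b ()
                               ((var val) (syntax (var val var)))
                               ((var val update) (syntax (var val update)))))
                           (syntax (binding ...)))))
         (if (null? (syntax (res ...)))
             ;; No result expressions: a one-armed if suffices.
             (syntax (let doloop ((var val) ...)
                       (if (not test)
                           (begin exp ... (doloop update ...)))))
             ;; Otherwise evaluate the results when the test succeeds.
             (syntax (let doloop ((var val) ...)
                       (if test
                           (begin res ...)
                           (begin exp ... (doloop update ...)))))))))))

(my-do ((i 0 (+ i 1))
        (acc '() (cons i acc)))
       ((= i 3) acc))   ; ⇒ (2 1 0)
```

The decision is made once, at expansion time, so the generated loop contains no (if #f #f) placeholder.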

As mentioned in Section 8.2, ellipses lose their special meaning within templates of the form (... template). This fact allows syntactic extensions to expand into syntax definitions containing ellipses. This usage is illustrated by the definition below of be-like-begin.

(define-syntax be-like-begin
  (syntax-rules ()
    ((_ name)
     (define-syntax name
       (syntax-rules ()
         ((_ e0 e1 (... ...))
          (begin e0 e1 (... ...))))))))

With be-like-begin defined in this manner, (be-like-begin sequence) has the same effect as the following definition of sequence.

(define-syntax sequence
  (syntax-rules ()
    ((_ e0 e1 ...) (begin e0 e1 ...))))

That is, a sequence form becomes equivalent to a begin form so that, for example:

(sequence (display "Say what?") (newline))

prints "Say what?" followed by a newline.

The following example shows how one might restrict if expressions within a given expression to require the "else" (alternative) subexpression by defining the local if in terms of the top-level if.

(let-syntax ((if (lambda (x)
                   (syntax-case x ()
                     ((_ e1 e2 e3)
                      (syntax (if e1 e2 e3)))))))
  (if 1 2 3)) ⇒ 2

(let-syntax ((if (lambda (x)
                   (syntax-case x ()
                     ((_ e1 e2 e3)
                      (syntax (if e1 e2 e3)))))))
  (if 1 2)) ⇒ error

Although this local definition of if looks simple enough, there are a few subtle ways in which an attempt to write it might go wrong. If letrec-syntax were used in place of let-syntax, the identifier if inserted into the output would refer to the local if rather than the top-level if, and expansion would loop indefinitely.

Similarly, if the underscore were replaced with the identifier if, expansion would again loop indefinitely. The if appearing in the template (if e1 e2 e3) would be treated as a pattern variable bound to the corresponding identifier if from the input form, which denotes the local version of if.

Placing if in the list of literals in an attempt to patch up the latter version would not work either. This would cause syntax-case to compare the literal if in the pattern, which would be scoped outside the let-syntax expression, with the if in the input expression, which would be scoped inside the let-syntax. Since they would not refer to the same binding, they would not be free-identifier=?, and a syntax error would result.

The conventional use of underscore ( _ ) helps the programmer avoid situations like these in which the wrong identifier is matched against or inserted by accident.

It is an error to generate a reference to an identifier that is not present within the context of an input form, which can happen if the "closest enclosing lexical binding" for an identifier inserted into the output of a transformer does not also enclose the input form. For example,

(let-syntax ((divide (lambda (x)
                       (let ((/ +))
                         (syntax-case x ()
                           ((_ e1 e2)
                            (syntax (/ e1 e2))))))))
  (let ((/ *))
    (divide 2 1)))

results in an error to the effect that / is referenced in an invalid context, since the occurrence of / in the output of divide is a reference to the variable / bound by the let expression within the transformer.

As noted in the description of identifier? in Section 8.3, singleton identifiers can be treated as syntactic extensions and expanded into arbitrary forms. Often, it is necessary to treat the case where an identifier appears in the first position of a structured expression differently from the case where it appears elsewhere, as in the pcar example given in the description for identifier?. In other situations, both cases must or may be treated the same. The form identifier-syntax defined below can make doing so more convenient.

(define-syntax identifier-syntax
  (lambda (x)
    (syntax-case x ()
      ((_ e)
       (syntax
         (lambda (x)
           (syntax-case x ()
             (id
              (identifier? (syntax id))
              (syntax e))
             ((id x (... ...))
              (identifier? (syntax id))
              (syntax (e x (... ...)))))))))))

(let ((x 0))
  (define-syntax x++
    (identifier-syntax
      (let ((t x))
        (set! x (+ t 1))
        t)))
  (let ((a x++))
    (list a x))) ⇒ (0 1)

The following example uses identifier-syntax, datum->syntax-object, and local syntax definitions to define a form of method, one of the basic building blocks of object-oriented programming (OOP) systems. A method expression is similar to a lambda expression, except that in addition to the formal parameters and body, a method expression also contains a list of instance variables (ivar ...). When a method is invoked, it is always passed an object (instance), represented as a vector of fields corresponding to the instance variables, and zero or more additional arguments. Within the method body, the object is bound implicitly to the identifier self and the additional arguments are bound to the formal parameters. The fields of the object may be accessed or altered within the method body via instance variable references or assignments.

(define-syntax method
  (lambda (x)
    (syntax-case x ()
      ((k (ivar ...) formals e1 e2 ...)
       (with-syntax (((index ...)
                      (let f ((i 0) (ls (syntax (ivar ...))))
                        (if (null? ls)
                            '()
                            (cons i (f (+ i 1) (cdr ls))))))
                     (self (datum->syntax-object (syntax k) 'self))
                     (set! (datum->syntax-object (syntax k) 'set!)))
         (syntax
           (lambda (self . formals)
             (let-syntax ((ivar (identifier-syntax (vector-ref self index)))
                          ...)
               (let-syntax ((set!
                             (syntax-rules (ivar ...)
                               ((_ ivar e) (vector-set! self index e))
                               ...
                               ((_ x e) (set! x e)))))
                 e1 e2 ...)))))))))

Local bindings for ivar ... and for set! make the fields of the object appear to be ordinary variables, with references and assignments translated into calls to vector-ref and vector-set!. datum->syntax-object is used to make the introduced bindings of self and set! visible in the method body. Nested let-syntax expressions are needed so that the identifiers ivar ... serving as auxiliary keywords for the local version of set! are scoped properly. The examples below demonstrate simple uses of method.

(let ((m (method (a) (x) (list a x self))))
  (m #(1) 2)) ⇒ (1 2 #(1))

(let ((m (method (a) (x)
           (set! a x)
           (set! x (+ a x))
           (list a x self))))
  (m #(1) 2)) ⇒ (2 4 #(2))

In a complete OOP system based on method, the instance variables ivar ... would likely be drawn from class declarations, not listed explicitly in the method forms, although the same techniques would be used to make instance variables appear as ordinary variables within method bodies.

The next example defines a define-integrable form that is similar to define for procedure definitions except that it causes the code for the procedure to be integrated, or inserted, wherever a direct call to the procedure is found. No semantic difference is visible between procedures defined with define-integrable and those defined with define, except that a top-level define-integrable form must appear before the first reference to the defined identifier, and syntactic extensions within the body of the defined procedure are expanded at the point of call. Lexical scoping is preserved, the actual parameters in an integrated call are evaluated once and at the proper time, integrable procedures may be used as first-class values, and recursive procedures do not cause indefinite recursive expansion.

A define-integrable form has the following syntax.

(define-integrable name lambda-expression)

A define-integrable form expands into a pair of definitions: a syntax definition of name and a variable definition of a generated name, residual-name. The transformer for name converts apparent calls to name into direct calls to lambda-expression. Since the resulting forms are merely direct lambda applications (the equivalent of let expressions), the actual parameters are evaluated exactly once and before evaluation of the procedure's body, as required. All other references to name are replaced with references to residual-name. The definition of residual-name binds it to the value of lambda-expression, which allows the procedure to be used as a first-class value. Within lambda-expression, wherever it appears, name is rebound to a transformer that expands all references into references to residual-name. This allows the procedure to be recursive without causing indefinite expansion, and the use of fluid-let-syntax for the rebinding likewise prevents indefinite expansion from indirect recursion among integrable procedures. Nothing special is done by define-integrable to maintain lexical scoping, since lexical scoping is maintained automatically by the expander.

(define-syntax define-integrable
  (lambda (x)
    (define make-residual-name
      (lambda (name)
        (datum->syntax-object name
          (string->symbol
            (string-append "residual-"
              (symbol->string (syntax-object->datum name)))))))
    (syntax-case x (lambda)
      ((_ name (lambda formals form1 form2 ...))
       (identifier? (syntax name))
       (with-syntax ((xname (make-residual-name (syntax name))))
         (syntax
           (begin
             (define-syntax name
               (lambda (x)
                 (syntax-case x ()
                   (_ (identifier? x) (syntax xname))
                   ((_ arg (... ...))
                    (syntax
                      ((fluid-let-syntax
                         ((name (identifier-syntax xname)))
                         (lambda formals form1 form2 ...))
                       arg (... ...)))))))
             (define xname
               (fluid-let-syntax ((name (identifier-syntax xname)))
                 (lambda formals form1 form2 ...))))))))))

Some Scheme compilers integrate procedures automatically when it is appropriate to do so. Compilers cannot normally integrate procedures bound at top-level, however, since code that assigns top-level variables can be introduced into the system (via eval or load) at any time. define-integrable can be used to force the integration of procedures bound at top-level, even if the integration of locally bound procedures is left to the compiler.

The final example of this section defines a simple structure definition facility that represents structures as vectors with named fields. Structures are defined with define-structure, which takes the form:

(define-structure name field ...)

where name names the structure and field ... names its fields. define-structure expands into a series of generated definitions: a constructor make-name, a type predicate name?, and one accessor name-field and setter set-name-field! per field name.

(define-syntax define-structure
  (lambda (x)
    (define gen-id
      (lambda (template-id . args)
        (datum->syntax-object template-id
          (string->symbol
            (apply string-append
                   (map (lambda (x)
                          (if (string? x)
                              x
                              (symbol->string
                                (syntax-object->datum x))))
                        args))))))
    (syntax-case x ()
      ((_ name field ...)
       (with-syntax
         ((constructor (gen-id (syntax name) "make-" (syntax name)))
          (predicate (gen-id (syntax name) (syntax name) "?"))
          ((access ...)
           (map (lambda (x) (gen-id x (syntax name) "-" x))
                (syntax (field ...))))
          ((assign ...)
           (map (lambda (x) (gen-id x "set-" (syntax name) "-" x "!"))
                (syntax (field ...))))
          (structure-length (+ (length (syntax (field ...))) 1))
          ((index ...)
           (let f ((i 1) (ids (syntax (field ...))))
             (if (null? ids)
                 '()
                 (cons i (f (+ i 1) (cdr ids)))))))
         (syntax
           (begin
             (define constructor
               (lambda (field ...)
                 (vector 'name field ...)))
             (define predicate
               (lambda (x)
                 (and (vector? x)
                      (= (vector-length x) structure-length)
                      (eq? (vector-ref x 0) 'name))))
             (define access
               (lambda (x)
                 (vector-ref x index)))
             ...
             (define assign
               (lambda (x update)
                 (vector-set! x index update)))
             ...)))))))

The constructor accepts as many arguments as there are fields in the structure and creates a vector whose first element is the symbol name and whose remaining elements are the argument values. The type predicate returns true if its argument is a vector of the expected length whose first element is name.

Since a define-structure form expands into a begin containing definitions, it is itself a definition and can be used wherever definitions are valid.

The generated identifiers are created with datum->syntax-object to allow the identifiers to be visible where the define-structure form appears.

The examples below demonstrate the use of define-structure.

(define-structure tree left right)
(define t
  (make-tree
    (make-tree 0 1)
    (make-tree 2 3)))

t ⇒ #(tree #(tree 0 1) #(tree 2 3))
(tree? t) ⇒ #t
(tree-left t) ⇒ #(tree 0 1)
(tree-right t) ⇒ #(tree 2 3)
(set-tree-left! t 0)
t ⇒ #(tree 0 #(tree 2 3))

Since the bodies of the generated procedures are short and simple, it may be desirable to use define-integrable as defined above in place of define for some or all of the generated procedure definitions.

R. Kent Dybvig / The Scheme Programming Language, Third Edition
Copyright © 2003 The MIT Press. Electronically reproduced by permission.
Illustrations © 2003 Jean-Pierre Hébert
ISBN 0-262-54148-3 / LOC QA76.73.S34D93

http://www.scheme.com