Chapter 9. Syntactic Extension

The standard syntactic extension forms define-syntax, syntax-rules, let-syntax, and letrec-syntax are described in The Scheme Programming Language, Second Edition and in the Revised5 Report on Scheme [20]. This chapter describes syntax-case and related features. Together, these features allow the expression of a more general class of transformations than may be expressed via syntax-rules. The description of syntax-case appearing here overlaps significantly with Chapter 8 of The Scheme Programming Language, Second Edition [9], although a few additional features are documented here, while The Scheme Programming Language, Second Edition contains a section on examples (Section 8.4) that is not present here.

This chapter also contains a description of Chez Scheme's module facility, which is directly supported by the syntax-case expander.

A syntactic extension typically takes the form (keyword subform ...), where keyword is the identifier that names the syntactic extension. The syntax of each subform varies from one syntactic extension to another. Syntactic extensions can also take the form of improper lists or even singleton identifiers, although this is less common.

New syntactic extensions are defined by associating keywords with transformation procedures, or transformers. Syntactic extensions are defined globally using top-level define-syntax forms or within the scope of particular expressions using let-syntax, letrec-syntax, internal define-syntax, or fluid-let-syntax.

Syntactic extensions are expanded into core forms at the start of evaluation (before compilation or interpretation) by a syntax expander. The expander is invoked once for each top-level form in a program. If the expander encounters a syntactic extension, it invokes the associated transformer to expand the syntactic extension, then repeats the expansion process for the form returned by the transformer. If the expander encounters a core syntactic form, it recursively processes the subforms, if any, and reconstructs the form from the expanded subforms. Information about identifier bindings is maintained during expansion to enforce lexical scoping.


Section 9.1. Keyword Bindings

Keyword bindings may be established via the Revised5 Report standard forms define-syntax, let-syntax, or letrec-syntax. Although the Revised5 Report restricts define-syntax to top level, the syntax-case expander allows define-syntax to appear wherever other definitions may appear.

The syntax-case expander also permits temporary bindings to be established via fluid-let-syntax.


syntax: (fluid-let-syntax ((keyword exp) ...) form1 form2 ...)
returns: see explanation

Each exp must evaluate to a transformer. fluid-let-syntax is similar to let-syntax, except that instead of introducing new bindings for the keywords keyword ..., fluid-let-syntax temporarily alters the existing bindings for the keywords during the expansion of its body. That is, during the expansion of form1 form2 ..., the visible lexical (or top-level) binding for each keyword is temporarily replaced by a new association between the keyword and the corresponding transformer. This affects any reference to the keyword that resolves to the same lexical (or top-level) binding whether the reference occurs in the text of the body or is introduced during its expansion. In contrast, let-syntax captures only those references that occur within the text of its body.

The following example shows how fluid-let-syntax differs from let-syntax.

(let ((f (lambda (x) (+ x 1))))
  (let-syntax ((g (syntax-rules ()
                    ((_ x) (f x)))))
    (let-syntax ((f (syntax-rules ()
                      ((_ x) x))))
      (g 1))))  2

(let ((f (lambda (x) (+ x 1))))
  (let-syntax ((g (syntax-rules ()
                    ((_ x) (f x)))))
    (fluid-let-syntax ((f (syntax-rules ()
                            ((_ x) x))))
      (g 1))))  1

The two expressions are identical except that the inner let-syntax form in the first expression is a fluid-let-syntax form in the second. In the first expression, the f occurring in the expansion of (g 1) refers to the let-bound variable f, whereas in the second it refers to the keyword f by virtue of the fluid syntax binding for f.


Section 9.2. Syntax-Case

Transformers may be created via the Revised5 Report standard form syntax-rules. More general transformers may be defined via the syntax-case form and a set of related features supported by the syntax-case expander. The syntax-case expander permits complex transformations to be specified, including transformations that "bend" lexical scoping in a controlled manner, allowing a much broader class of syntactic extensions to be defined. Any transformer that may be defined using syntax-rules may be rewritten easily to use syntax-case instead; in fact, syntax-rules itself may be defined as a syntactic extension in terms of syntax-case, as demonstrated within the description of syntax below.

With this mechanism, transformers are procedures of one argument. The argument is a syntax object representing the form to be processed. The return value is a syntax object representing the output form. A syntax object contains contextual information about a form in addition to its structure. This contextual information is used by the expander to maintain lexical scoping.

A syntax object representing an identifier is itself referred to as an identifier; thus, the term identifier may refer either to the syntactic entity (symbol, variable, or keyword) or to the concrete representation of the syntactic entity as a syntax object. It is rarely necessary to distinguish the two uses.

Transformers destructure their input with syntax-case and rebuild their output with syntax. These two forms alone are sufficient for defining many syntactic extensions, including any that can be defined using syntax-rules. They are described below along with a set of additional forms and procedures that provide added functionality.


syntax: (syntax-case exp (literal ...) clause ...)
returns: see below

Each literal must be an identifier. Each clause must take one of the following two forms:

(pattern output-expression)
(pattern fender output-expression)

Patterns consist of list structure, vector structure, identifiers, and constants. Each identifier within a pattern is either a literal, a pattern variable, or an ellipsis. The identifier ... is an ellipsis. Any identifier other than ... is a literal if it appears in the list of literals (literal ...); otherwise, it is a pattern variable. Literals serve as auxiliary keywords, such as else in case and cond expressions. List and vector structure within a pattern specifies the basic structure required of the input, pattern variables specify arbitrary substructure, and literals and constants specify atomic pieces that must match exactly. Ellipses specify repeated occurrences of the subpatterns they follow.

syntax-case first evaluates exp, then attempts to match the resulting value against the pattern from the first clause. This value is usually a syntax object, but it may be any Scheme object. An input form F matches a pattern P if and only if

If the value of exp matches the pattern and no fender is present, output-expression is evaluated and its value returned as the value of the syntax-case expression. If the value of exp does not match the pattern, the value is compared against the next clause, and so on. An error is signaled if the value does not match any of the patterns.

If the optional fender is present, it serves as an additional constraint on acceptance of a clause. If the value of the syntax-case exp matches the pattern for a given clause, the corresponding fender is evaluated. If fender evaluates to a true value, the clause is accepted; otherwise, the clause is rejected as if the input had failed to match the pattern. Fenders are logically a part of the matching process, i.e., they specify additional matching constraints beyond the basic structure of an expression.

Pattern variables contained within a clause's pattern are bound to the corresponding pieces of the input value within the clause's fender (if present) and output-expression. Pattern variables occupy the same name space as program variables and keywords; pattern variable bindings created by syntax-case can shadow (and be shadowed by) program variable and keyword bindings as well as other pattern variable bindings. Pattern variables, however, can be referenced only within syntax expressions.

See the examples following the description of syntax.


syntax: (syntax template)
returns: see below

A syntax expression is like a quote expression except that the values of pattern variables appearing within template are inserted into template, and contextual information associated with any nonlist, nonvector items from the template is retained in the output. A syntax template is identical to a syntax-rules template and is treated similarly.

The definition of or below demonstrates the use of syntax-case.

(define-syntax or
  (lambda (x)
    (syntax-case x ()
      ((_) (syntax #f))
      ((_ e) (syntax e))
      ((_ e1 e2 e3 ...)
       (syntax (let ((t e1)) (if t t (or e2 e3 ...))))))))

The input patterns specify that the input must consist of the keyword and zero or more subexpressions. An underscore ( _ ), which is an ordinary pattern variable, is used by convention for the keyword position to remind the programmer and anyone reading the definition that the keyword position never fails to contain the expected keyword and need not be matched. If more than one subexpression is present (third clause), the expanded code must both test the value of the first subexpression and return the value if it is not false. In order to avoid evaluating the expression twice, the transformer introduces a binding for the temporary variable t.

The expansion algorithm maintains lexical scoping automatically by renaming local identifiers as necessary. Thus, the binding for t introduced by the transformer is visible only within code introduced by the transformer and not within subforms of the input. Similarly, the references to the identifiers let and if are unaffected by any bindings present in the context of the input. Thus, the expression

(let ((if #f))
  (let ((t 'okay))
    (or if t)))  okay

might be transformed into the equivalent expression below.

((lambda (if1)
   ((lambda (t1)
      ((lambda (t2)
         (if t2 t2 t1))
       if1))
    'okay))
 #f)  okay

In this sample expansion, if1, t1, and t2 represent identifiers to which if and t in the original expression and t in the expansion of or have been renamed.

Any syntax-rules form can be expressed with syntax-case by making the lambda and syntax expressions explicit. This observation leads to the following definition of syntax-rules in terms of syntax-case.

(define-syntax syntax-rules
  (lambda (x)
    (syntax-case x ()
      ((_ (i ...) ((keyword . pattern) template) ...)
       (syntax (lambda (x)
                 (syntax-case x (i ...)
                   ((dummy . pattern) (syntax template))
                   ...)))))))

The unreferenced pattern variable dummy is used in place of each keyword since the first position of each syntax-rules pattern is always ignored.

Since the lambda and syntax expressions are implicit in a syntax-rules form, definitions expressed with syntax-rules are often shorter than the equivalent definitions expressed with syntax-case. The choice of which to use when either suffices is a matter of taste, but many transformers that can be written easily with syntax-case cannot be written easily or at all with syntax-rules.


procedure: (identifier? obj)
returns: #t if obj is an identifier, #f otherwise

identifier? is often used within fenders to verify that certain subforms of an input form are identifiers, as in the definition of unnamed let below.

(define-syntax let
  (lambda (x)
    (define ids?
      (lambda (ls)
        (or (null? ls)
            (and (identifier? (car ls))
                 (ids? (cdr ls))))))
    (syntax-case x ()
      ((_ ((i v) ...) e1 e2 ...)
       (ids? (syntax (i ...)))
       (syntax ((lambda (i ...) e1 e2 ...) v ...))))))

Syntactic extensions ordinarily take the form (keyword subform ...), but the syntax-case system permits them to take the form of singleton identifiers as well. For example, the keyword pcar in the expression below may be used both as an identifier (in which case it expands into a call to car) or as a structured form (in which case it expands into a call to set-car!).

(let ((p (cons 0 #f)))
  (define-syntax pcar
    (lambda (x)
      (syntax-case x ()
        (_ (identifier? x) (syntax (car p)))
        ((_ v) (syntax (set-car! p v))))))
  (let ((a pcar))
    (pcar 1)
    (list a pcar)))  (0 1)

The fender (identifier? x) is used to recognize the singleton identifier case. Additional control over the expansion of singleton identifiers may be exercised with identifier-syntax, which is described below.


syntax: (identifier-syntax tmpl)
syntax: (identifier-syntax (id1 tmpl1) ((set! id2 e2) tmpl2))
returns: a transformer

When a keyword is bound to a transformer produced by the first form of identifier-syntax, references to the keyword within the scope of the binding are replaced by tmpl.

(let ()
  (define-syntax a (identifier-syntax car))
  (list (a '(1 2 3)) a))  (1 #<procedure car>)

This form of identifier-syntax may be expressed as a syntax-case transformer as follows:

(define-syntax identifier-syntax
  (lambda (x)
    (syntax-case x ()
      ((_ e)
       (syntax
         (lambda (x)
           (syntax-case x ()
             (id
              (identifier? (syntax id))
              (syntax e))
             ((id x (... ...))
              (identifier? (syntax id))
              (syntax (e x (... ...)))))))))))

With the first form of identifier-syntax, attempting to assign the associated keyword with set! fails with an "invalid syntax" message. The second, more general, form of identifier-syntax permits the transformer to determine what happens when set! is used.

(let ((x (list 0)))
  (define-syntax a
    (identifier-syntax
      (id (car x))
      ((set! id e) (set-car! x e))))
  (let ((before a))
    (set! a 1)
    (list before a x)))  (0 1 (1))

This form cannot be defined in terms of syntax-case and is thus supported as primitive by the syntax-case expander.

The second form of identifier-syntax allows a simpler definition of the method form described on page 180 in The Scheme Programming Language, Second Edition.

(define-syntax method
  (lambda (x)
    (syntax-case x ()
      ((k (ivar ...) formals e1 e2 ...)
       (with-syntax (((index ...)
                      (let f ((i 0) (ls (syntax (ivar ...))))
                        (if (null? ls)
                            '()
                            (cons i (f (+ i 1) (cdr ls))))))
                     (self (datum->syntax-object (syntax k) 'self)))
         (syntax
           (lambda (self . formals)
             (let-syntax ((ivar (identifier-syntax
                                  (_ (vector-ref self index))
                                  ((set! _ e)
                                   (vector-set! self index e))))
                          ...)
               e1 e2 ...))))))))

This form might be used as a building block in an object-oriented subsystem. It returns a procedure that accepts an instance, self, and a set of additional arguments. The instance is assumed to be represented as a vector. Within the body of the method, e1 e2 ..., references and assignments to the instance variables ivar ... are translated into vector references and assignments to the elements of the instance. To support instance-variable assignments with only the first form of identifier-syntax, it is necessary to establish a local binding for set! that recognizes the instance variables ivar ... as auxiliary keywords (literals).


procedure: (bound-identifier=? identifier1 identifier2)
procedure: (free-identifier=? identifier1 identifier2)
procedure: (literal-identifier=? identifier1 identifier2)
returns: see below

Symbolic names alone do not distinguish identifiers unless the identifiers are to be used only as symbolic data. The predicates free-identifier=? and bound-identifier=? are used to compare identifiers according to their intended use as free references or bound identifiers in a given context.

bound-identifier=? is used to determine if two identifiers would be equivalent if they were to appear as bound identifiers in the output of a transformer. In other words, if bound-identifier=? returns true for two identifiers, a binding for one will capture references to the other within its scope. In general, two identifiers are bound-identifier=? only if both are present in the original program or both are introduced by the same transformer application (perhaps implicitly---see datum->syntax-object). bound-identifier=? can be used for detecting duplicate identifiers in a binding construct or for other preprocessing of a binding construct that requires detecting instances of the bound identifiers.

free-identifier=? is used to determine whether two identifiers would be equivalent if they were to appear as free identifiers in the output of a transformer. Because identifier references are lexically scoped, this means that (free-identifier=? id1 id2) is true if and only if the identifiers id1 and id2 refer to the same lexical or top-level binding. For this comparison, all variables are assumed to have top-level bindings, whether defined yet or not.

literal-identifier=? is similar to free-identifier=? except that the former equates top-level identifiers that come from different modules, even if they do not necessarily resolve to the same binding. syntax-rules employs literal-identifier=? to compare identifiers listed in the literals list against input identifiers. literal-identifier=? is intended for the comparison of auxiliary keywords such as else in cond and case, where no actual binding is involved. Literal identifiers (auxiliary keywords) appearing in syntax-case patterns (such as else in case and cond) are matched with literal-identifier=?.

Identifiers that are bound-identifier=? are not necessarily free-identifier=?, and identifiers that are free-identifier=? are not necessarily bound-identifier=?. Identifiers that are free-identifier=? are also free-identifier=?, but the converse does not necessarily hold.

The following definition of unnamed let uses bound-identifier=? to detect duplicate identifiers.

(define-syntax let
  (lambda (x)
    (define ids?
      (lambda (ls)
        (or (null? ls)
            (and (identifier? (car ls))
                 (ids? (cdr ls))))))
    (define unique-ids?
      (lambda (ls)
        (or (null? ls)
            (and (let notmem? ((x (car ls)) (ls (cdr ls)))
                   (or (null? ls)
                       (and (not (bound-identifier=? x (car ls)))
                            (notmem? x (cdr ls)))))
                 (unique-ids? (cdr ls))))))
    (syntax-case x ()
      ((_ ((i v) ...) e1 e2 ...)
       (and (ids? (syntax (i ...)))
            (unique-ids? (syntax (i ...))))
       (syntax ((lambda (i ...) e1 e2 ...) v ...))))))

With the definition of let above, the expression

(let ((a 3) (a 4)) (+ a a))

results in a syntax error, whereas

(let-syntax ((dolet (lambda (x)
                      (syntax-case x ()
                        ((_ b)
                         (syntax (let ((a 3) (b 4))
                                   (+ a b))))))))
  (dolet a))

evaluates to 7 since the identifier a introduced by dolet and the identifier a extracted from the input form are not bound-identifier=?. Since both occurrences of a, however, if left as free references, would refer to the same (top-level) binding for a, neither free-identifier=? nor literal-identifier=? would distinguish them.

The two definitions of a simplified version of cond below are equivalent; the first includes else in the literals list for syntax-case, whereas the second explicitly tests for else using literal-identifier=?.

(define-syntax cond
  (lambda (x)
    (syntax-case x (else)
      ((_ (else e1 e2 ...)) (syntax (begin e1 e2 ...)))
      ((_ (e0 e1 e2 ...)) (syntax (if e0 (begin e1 e2 ...))))
      ((_ (e0 e1 e2 ...) c1 c2 ...)
       (syntax (if e0 (begin e1 e2 ...) (cond c1 c2 ...)))))))

(define-syntax cond
  (lambda (x)
    (syntax-case x ()
      ((_ (e0 e1 e2 ...))
       (and (identifier? (syntax e0))
            (literal-identifier=? (syntax e0) (syntax else)))
       (syntax (begin e1 e2 ...)))
      ((_ (e0 e1 e2 ...)) (syntax (if e0 (begin e1 e2 ...))))
      ((_ (e0 e1 e2 ...) c1 c2 ...)
       (syntax (if e0 (begin e1 e2 ...) (cond c1 c2 ...)))))))

With either definition of cond, else is not recognized as an auxiliary keyword if an enclosing lexical binding for else exists. For example,

(let ((else #f))
  (cond (else (write "oops"))))

does not write "oops", since else is bound lexically and is therefore not the same else that appears in the definition of cond.


syntax: (with-syntax ((pattern val) ...) exp1 exp2 ...)
returns: the value of the last expi

It is sometimes useful to construct a transformer's output in separate pieces, then put the pieces together. with-syntax facilitates this by allowing the creation of local pattern bindings.

pattern is identical in form to a syntax-case pattern. The value of each val is computed and destructured according to the corresponding pattern, and pattern variables within the pattern are bound as with syntax-case to appropriate portions of the value within exp1 exp2 ....

with-syntax may be defined as a syntactic extension in terms of syntax-case.

(define-syntax with-syntax
  (lambda (x)
    (syntax-case x ()
      ((_ ((p e0) ...) e1 e2 ...)
       (syntax (syntax-case (list e0 ...) ()
                 ((p ...) (begin e1 e2 ...))))))))

The following definitions of full cond and case demonstrate the use of with-syntax to support transformers that employ recursion internally to construct their output.

(define-syntax cond
  (lambda (x)
    (syntax-case x ()
      ((_ c1 c2 ...)
       (let f ((c1 (syntax c1)) (cmore (syntax (c2 ...))))
         (if (null? cmore)
             (syntax-case c1 (else =>)
               ((else e1 e2 ...) (syntax (begin e1 e2 ...)))
               ((e0) (syntax (let ((t e0)) (if t t))))
               ((e0 => e1) (syntax (let ((t e0)) (if t (e1 t)))))
               ((e0 e1 e2 ...) (syntax (if e0 (begin e1 e2 ...)))))
             (with-syntax ((rest (f (car cmore) (cdr cmore))))
               (syntax-case c1 (=>)
                 ((e0) (syntax (let ((t e0)) (if t t rest))))
                 ((e0 => e1) (syntax (let ((t e0)) (if t (e1 t) rest))))
                 ((e0 e1 e2 ...)
                  (syntax (if e0 (begin e1 e2 ...) rest)))))))))))

(define-syntax case
  (lambda (x)
    (syntax-case x ()
      ((_ e c1 c2 ...)
       (with-syntax ((body
           (let f ((c1 (syntax c1)) (cmore (syntax (c2 ...))))
             (if (null? cmore)
                 (syntax-case c1 (else)
                   ((else e1 e2 ...) (syntax (begin e1 e2 ...)))
                   (((k ...) e1 e2 ...)
                    (syntax (if (memv t '(k ...)) (begin e1 e2 ...)))))
                 (with-syntax ((rest (f (car cmore) (cdr cmore))))
                   (syntax-case c1 ()
                     (((k ...) e1 e2 ...)
                      (syntax (if (memv t '(k ...))
                                  (begin e1 e2 ...)
                                  rest)))))))))
         (syntax (let ((t e)) body)))))))


procedure: (syntax-object->datum obj)
returns: obj stripped of syntactic information

The procedure syntax-object->datum strips all syntactic information from a syntax object and returns the corresponding Scheme "datum." Identifiers stripped in this manner are converted to their symbolic names, which can then be compared with eq?. Thus, a predicate symbolic-identifier=? might be defined as follows:

(define symbolic-identifier=?
  (lambda (x y)
    (eq? (syntax-object->datum x)
         (syntax-object->datum y))))

Two identifiers that are free-identifier=? are symbolic-identifier=?; in order to refer to the same binding, two identifiers must have the same name. The converse is not always true, since two identifiers may have the same name but different bindings.


procedure: (datum->syntax-object template-identifier obj)
returns: a syntax object

datum->syntax-object constructs a syntax object from obj that contains the same contextual information as template-identifier, with the effect that the syntax object behaves as if it were introduced into the code when template-identifier was introduced. The template identifier is often the keyword of an input form, extracted from the form, and the object is often a symbol naming an identifier to be constructed.

datum->syntax-object allows a transformer to "bend" lexical scoping rules by creating implicit identifiers that behave as if they were present in the input form, thus permitting the definition of syntactic extensions that introduce visible bindings for or references to identifiers that do not appear explicitly in the input form. For example, we can define a loop expression that binds the variable break to an escape procedure within the loop body.

(define-syntax loop
  (lambda (x)
    (syntax-case x ()
      ((k e ...)
       (with-syntax ((break (datum->syntax-object (syntax k) 'break)))
          (syntax (call-with-current-continuation
                    (lambda (break)
                      (let f () e ... (f))))))))))

(let ((n 3) (ls '()))
  (loop
    (if (= n 0) (break ls))
    (set! ls (cons 'a ls))
    (set! n (- n 1))))  (a a a)

Were we to define loop as

(define-syntax loop
  (lambda (x)
    (syntax-case x ()
      ((_ e ...)
       (syntax (call-with-current-continuation
                 (lambda (break)
                   (let f () e ... (f)))))))))

the variable break would not be visible in e ....

It is also useful for obj to represent an arbitrary Scheme form, as demonstrated by the definition of include given within the description of include below.


syntax: (include filename)
returns: unspecified

include expands into a begin expression containing the forms found in the file named by filename. For example, if the file f-def.ss contains the expression (define f (lambda () x)), the expression

(let ((x "okay"))
  (include "f-def.ss")
  (f))

evaluates to "okay". An include form is treated as a definition if it appears within a sequence of definitions and the forms on the file named by filename are all definitions, as in the above example. If the file contains expressions instead, the include form is treated as an expression.

include may be defined portably as follows, although Chez Scheme uses an implementation-dependent definition that allows it to capture and maintain source information for included code.

(define-syntax include
  (lambda (x)
    (define read-file
      (lambda (fn k)
        (let ((p (open-input-file fn)))
          (let f ((x (read p)))
            (if (eof-object? x)
                (begin (close-input-port p) '())
                (cons (datum->syntax-object k x)
                      (f (read p))))))))
    (syntax-case x ()
       ((k filename)
        (let ((fn (syntax-object->datum (syntax filename))))
          (with-syntax (((exp ...) (read-file fn (syntax k))))
            (syntax (begin exp ...))))))))

The definition of include uses datum->syntax-object to convert the objects read from the file into syntax objects in the proper lexical context, so that identifier references and definitions within those expressions are scoped where the include form appears.


procedure: (generate-temporaries list)
returns: a list of distinct generated identifiers

Transformers can introduce a fixed number of identifiers into their output by naming each identifier. In some cases, however, the number of identifiers to be introduced depends upon some characteristic of the input expression. A straightforward definition of letrec, for example, requires as many temporary identifiers as there are binding pairs in the input expression. The procedure generate-temporaries is used to construct lists of temporary identifiers.

list may be any list; its contents are not important. The number of temporaries generated is the number of elements in list. Each temporary is guaranteed to be different from all other identifiers.

A definition of letrec that uses generate-temporaries is shown below.

(define-syntax letrec
  (lambda (x)
    (syntax-case x ()
      ((_ ((i v) ...) e1 e2 ...)
       (with-syntax (((t ...) (generate-temporaries (syntax (i ...)))))
          (syntax (let ((i #f) ...)
                    (let ((t v) ...)
                      (set! i t) ...
                      (let () e1 e2 ...)))))))))

Any transformer that uses generate-temporaries in this fashion can be rewritten to avoid using it, albeit with a loss of clarity. The trick is to use a recursively defined intermediate form that generates one temporary per expansion step and completes the expansion after enough temporaries have been generated. A definition of letrec that does not use generate-temporaries is left as an exercise for the reader.


Section 9.3. Modules

Modules are used to help organize programs into separate parts that interact cleanly via declared interfaces. Although modular programming is typically used to facilitate the development of large programs possibly written by many individuals, it may also be used in Chez Scheme at a "micro-modular" level, since Chez Scheme module and import forms are definitions and may appear anywhere any other kind of definition may appear, including within a lambda body or other local scope.

Modules control visibility of bindings and can be viewed as extending lexical scoping to allow more precise control over where bindings are or are not visible. Modules export identifier bindings, i.e., variable bindings, keyword bindings, or module name bindings. Modules may be named or anonymous. Bindings exported from a named module may be made visible via an import form wherever the module's name is visible. Bindings exported from an anonymous module are implicitly imported where the module form appears. Anonymous modules are useful for hiding some of a set of bindings while allowing the remaining bindings in the set to be visible.


syntax: (module name interface defn ... init ...)
syntax: (module interface defn ... init ...)
syntax: (import name)
syntax: (import-only name)
returns: unspecified

In each case, name is an identifier, defn ... are definitions, and init ... are expressions. interface is a list of exports (export ...), where each export is either an identifier identifier or of the form (identifier export ...).

The first syntax for module establishes a named scope that encapsulates a set of identifier bindings. The exported bindings may be made visible via import or import-only anywhere the module name is visible. The second syntax for module introduces an anonymous module whose bindings are implicitly imported (as if by import of a hidden module name) where the module form appears.

A module consists of a (possibly empty) set of definitions and a (possibly empty) sequence of initialization expressions. Each identifier listed in a module's interface must be defined within that module. Because module and import are definitions, they may appear at the top level of a program, nested within the bodies of lambda expressions, and nested within other modules. Also, because module names are scoped like other identifiers, modules may export module names as well as variables and keywords.

When an interface contains an export of the form (identifier export ...), only identifier is visible in the importing context. The identifiers within export ... are implicit imports that are visible only within the expansion of syntactic abstractions exported from the module. This form is thus useful only when identifier is a keyword naming a syntactic abstraction. In the current implementation, implicit imports need be listed only when identifier is exported by a top-level module, although implicit imports may be listed in other contexts as well. Implicit exports are listed so that the compiler can determine the exact set of bindings (explicit and implicit) that must be inserted into the top-level environment, and conversely, the set of bindings that may be treated more efficiently as local bindings.

Module names occupy the same namespace as other identifiers and follow the same scoping rules. Unless exported, identifiers defined within a module are visible only within that module. Identifiers exported from a module are visible within the module and where the module is imported. An identifier made visible via an import of a module is scoped as if its definition appears where the import occurs. The following example illustrates these scoping rules.

(let ((x 1))
  (module m (x setter)
    (define-syntax x (identifier-syntax z))
    (define setter (lambda (x) (set! z x)))
    (define z 5))
  (let ((y x) (z 0))
    (import m)
    (setter 3)
    (+ x y z)))  4

The inner let expression binds y to the value of the x bound by the outer let. The import of m makes the definitions of x and setter visible within the inner let. In the expression (+ x y z), x refers to the identifier macro exported from m while y and z refer to the bindings established by the inner let. The identifier macro x expands into a reference to the variable z defined within the module.

Expressions within a module can reference identifiers bound outside of the module.

(let ((x 3))
  (module m (plusx)
    (define plusx (lambda (y) (+ x y))))
  (import m)
  (let ((x 4))
    (plusx 5)))  8

Similarly, import does not prevent access to identifiers that are visible where the import form appears, except for those variables shadowed by the imported identifiers.

(module m (y) (define y 'm-y))
(let ((x 'local-x) (y 'local-y))
  (import m)
  (list x y))  (local-x m-y)

On the other hand, import-only establishes an isolated scope in which the only visible identifiers are those exported by the imported module.

(module m (y) (define y 'm-y))
(let ((x 'local-x) (y 'local-y))
  (import-only m)
  x)  Error: x is not visible

This is sometimes desirable for static verification that no identifiers are used except those explicitly imported into a module or local scope.

Unless a module imported via import-only exports import or import-only and the name of at least one module, subsequent imports within the scope of the import-only form are not possible. To create an isolated scope containing the exports of more than one module without making import or import-only visible, it is necessary to create a single module that contains the exports of each of the other modules.

(module m2 (y) (define y 'y))
(module m1 (x) (define x 'x))
(module mega-module (cons x y)
  (import m1)
  (import m2)
  (import scheme))
(let ((y 3))
  (import-only mega-module)
  (cons x y))  (x . y)

Before it is compiled, a source program is translated into a core language program containing no syntactic abstractions, syntactic definitions, module definitions, or import forms. Translation is performed by a syntax expander that processes the forms in the source program via recursive descent.

A define-syntax form associates a keyword with a transformer in a translation-time environment. When the expander encounters a keyword, it invokes the associated transformer and reprocesses the resulting form. A module form associates a module name with an interface. When the expander encounters an import form, it extracts the corresponding module interface from the translation-time environment and makes the exported bindings visible in the scope where the import form appears.

Internal definitions and definitions within a module body are processed from left to right so that a module's definition and import may appear within the same sequence of definitions. Expressions appearing within a body and the right-hand sides of variable definitions, however, are translated only after the entire set of definitions has been processed, allowing full mutual recursion among variable and syntactic definitions.

Module and import forms affect only the visibility of identifiers in the source program, not their meanings. In particular, variables are bound to locations whether defined within or outside of a module, and import does not introduce new locations. Local variables are renamed as necessary to preserve the scoping relationships established by both modules and syntactic abstractions. Thus, the module program given above is equivalent to the following program in which identifiers have been consistently renamed as indicated by subscripts.

(let ([x0 1])
  (define-syntax x1 (identifier-syntax z1))
  (define setter1 (lambda (x2) (set! z1 x2)))
  (define z1 5)
  (let ([y3 x0] [z3 0])
    (setter1 3)
    (+ x1 y3 z3)))

Although definitions within a lambda or module body are processed from left to right by the expander, the order of evaluation of variable definitions is unspecified. Initialization expressions appearing within a module body are evaluated in sequence after the evaluation of the variable definitions.

The remainder of this section presents a few examples of module use adapted from the paper "Extending the scope of syntactic abstraction" [24], which describes modules and their implementation in more detail.

It is often convenient to refer to one export of a module without importing all of its exports. While the module system does not provide an explicit construct for this purpose, a qualified reference form can easily be defined and used as follows.

(define-syntax from
  (syntax-rules ()
    ((_ m id) (let () (import-only m) id))))

(let ((x 10))
  (module m1 (x) (define x 1))
  (module m2 (x) (define x 2))
  (+ (from m1 x) (from m2 x)))  3

This form works for variables but not for exported keywords or module names, since the output is an expression and may thus appear only where expressions may appear. A generalization of this technique is used in the following definition of import*, which supports renaming of imported bindings and selective import of specific bindings.

(define-syntax import*
  (syntax-rules ()
    [(_ m) (begin)]
    [(_ m (new old))
     (module ((new tmp))
       (define-alias new tmp)
       (module (tmp)
         (import m)
         (define-alias tmp old)))]
    [(_ m id) (module (id) (import m))]
    [(_ m spec0 spec1 ...)
     (begin (import* m spec0)
            (import* m spec1 ...))]))

To selectively import an identifier from module m, the import* form expands into an anonymous module that first imports all exports of m then re-exports only the selected identifier. To rename on import the macro expands into an anonymous module that instead exports an alias bound to the new name. The alias is simply an identifier macro that expands into a reference to the old name visible where the alias was defined.

If the output placed the definition of new in the same scope as the import of m, a naming conflict would arise whenever new is also present in the interface of m. To prevent this, the output instead places the import within a nested anonymous module and links old and new by means of an alias for the introduced identifier tmp. Because tmp is referenced in the expansion of new (see the definition of define-alias below), it is listed as an implicit export.

The macro expands recursively to handle multiple import specifications. Thus the following construct imports x, y (renamed as z), and z (renamed as y) from module m.

(import* m x (y z) (z y))

define-alias is defined in terms of identifier-syntax.

(define-syntax define-alias
  (syntax-rules ()
    [(_ x y)
     (define-syntax x
       (identifier-syntax y))]))

Mutually recursive modules can be defined in several ways. In the following program, a and b are mutually recursive modules exported by an anonymous module whose local scope is used to statically link the two. For example, the free variable y within module a refers to the binding for y, provided by importing b, in the enclosing module.

(module (a b)
  (module a (x) (define x (lambda () y)))
  (module b (y) (define y (lambda () x)))
  (import a)
  (import b))

The following syntactic abstraction generalizes this pattern to permit the definition of multiple mutually recursive modules.

(define-syntax rec-modules
  (syntax-rules (module)
    ((_ (module m (id ...) form ...) ...)
     (module (m ...)
       (module m (id ...) form ...) ...
       (import m) ...))))

Because a module can re-export imported bindings, it is quite easy to provide multiple views on a single module, as s and t provide for r below, or to combine several modules into a compound, as r does.

(module p (x y)
  (define x 1) (define y 2))
(module q (y z)
  (define y 3) (define z 4))
(module r (a b c d)
  (import* p (a x) (b y))
  (import* q (c y) (d z)))
(module s (a c) (import r))
(module t (b d) (import r))

To allow interfaces to be separated from implementations, the following syntactic abstractions support the definition and use of named interfaces.

(define-syntax define-interface
  (syntax-rules ()
    [(_ name (export ...))
     (define-syntax name
       (lambda (x)
         (syntax-case x ()
           [(_ n defs)
            (with-implicit (n export ...)
              #'(module n (export ...) .
                  defs))])))]))

(define-syntax define-module
  (syntax-rules ()
    [(_ name interface defn ...)
     (interface name (defn ...))]))

define-interface creates an interface macro that, given a module name and a list of definitions, expands into a module definition with a concrete interface.

with-implicit, used here to ensure that the introduced export identifiers are visible in the same scope as the name of the module in the define-module form, is implemented in terms of datum->syntax-object.

(define-syntax with-implicit
  (lambda (x)
    (syntax-case x ()
      ((_ (tid id ...) exp ...)
       (andmap identifier? #'(tid id ...))
       #'(with-syntax ((id (datum->syntax-object #'tid 'id)) ...)
           exp ...)))))

define-interface and define-module can be used as follows.

(define-interface simple (a b))
(define-module m simple
  (define-syntax a (identifier-syntax 1))
  (define b (lambda () c))
  (define c 2))
(let () (import m) (+ a (b)))  3

The abstract module facility defined below allows a module interface to be satisfied incrementally when module forms are evaluated. This permits flexibility in the separation between the interface and implementation, supports separate compilation of mutually recursive modules, and permits redefinition of module implementations.

(define-syntax abstract-module
  (syntax-rules ()
    ((_ name (ex ...) (kwd ...) defn ...)
     (module name (ex ... kwd ...)
       (declare ex) ...
       defn ...))))

(define-syntax implement
  (syntax-rules ()
    ((_ name form ...)
     (module () (import name) form ...))))

Within an abstract-module form, each of the exports in the list ex ... must be variables. The values of these variables are supplied by one or more separate implement forms. Since keyword bindings must be present at compile time, they cannot be satisfied incrementally and are instead listed as separate exports and defined within the abstract module.

Within an implement form, the sequence of forms form ... is a sequence of zero or more definitions followed by a sequence of zero or more expressions. Since the module used in the expansion of implement does not export anything, the definitions are all local to the implement form. The expressions may be arbitrary expressions, but should include one satisfy form for each variable whose definition is supplied by the implement form. A satisfy form has the syntax

(satisfy variable expr)

declare and satisfy may simply be the equivalents of define and set!.

(define-syntax declare (identifier-syntax define))
(define-syntax satisfy (identifier-syntax set!))

Alternatively, declare can initialize the declared variable to the value of a flag known only to declare and satisfy, and satisfy can verify that this flag is still present to insure that only one attempt to satisfy the value of a given identifier is made.

(module ((declare cookie) (satisfy cookie))
  (define cookie "chocolate chip")
  (define-syntax declare
    (syntax-rules ()
      ((_ var) (define var cookie))))
  (define-syntax satisfy
    (syntax-rules ()
      ((_ var exp)
       (if (eq? var cookie)
           (set! var exp)
           (error 'satisfy
             "value of variable ~s has already been satisfied"
             'var))))))

Using abstract-module and implement, we can define mutually recursive and separately compilable modules as follows.

(abstract-module e (even?) (pred)
  (define-syntax pred
    (syntax-rules () ((_ exp) (- exp 1)))))

(abstract-module o (odd?) ())

(implement e
  (import o)
  (satisfy even?
    (lambda (x)
      (or (zero? x) (odd? (pred x))))))

(implement o
  (import e)
  (satisfy odd?
    (lambda (x) (not (even? x)))))

(let () (import-only e) (even? 38))  #t


Section 9.4. Built-in Modules

Five modules are built-in to Chez Scheme: scheme, r5rs, r5rs-syntax, ieee, and \#system. Each module is immutable, i.e., the exported bindings cannot be altered.


module: scheme

scheme contains all user-visible top-level bindings (variables, keywords, and module names) built into Chez Scheme.


module: r5rs

r5rs contains all top-level bindings (variables and keywords) defined in the Revised5 Report on Scheme. The bindings exported from r5rs are precisely those that are available within an expression evaluated via eval with the environment specifier returned by scheme-report-environment.


module: r5rs-syntax

r5rs-syntax contains all top-level keyword bindings defined in the Revised5 Report on Scheme. The bindings exported from r5rs-syntax are precisely those that are available within an expression evaluated via eval with the environment specifier returned by null-environment.


module: ieee

ieee contains all top-level bindings (variables and keywords) defined in the ANSI/IEEE standard for Scheme. The bindings exported from ieee are precisely those that are available within an expression evaluated via eval with the environment specifier returned by ieee-environment.


module: \#system

\#system contains all user-visible top-level bindings built into Chez Scheme along with various undocumented system bindings.


Chez Scheme User's Guide
© 1998 R. Kent Dybvig
Cadence Research Systems
http://www.scheme.com
Illustrations © 1998 Jean-Pierre Hébert
about this book