Summary
-
How a program keeps track of variable bindings.
-
Frames: The “boxes” that hold bindings.
-
Environments: The “chains” of frames.
-
Closures: The “bundles” of code and environment that make it all work.
Intro: Where Do Variables Live?
Last class we introduced the idea of closures and lexical scope, and saw some examples in various languages.
Today we will start to think about how to implement this idea in an interpreter or compiler.
When we write a program, we define variables. For instance in Scheme:
(define x 10)
(define my-func (lambda (y) (+ x y)))
(my-func 20)
When my-func is executing, how does it know that y is 20? And how does it know that x is 10? The system needs a formal way to keep track of all these “bindings” (variable names to their values).
This system is called the Environment Model of Evaluation.
The Environment Model
We can visualize this system using “boxes and balls” diagrams.
(These diagrams are shamelessly copied from chapter 3.2 of the classic textbook SICP by Abelson, Sussman, and Sussman.)
a) Frames: The “Boxes”
-
A Frame is a data structure (a “box”) that holds a set of bindings.
-
Each binding pairs a variable name (a symbol) with a value.
-
Think of a frame as a single dictionary or hash map.
-
There is one “special” frame called the Global Frame (or Global Environment). This is where all the built-in functions (
+,*,cons, etc.) and our top-leveldefines live.
b) Environments: Chains of Frames
-
An Environment is a sequence of frames.
-
We can visualize this as a “chain” of boxes.
-
Every environment (except the global one) has a pointer to its enclosing environment (its “parent” box).
-
Variable Lookup Rule: When the program needs the value of a variable (e.g.,
x):
-
It looks in the current frame.
-
If not found, it follows the pointer to the enclosing (parent) frame and looks there.
-
It repeats this process, “walking the chain,” until it finds the variable or reaches the Global Frame.
-
If it’s not found in the Global Frame, it’s an “unbound variable” error.
c) Closures: Code plus Lexical Environment
-
A Closure is a pair of pointers: to the code of a function, and its referencing environment, i.e., a frame
-
The environment should be the one where the function was defined (this is what gives us lexical scoping)
Function calls
Rule: Every time a function is called, a new frame is created.
-
This new frame is used to store the function’s parameters (bound to the arguments), plus any local variables created in the function itself.
-
The parent of the new frame is set according to the environment in the closure
Make sure you understand the second point — that is where the magic happens!
The new frame’s enclosing (parent) environment is the environment stored in the closure. This means the new frame is linked to the environment where the function was defined, not the environment where the function was called. This is why it’s called “lexical” or “static” scoping. The “scope” (the chain of frames) is determined by where the code is written in the file, not by the runtime call stack.
Example: make-adder in Scheme
Let’s trace this classic higher-order function.
(define make-adder
(lambda (x)
(lambda (y)
(+ x y))))
(define add-5 (make-adder 5))
(add-5 10) ; Result: 15
((make-adder 20) 6) ; Result: 26
Here is the resulting diagram of the five frames and three closures that get created when running the code above:
Execution trace for make-adder
-
Global Frame: We start with the Global Frame.
-
We define
make-adder. This creates the first closure and binds the namemake-adderto it in the Global Frame.Notice that this closure points back to the global frame for its referencing environment, since that is where the
make-adderfunction was defined. -
The environment for the function call
(make-adder 5)is created.-
This is a function call, so we create a new frame
-
The title of the frame
(make-adder 5)is just for us humans looking at it; this part is not really important in the running interpreter. -
The parent frame is set according to the closure. That points to the global frame, so the parent of this new frame is also the global frame.
-
We put bindings in the new frame for any function parameters (from the function def) and values (from the function call). Here the parameter
xis boudn to the value5.
-
-
Now the actual function call
(make-adder 5)is executed-
This will execute the body of
make-adderfrom the first part of the closure. -
In this code, it calls
(lambda (y) ...)to create a new closure -
The new closure points to the body of the inner lambda, and the current frame.
-
(A pointer to) this closure is returned from the call to
(make-adder)
-
-
The return value (a closure) as assigned to the name
add-5in the global frame.-
Now the global frame has two bindings, one for
make-adderand one foradd-5. -
The function call
(make-adder-5)has returned, but the frame it created has not gone away! It is still available indirectly by callingadd-5.
-
-
The next function call
(add-5 10)is called-
First, the environment is set up: We get a new frame (on the bottom-left of the diagram), whose parent frame is the one indicated by the closure that
add-5points to, and with a binding for the parameterybeing set to10. -
Then the code of this function is executed in that environment.
-
The code being executed is
(+ x y). In order to execute this, the values of bothxandymust be looked up in the current environment. -
The value of
xis found in the parent frame, and the value ofyis in the current frame.
-
-
The last line
((make-adder 20) 6)is executed.-
This starts by executing the inner call
(make-adder 20). This is the same as executing(make-adder 5)above: a new frame is created under the global frame and with a binding forx, and a new closure is created in this frame and returned. -
Notice that that closure points to the same code as the one for
add-5, but a different environment (with a different binding forx, crucially!) -
This returned closure is not given a name like
add-5was, but instead is immediately executed with argument6. -
That second function call creates a new frame (bottom-right of the diagram), with the parent pointer set accordingly from the closure, and then the same code
(+ x y)is executed in this different environment, returning26. -
In summary, calling
((make-adder 20 6)resulted in creating two new frames and one new closure.
-
Being able to draw these diagrams is important because this models exactly what is happening inside the interpreter as the program executes.
Make sure you understand:
-
What causes a new frame to be created?
-
What causes a new closure to be created?
-
When a new frame is created, how do we decide what its parent frame should be?
-
When a variable reference occurs, how do we look up the value in the current environment?