This is the archived website of SI 413 from the Fall 2012 semester. Feel free to browse around; you may also find more recent offerings at my teaching page.
This lab is due at 0900 next Tuesday, November 20.
It should contain all the files in the starter code listed below,
as well as a subfolder called tests
with your tests.
See the submit page for more info.
The starter code for this week's lab is... the solution to last week's lab. You can stick with your own solution or use mine (or some combination thereof).
Today's lab is all about garbage collection. In it, we will be using a few C++ features that we haven't seen yet: static methods, static fields, constructors, and deconstructors. Here's a brief overview on each.
Also sometimes called class methods, these are methods
in a class that are not called on any particular object of that
class. They have access to any static fields in the class (see below),
but not any of the other fields, and they also have no
this
pointer.
You declare a static method by using the word static
:
class SomeClass { private: ... public: static int foo() { /* body of the method goes here. */ } };
Static fields are data members that exist only once for the entire class, not for each individual object. Really, they are just dressed-up global variables.
The tricky thing with static fields is that they are declared in the class declaration in the header file, as usual, but they must also be declared, and possibly initialized, in some cpp file. So for example we might have
class SomeClass { private: static bool x; public: ... };
This would go in the header file. Then, in the cpp file, we have to initialize the static member as follows:
bool SomeClass::x = true;
Okay, we've been using constructors for a while now, so you should know what they look like. They're just like any other (non-static) class method, except that (1) there is no return type, and (2) the name is the same as the name of the class. For instance:
class SomeClass { public: SomeClass (int arg) { /* This is a constructor. */ } };
We haven't yet seen destructors. These are called whenever an object
is de-allocated at run time. Remember this can happen either when a local
variable goes out of scope (i.e., the function or block it was declared in
comes to an end), or when an object allocated using new
is
explicitly deallocated with a call to delete
.
A destructor is again like any other class method, except that it has
no return type, no arguments, and the name is a tilde (~
)
followed by the class name. For example:
class SomeClass { public: ~SomeClass() { /* This is a destructor. */ } };
(You won't necessarily have to write any destructors for this lab, I just wanted to make sure you know what they are.)
To help see why we really need automatic garbage collection in our SPL interpreter, let's see just how many frames are being created in our program. You'll do this by storing a big list of all the Frames that get created, as well as a way to access the size of this list from within an SPL program.
First thing to do is modify the Frame class to keep track of all Frames that get created in a list. This requires a few additions to the class:
list<Frame*>
that will contain a pointer to every Frame object that is currently allocated
by the interpreter. You will have to
#include <list>
at the beginning of frame.hpp
to be able to use this built-in
data structure. Then you'll need a declaration like
static list<Frame*> allFrames;inside the Frame class, to declare the static member. Finally, you have to initialize this member somewhere outside the class. That means adding a line like
list<Frame*> Frame::allFrames;at the beginning of the
spl.ypp
parser file, in the same place
the other global variables are initialized.
push_back
method of the list
class; see
here
or here
for more complete reference.
After these modifications, your SPL interpreter should work just fine, but you have no way of telling that it's doing anything differently. We want a function that can be called from within a SPL program to determine the number of Frames currently allocated in memory. Sounds like a perfect candidate for a built-in! Go back and review Lab 10 if you need help making another built-in function, or look at the starter code for this week's lab.
After you get this working, you should try some examples to see how many frames are created. You can of course run whatever examples you like, but here's a fun one that I made: a rather inefficient function to compute the prime factorization of any positive integer:
# Takes an integer and prints out all its prime factors new factors := lambda n { # Produces the remainder when the first argument is divided by the second # (curried function) new remainder := lambda a { ret := lambda b { ret := a - a/b*b; }; }; # Returns the smallest (prime) factor of n, where n >= 2. new smallest_factor := lambda n { new helper := lambda i { ifelse remainder@n@i = 0 { ret := i; } { ret := helper@(i+1); } }; ret := helper@2; }; if n > 1 { new p := smallest_factor@n; write p; factors@(n/p); } };
Frame
class
as described above to keep track of allocated frames in a list.
There are no tests for this, but to help yourself debug, you might want to add,
say, a debugging statement to the Frame constructor that prints out the
current size of the list to cerr
.
frames
that ignores its argument and
returns the number of Frames in the list at that moment. (To emphasize, your
built-in must take one argument, since that's how SPL works. It just
shouldn't do anything with the argument.)spl>You definitely could write tests for this part, but don't. The reason is that we're about to implement garbage collection, so all the counts are going to change (for the better)! So no tests to submit for this exercise either.write frames@0;
3 spl>new sum := lambda n {
ifelse n=1
{ret := 1;}
{ret := n + sum@(n-1);}
}; spl>sum@5;
spl>write frames@0;
20 spl>sum@100;
spl>write frames@0;
322
Now we want to actually implement automatic garbage collection for our SPL interpreter. The basic idea is to add the following to the Frame class:
mark
to mark this Frame and anything
this Frame points to.sweep
to delete and remove from the list
any un-marked Frames.The ultimate goal (the exercise is listed below) is to have automatic mark-and-sweep garbage collection happen after every statement is executed by our interpreter. Once this is working, the number of Frames in memory after every step should shrink down to some very small number every time. You can proceed however you like; however, I recommend the following steps:
bool
to the Frame class
to indicate
whether this frame has been marked. Add a line to the constructor to
initialize this field to false
.mark
that
performs the marking part of mark-and-sweep, starting with this Frame.
If the this frame is already marked, then just return without doing anymore
work. Otherwise, you want to
set the boolean field to true
, and make a recursive call on
the parent frame (if it isn't NULL
) and on the
environment
Frame of any Closure that is in
the bindings
of this Frame.
To do this, you will need to loop through all the key-value pairs in
bindings
like this:
map<string,Value>::iterator iter = bindings.begin(); while (iter != bindings.end()) { /* in here, "iter->second" is the Value. * You see, *itr is a pair<string,Value>, i.e., a * string-Value pair from the map. ".second" gives the second * element of the pair --- the Value. */ ++iter; // Increments the iterator to go to the next pair. }
sweep
. This is going to
go through all the Frame pointers in the list from step (i) and do one of
two things:
bool
from step (i) to false
).delete
on it
to de-allocate its memory, and remove its entry from the list.bindings
,
except that now *iter
will be of type Frame*
rather than some kind of pair. Then you want to use the erase
method in the list
class (again, see references
here
or here)
to remove list items as necessary. Careful: Your iterator will point
to nonsense after you erase the corresponding entry. Luckily, the
erase
method returns an iterator to the next element
in the list. But then you don't want to increment the iterator in this case!
mark
on the global frame, and then
sweep
.main
method,
for "interactive" and "non-interactive" usage. When I test your code, it runs
the "non-interactive" version. So you will actually have to add the
mark and sweep calls in two different places. It should be just after
the call to tree->exec
in both loops. Ask your instructor if
you need help finding this!
Great! Now that you have garbage collection, you can run a series of fairly
computationally-intensive operations like (using my example SPL function above)
factors@65432
over and over again without ever running out of
memory.
But if you try to do it a few times in a block, or in a function, or loop, etc., the garbage collector never gets called and your program will run out of memory. Why? The problem is that garbage collection is currently only triggered from the top-level in spl.ypp. So now we want to improve on this by triggering garbage collection from within function calls, blocks, loops, and the like, whenever it is needed.
The tricky part is what we called the "root set" in class. See, it's no longer sufficient just to mark all the Frames that are reachable from the global frame, because the frame that we're currently executing in also needs to be considered. So we have to keep track of all the currently active frames in some sort of stack, and mark everything reachable from any of these frames before sweeping.
This is an optional exercise. Make sure you have completed AND SUBMITTED a working version of everything else before continuing. This will break your solution to exercise 3, so no skipping allowed!
Here are a few hints on how you might proceed:
vector
to hold my root set, but you
may do as you wish.)
mark
for every Frame in the root set, before sweeping.
But where should this method be called from? You need to be really careful here
that everything important gets marked. I would recommend just calling it
in the Block::eval method, which should be good enough.