Required reading:
These notes
Structure and Interpretation of Computer Programs (SICP) (available free online), Sections 1.3 and 3.1
Programming Language Pragmatics (PLP), Sections 6.6.1 and 10.7
Recommended reading:
SICP, Section 5.4.2
PLP, Sections 10.5, 10.6, and 10.8
Code from these notes: u03.scm
Readings for this section: PLP, Sections 10.7 (required) and 10.8 (recommended).
Remember there are two important characteristics of a "pure" functional programming language:
Referential Transparency. This fancy term just means that the result of evaluating any expression in our program will always be the same, no matter when or where in the program it is evaluated.
In fact, any referentially transparent expression could be replaced with its value (that is, the result of evaluating it) without changing the program whatsoever.
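For example, the expression
(* 2 (+ 3 4))
always evaluates to 14, so anywhere it appears in a program we could replace it with 14 and nothing would change.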
Notice that imperative programming is about as far away from this as possible. For example, consider the C++ for loop:
for (int i = 0; i < 10; ++i) { /* some stuff */ }
What can we say about the "stuff" in the body of the loop? Well, it had better not be referentially transparent. If it is, then there's no point in running over it 10 times!
Functions are First Class. Another way of saying this is that functions are data, just like any number or list. Functions are values, in fact! The specific privileges that a function earns by virtue of being first class include:
Functions can be given names.
This is not a big deal; we can name functions in pretty much every programming language. In Scheme this just means we can do
(define (f x) (* x 3))
Functions can be arguments to other functions.
This is how the built-in "higher-order functions" like map and filter work: one of their arguments is another function. Functions that you write can also take procedures as arguments, just like any other argument.
This is neat, but again doesn't make Scheme that special. Passing a function pointer to another function is already possible in C, for example, which basically accomplishes the same thing, albeit with a much more annoying syntax.
Functions can be returned by functions.
This means that, when functions are first-class, we can make "function factories", functions that produce other functions. Again, in many languages (even C) it is possible to return other pre-existing functions, but functional languages go one step further to allow the return of nameless functions. That is, functions can be created on the fly and returned. We'll see what this is about very shortly.
Functions can be stored in data structures.
When functions are first-class, we can make lists or even trees of functions, just like a list of numbers or any other data structure. Again, remember that first-class status means that functions are data. In fact, later in this unit, we'll see how functions can actually represent entire objects like we would in Java or C++.
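To make these last three privileges concrete, here is a small sketch (the names apply-twice, make-adder, and ops are made up for illustration; make-adder uses the lambda form explained in the next section):

; Passing a function as an argument:
(define (apply-twice f x) (f (f x)))
(apply-twice sqr 3)            ; => 81

; Returning a nameless function (a "function factory"):
(define (make-adder n) (lambda (x) (+ x n)))
((make-adder 5) 10)            ; => 15

; Storing functions in a data structure:
(define ops (list + * max))
((car ops) 3 4)                ; => 7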
Readings for this section: SICP Section 1.3 (required), and PLP Sections 10.5 and 10.6 (recommended).
The Lambda Calculus is a way of expressing mathematics and computation, all with functions, developed by Alonzo Church in the 1930s. It was actually a competing idea with Alan Turing's universal (Turing) machine for the best way to express what is computable. The famous "Church-Turing Thesis", which is one of those things that everyone believes but is still unproven, states that the two models are actually equivalent. Since we now associate all computation with Turing machines, this basically says that every kind of computation can be expressed by some kind of composition of (possibly recursive) functions.
Math geekery aside, the basic principles of Church's Lambda Calculus find their way into modern programming languages like Scheme: Functional recursion is important and useful, functions are data, and not every function needs to have a name.
To honor this mathematical tradition, many programming languages that allow the creation of nameless functions use the keyword (or function name) lambda to do it. For example, let's say we want to create a function that takes an argument x and adds 5 to it. Here's how you would write that in Python:
lambda x: x + 5
Or in Haskell:
\x -> x + 5
Or in C#:
x => x + 5
And, finally, in Scheme (or Lisp):
(lambda (x) (+ x 5))
Now if you type this last one into DrScheme, what you get back is just #<procedure>, which is the interpreter telling you that this is a procedure (and it has no name).
To actually call this function, we do it the same way we would any other function in Scheme: by putting it first after an open parenthesis, followed by its argument:
((lambda (x) (+ x 5))
 8)
That function call now produces the answer, 13. I broke it up for you to see the structure, but we could of course also write it on a single line.
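Here is that same call written out on a single line:

((lambda (x) (+ x 5)) 8)    ; => 13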
Now with this simple example, it's hard to see the utility of lambda and "anonymous" (i.e., nameless) functions in general. After all, the last expression above could certainly be accomplished more easily just by writing (+ 8 5)!
Well, we will see many examples of the power of lambda, but one useful purpose is in combination with higher-order functions like map and apply. For example, we can use the function above to add 5 to every element in a list:
(map (lambda (x) (+ x 5)) (list 3 5 8 10))
Before, we would have needed a recursive function (or, in another language, a for loop) to do this. Now we get the same result, with a very nice separation of the action (adding 5) from the data (the actual list).
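filter works the same way; for instance, a quick sketch that keeps only the elements greater than 5:

(filter (lambda (x) (> x 5)) (list 3 5 8 10))    ; keeps 8 and 10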
One last note here on lambda: it lies beneath some of the Scheme code you've already been writing! For example, when you define a function like
(define (foo x y) (* (+ x 1) (- y 1)))
this is really equivalent to defining foo as a constant lambda-expression:
(define foo (lambda (x y) (* (+ x 1) (- y 1))))
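Either way, calling the function gives the same result:

(foo 4 7)    ; => (* (+ 4 1) (- 7 1)) = 30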
You might be even more startled to discover that a let expression is also just a lambda in disguise. Think about it: let is just saying "associate these names with these values, and then evaluate this expression". Well that's precisely what it means to call a function! For example, this let expression:
(let ((x 8) (y 9)) (+ (sqr x) (sqr y)))
is exactly the same as:
((lambda (x y) (+ (sqr x) (sqr y))) 8 9)
Really! They are equivalent! The argument list in the lambda is just the names of all the let variables, and the actual arguments are the values that each let variable gets assigned to.
These translations are not just for our entertainment; they are actually how the Scheme interpreter works. So function definitions are just a shorthand for a constant definition of a lambda expression, and let statements are a (very convenient) shorthand for a lambda evaluation. There is a term for this sort of thing: syntactic sugar. We say that function defines and lets are "syntactic sugar" in Scheme because they are convenient (and sweet!), but don't really provide any new capabilities to the language. In other words, they are empty calories. Delicious, useful, wonderful empty calories.
Readings for this section: SICP, Section 3.1 (required).
Remember that one of the key principles of "pure" functional programming is referential transparency. What this means specifically is that any function call (or any expression) in our program could be replaced with the value that results from that function call, without changing the program at all.
Informally, what referential transparency means is that there are no side effects to any computation. Every function returns some value, and that is the only thing that happens. Nothing else changes. In particular, if we made the same function call twice, we are just wasting our time because the result will be identical.
Some programming languages, such as Haskell, enforce referential transparency completely. You must do pure functional programming in such languages. As a result, they can perform a number of optimizations to the code that would not be possible if side effects were happening. But on the other hand, doing some seemingly simple tasks suddenly becomes complicated because of the restrictions in the language.
Scheme is not quite so pure as that. While Scheme is primarily designed to be a functional programming language, you can break the rules if you really want to. Now I'm going to tell you how to do it. This doesn't mean that you should break referential transparency all the time in your Scheme programs, just that you can if you have a really good reason.
The simplest kind of side-effect is printing some text to the screen. Now it is important to distinguish this from the return value of a function, which DrScheme also displays on the screen. Actually, DrScheme makes this distinction by formatting the output you ask for explicitly in purple, as opposed to the return value of the function which is printed in black.
Here are the important functions for printing in Scheme:
(display X): Displays X just like the interpreter would print it. Note that X could really be anything: a number, a symbol, a list, ...
(newline): Prints a line break.
(printf "string" args...): Works similarly to the printf from C that you know and love. The difference is that instead of having to do %d or %s or %u for all the different types of arguments, you only need to do ~a for any kind of argument.
Try (printf "Homework ~a due ~a" 4 'tomorrow) for an example.
Now that we can break referential transparency, some other basic aspects of Scheme also get screwed up. To illustrate the difficulty, say we want to write a function that prints every element in a list:
(define (print-list L)
  (if (null? L)
      ; WHAT TO DO FOR BASE CASE???
      (display (car L))
      (newline)
      (print-list (cdr L)) ; HOW TO GET ALL THIS TO HAPPEN IN THE ELSE???
      ))
There are two problems here. First, what should we do in the base case? We don't want to print anything, but Scheme requires that there is a return value. In this case, we want to return the same thing that the built-in printing functions above do, which is (void). This is different from null! One key difference is that (void) is not printed out by DrScheme, whereas null is displayed as '().
The second issue is that we want to do more than one thing in the else case. When we have referential transparency, there's never any cause for this, because we aren't really doing anything other than computing the answer. But to allow for sequential execution with side effects, Scheme has the begin special form. This just evaluates each of its arguments in turn, and returns the value of the last expression.
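For example, in this small sketch the display and newline happen as side effects, and the value of the whole begin expression is 5:

(begin (display "adding...")
       (newline)
       (+ 2 3))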
Now we can actually write the print-list function:
(define (print-list L)
  (if (null? L)
      (void)
      (begin (display (car L))
             (newline)
             (print-list (cdr L)))))
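Calling it prints each element on its own line, and the (void) return value means nothing extra is displayed afterwards:

(print-list (list 3 5 8))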
Readings for this section: PLP Section 6.6.1 (required), and SICP Section 5.4.2 (recommended).
In any programming language, it is very important to be able to accomplish tasks efficiently. In a low-level language like C, this is always going to be possible, although it might require some more programming work to make it happen. But in high-level languages, especially declarative languages where the translation to machine code is not as obvious, we still have to make sure that our code is fast.
Consider the classic example of the Fibonacci function:
(define (fib n)
  (if (<= n 1)
      n
      (+ (fib (- n 1)) (fib (- n 2)))))
This implementation is correct, but really really slow. For example, try computing (fib 40). The problem is that there are two recursive calls, resulting in an exponential explosion of the running time.
But think about it: how many recursive calls do we really need to compute (fib 40)? All we really need is the 40 previous Fibonacci values. If we could just remember them when they get computed...
Well that's exactly what the technique called memoization gives us. The basic idea (recall from algorithms class) is that we will save the results of every function call in a hash table. Then whenever a new call is made, we first check to see if the result is already stored, and if so just return it and avoid any computation.
Here's how we could do that in Scheme for the Fibonacci function. You'll notice that a global variable is used for the hash table. Oh yeah, there are hash tables in Scheme... did I mention that? Of course, they also destroy referential transparency. Do you see how?
(define fib-hash (make-hash))

(define (fib-memo n)
  (cond [(not (hash-has-key? fib-hash n))
         (hash-set! fib-hash n
                    (if (<= n 1)
                        n
                        (+ (fib-memo (- n 1)) (fib-memo (- n 2)))))])
  (hash-ref fib-hash n))
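Now a call like the following should return essentially instantly, since each Fibonacci value is computed only once and then simply looked up in fib-hash:

(fib-memo 40)    ; => 102334155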
Let's say that, for some reason, we want to sum up the squares of all the numbers from 1 up to n, for any n. Here's a decent way to write that function in Scheme:
(define (ssq n)
  (if (= n 0)
      0
      (+ (sqr n) (ssq (- n 1)))))
Now doing something like (ssq 10) or even (ssq 1000) works fine. But if we want to go really big, like (ssq 4000000), the interpreter runs out of memory.
Now 4 million is a big number, so maybe this is just impossible for a computer. But if you wrote this function in Python, like
def ssq(n):
    sum = 0
    i = 1
    while i <= n:
        sum = sum + i*i
        i = i + 1
    return sum
and then called ssq(4000000), it would return the result in less than a second and not use more than a few kilobytes of memory. So what's going on? Is Scheme inherently slower than other languages - even other interpreted languages like Python - on this kind of problem?
The problem with our Scheme function is that it is recursive, and there is a memory overhead for each recursive call. As each call is made, a new activation frame is pushed onto the stack with information such as the argument values. And this is absolutely necessary, since the computation has to successively pop all these values off the stack and then add (sqr n) to them before returning to the next level.
It would seem that the imperative programming way of storing the running sum in a local variable is the "right" way to do this, but how could we do this in Scheme, with referential transparency? The answer is to build this local accumulator variable into the recursive function as an extra argument. In each recursive call, this argument will keep track of the running sum, or whatever else we're computing. Here's how the previous function could work:
(define (ssq-helper n accum)
  (if (= n 0)
      accum
      (ssq-helper (- n 1) (+ (sqr n) accum))))

(define (ssq n)
  (ssq-helper n 0))
See the difference? In the new helper function, there is nothing to do after the recursive call returns. This means that the stack space for this recursive call can be re-used, without having to wait for the whole computation to finish.
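With this version, even the huge call from before should now complete instead of exhausting memory:

(ssq 4000000)    ; no longer runs out of memory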
More formally, we say that the recursive call in the ssq-helper function is in tail position. Tail position means the last expression of a let, define, or begin, either branch (but not the test) of an if, the result expression of a cond clause, or a few other things. Since the ssq-helper recursive call is in tail position within the if, and the if is in tail position in the define, the recursive call is in tail position in the entire function. And when every recursive call is in tail position, the function is said to be tail recursive.
So what? Well this means that the Scheme interpreter can optimize this function call so that a bunch of extra stack space is not actually used for the recursive calls. Essentially, the Scheme interpreter will turn our recursive-looking code (tail recursive, that is) into the iterative code that we might write in C++ or Java. This is called tail call optimization, and it's actually a requirement of any Scheme interpreter or compiler to do it.
Here are some general guidelines for writing tail-recursive functions:
Add an extra "accumulator" argument that holds the answer built up so far. (If you are building up a list this way, it will come out backwards, so you may need to reverse it.)
In each recursive call, update the accumulator and make the recursive call the very last thing that happens.
In the base case, just return the accumulator, and in the initial call pass in a starting value for it (usually 0 or null).
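Here is one small sketch of these guidelines in action (the names doubles and doubles-helper are made up for illustration): a tail-recursive function that doubles every element of a list, carrying the partial result in an accumulator and reversing it at the end.

(define (doubles-helper L accum)
  (if (null? L)
      (reverse accum)                                    ; base case: return the accumulator
      (doubles-helper (cdr L) (cons (* 2 (car L)) accum))))

(define (doubles L) (doubles-helper L null))

(doubles (list 1 2 3))    ; => (list 2 4 6)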