In today’s class we are introducing a new programming language called Scheme. We will just get a small introduction to the language today, and explore it further in later classes.

If you want to try any of the examples, just go to https://try.scheme.org/.

What is Scheme?

Scheme is a programming language whose roots are deeply intertwined with the history of computer science itself. To summarize very briefly:

Alonzo Church was a mathematician and logician at Princeton in the 1930s. He developed the notion of “universal computation” built from functions and function evaluation, called the lambda calculus.

(Famously, Church’s Ph.D. student Alan Turing developed an alternative model of universal computation based on symbols on tapes and state transitions, which you learned about in theory class.)
Based on Church’s lambda calculus, John McCarthy developed a programming language called Lisp in the 1950s. This was one of the very earliest programming languages of any kind, and it was a critical connection from theory to practice in the early days of computing.

McCarthy spent most of his career at Stanford working on the foundations of artificial intelligence — in fact, he is credited with first coining that term! He received the Turing Award in 1971.
Scheme is a variant and (somewhat) simplification of Lisp, developed by two MIT professors Guy L. Steele and Gerald Jay Sussman, in the 1970s.

Steele was a Ph.D. student when he co-developed the Scheme language. He was awarded the Grace Murray Hopper Award in 1988 for this and other contributions early in his career. Later, he became well known for his clear and concise technical writing, and joined the team that developed the Java programming language at Sun Microsystems in the 1990s.

Sussman is an A.I. researcher at MIT who was also heavily influential in computer science education. Along with Harold Abelson and Julie Sussman, he wrote the “wizard book”, largely based on the Scheme language, which changed the way many universities approached the teaching of undergraduate computer science.

Key characteristics of Scheme

Functional Paradigm: Functions are first-class citizens. You will pass functions to other functions, return them as results, and store them in data structures.
Minimalist Syntax: The syntax is uniform and simple, based on a single data structure: the list. This is often described as “homoiconicity” – the code itself is represented as a fundamental data structure of the language.
Dynamic Typing: Types are checked at runtime, not compile time.
Automatic Memory Management: Garbage collection is a core feature.

The beauty (and initial challenge) of Scheme is its simplicity. (Almost) everything is an expression that evaluates to a value.

Formal Syntax: S-Expressions

Scheme programs are composed of S-expressions (Symbolic Expressions). We can define their structure formally.

Tokens (Lexical Analysis)

Name     Regex                  Examples
====     =====                  ========
LP       \(                     (
RP       \)                     )
QUOTE    '                      '
INT      -?[0-9]+               123
BOOL     #[tf]                  #t
STR      "([^"\\]|\\.)*"        "hello world"
SYM      [^-()#"';0-9\s]\S*     variable-name
ignore   ;.*                    ; comment
ignore   \s

Note on Symbols: Symbols are essentially identifiers. Unlike strings, they are atomic and often used to represent variables or operators. The symbol foo is a single, indivisible token. And unlike in most other programming languages, symbols can contain (and even start with) all kinds of special characters as long as they don’t have another special Scheme meaning, so for example + and <= are both considered symbols.

Grammar (Syntactic Analysis)

An S-expression can be defined with a simple context-free grammar using the terminals defined in the table above.

explist -> exp explist
        -> ε
exp -> INT
    -> BOOL
    -> SYM
    -> STR
    -> LP explist RP
    -> QUOTE exp

Essentially, a Scheme program is either a single atom, or a parenthesized list (which contains other S-expressions), or a quoted S-expression.

I want you to appreciate how small and concise this syntax specification is. Despite being a complete, powerful programming language, it has a smaller syntax specs than many of the languages you implemented from Unit 1 in this class! This simplicity is one of the attractive and powerful things about Scheme.

Core Concepts & Constructs

Expressions and Evaluation

In Scheme, (almost) everything is an expression that evaluates to a value.

Self-Evaluating Atoms: Numbers, booleans, and strings evaluate to themselves.
Symbols: Symbols are evaluated as variable lookups.
Lists (Function Calls): A non-empty list is typically interpreted as a function call.
- Syntax: (operator operand1 operand2 ...)
- Evaluation Rule:
  1. Evaluate the operator. It must evaluate to a procedure (function).
  2. Evaluate each operand from left to right.
  3. Apply the procedure to the evaluated operands.

Example:

(+ 3 4) ; Evaluates to 7
; 1. `+` is a symbol that evaluates to the addition procedure.
; 2. `3` evaluates to 3.
; 3. `4` evaluates to 4.
; 4. The addition procedure is applied to 3 and 4.

This prefix notation is consistent for all operations.

(> 10 5) ; Evaluates to #t
(- (* 2 3) 1) ; Evaluates to 5

Defining Constants and Procedures

As we said before, almost every S-expression in Scheme is an expression that evaluates to a value. One exception is the define special form, which is used to declare variables. (A “special form” looks like a function call but has its own evaluation rules.)

Defining a Constant: (define <symbol> <S-exp>)

(define ten 10)
(define great-fun "mountain biking")
(define also-ten (* 5 (+ 1 1)))

Defining a Procedure (Function): The fundamental way to create a function is with lambda.

(lambda (<param1> <param2> ...) <body>)

A lambda expression creates an anonymous procedure.

(lambda (x) (* x x)) ; A procedure that takes one argument `x` and returns its square.

To use it, you can either apply it directly or, more commonly, give it a name with define.

; Direct application
((lambda (x) (* x x)) 5) ; Evaluates to 25

; Naming the procedure
(define square (lambda (x) (* x x)))
(square 5) ; Evaluates to 25

Scheme provides syntactic sugar for defining named functions, which you will see more often: (define (<name> <param1> ...) <body>)

; The canonical way to define a named function
(define (square x)
  (* x x))

This form is automatically translated into the (define square (lambda ...)) structure internally.

If expressions

Most languages you are used to have if statements, like

if (some_condition) {
  run_this_code_if_true();
}
else {
  run_this_code_if_false();
}

They also often have an if/else expression that evaluates to a value, sometimes called the “ternary if”. In C this is written like:

some_condition ? value_if_true : value_if_false

But because C is more of an imperative language focused on executing commands (statements), the first form is much more common.

Scheme on the other hand is primarily a declarative language which is focused on computing the values of expressions. So the primary way to do if/else in scheme is with an expression, like

(if (>= temp 70) "nice" "chilly")

In general, the form is (if some_condition value_if_true value_if_false).

A few things to note here:

This is an expression, which means it always returns a value. So you typically want to have all three parts, including the else clause.

The fact that it’s an expression also means it can be nested arbitrarily. That’s the benefit! For instance, we could expand the previous example:

(string-append "Outside is "
               (if (>= temp 70)
                   (if (<= temp 90)
                       "nice"
                       "hot")
                   "chilly"))

Quoting and Lists: Code as Data

A core idea in Lisp and Scheme is homoiconicity: code is represented by the language’s own data structures. To handle code as data, we must tell the interpreter not to evaluate an expression. This is the role of quote.

The quote special form takes one argument and returns it without evaluating it. The single quote character (') is syntactic sugar for quote.

Quoting Symbols: Normally, a symbol is treated as a variable to be evaluated. Quoting it gives you the symbol itself.

(define my-sym 'foo)
; my-sym is now bound to the symbol 'foo', not the value of a variable named foo.

Quoting Lists: This is the most common use. It allows us to create a list of data.

'(1 2 3) ; This is the list of numbers 1, 2, 3. It will not be evaluated.
'(a b c) ; A list of symbols. Without the quote, Scheme would try to call function 'a'.

Fundamental List Operations

In Scheme, a list is formally defined as either:

The empty list: '()
A pair (cons A B) where A is any value and B is a list.

This recursive structure leads to three crucial functions: cons, car, and cdr.

cons (Construct Pair) cons takes two arguments, an element and a list, and returns a new list with the element added to the front.

(cons 1 '(2 3 4))    ; Evaluates to '(1 2 3 4)
(cons 'a '())        ; Evaluates to '(a)
(cons '(1 2) '(3 4)) ; Evaluates to '((1 2) 3 4)

car (Contents of the Address Register) car returns the first element of a non-empty list.

(car '(a b c))     ; Evaluates to 'a
(car '((1 2) 3 4)) ; Evaluates to '(1 2)

cdr (Contents of the Decrement Register) cdr (pronounced “could-er”) returns the list containing everything after the first element.

(cdr '(a b c))  ; Evaluates to '(b c)
(cdr '(a))      ; Evaluates to '()

These are the fundamental building blocks for all list processing in Scheme. For example, to get the second element of a list, you combine car and cdr:

(car (cdr '(a b c))) ; (cdr '(a b c)) -> '(b c) -> (car '(b c)) -> 'b

(In fact, this operation is so common that it has a shortcut name, cadr.)