In today’s class we are introducing a new programming language called Scheme. We will just get a small introduction to the language today, and explore it further in later classes.

If you want to try any of the examples, just go to https://try.scheme.org/.

What is Scheme?

Scheme is a programming language whose roots are deeply intertwined with the history of computer science itself. To summarize very briefly:

Key characteristics of Scheme

The beauty (and initial challenge) of Scheme is its simplicity. (Almost) everything is an expression that evaluates to a value.

Formal Syntax: S-Expressions

Scheme programs are composed of S-expressions (Symbolic Expressions). We can define their structure formally.

Tokens (Lexical Analysis)

Name     Regex                  Examples
====     =====                  ========
LP       \(                     (
RP       \)                     )
QUOTE    '                      '
INT      -?[0-9]+               123
BOOL     #[tf]                  #t
STR      "([^"\\]|\\.)*"        "hello world"
SYM      [^-()#"';0-9\s]\S*     variable-name
ignore   ;.*                    ; comment
ignore   \s

Note on Symbols: Symbols are essentially identifiers. Unlike strings, they are atomic and often used to represent variables or operators. The symbol foo is a single, indivisible token. And unlike in most other programming languages, symbols can contain (and even start with) all kinds of special characters as long as they don’t have another special Scheme meaning, so for example + and <= are both considered symbols.

Grammar (Syntactic Analysis)

An S-expression can be defined with a simple context-free grammar using the terminals defined in the table above.

explist -> exp explist
        -> ε
exp -> INT
    -> BOOL
    -> SYM
    -> STR
    -> LP explist RP
    -> QUOTE exp

Essentially, a Scheme program is either a single atom, or a parenthesized list (which contains other S-expressions), or a quoted S-expression.

I want you to appreciate how small and concise this syntax specification is. Despite being a complete, powerful programming language, it has a smaller syntax specs than many of the languages you implemented from Unit 1 in this class! This simplicity is one of the attractive and powerful things about Scheme.

Core Concepts & Constructs

Expressions and Evaluation

In Scheme, (almost) everything is an expression that evaluates to a value.

Example:

(+ 3 4) ; Evaluates to 7
; 1. `+` is a symbol that evaluates to the addition procedure.
; 2. `3` evaluates to 3.
; 3. `4` evaluates to 4.
; 4. The addition procedure is applied to 3 and 4.

This prefix notation is consistent for all operations.

(> 10 5) ; Evaluates to #t
(- (* 2 3) 1) ; Evaluates to 5

Defining Constants and Procedures

As we said before, almost every S-expression in Scheme is an expression that evaluates to a value. One exception is the define special form, which is used to declare variables. (A “special form” looks like a function call but has its own evaluation rules.)

Defining a Constant: (define <symbol> <S-exp>)

(define ten 10)
(define great-fun "mountain biking")
(define also-ten (* 5 (+ 1 1)))

Defining a Procedure (Function): The fundamental way to create a function is with lambda.

(lambda (<param1> <param2> ...) <body>)

A lambda expression creates an anonymous procedure.

(lambda (x) (* x x)) ; A procedure that takes one argument `x` and returns its square.

To use it, you can either apply it directly or, more commonly, give it a name with define.

; Direct application
((lambda (x) (* x x)) 5) ; Evaluates to 25

; Naming the procedure
(define square (lambda (x) (* x x)))
(square 5) ; Evaluates to 25

Scheme provides syntactic sugar for defining named functions, which you will see more often: (define (<name> <param1> ...) <body>)

; The canonical way to define a named function
(define (square x)
  (* x x))
This form is automatically translated into the (define square (lambda ...)) structure internally.

If expressions

Most languages you are used to have if statements, like

if (some_condition) {
  run_this_code_if_true();
}
else {
  run_this_code_if_false();
}

They also often have an if/else expression that evaluates to a value, sometimes called the “ternary if”. In C this is written like:

some_condition ? value_if_true : value_if_false

But because C is more of an imperative language focused on executing commands (statements), the first form is much more common.

Scheme on the other hand is primarily a declarative language which is focused on computing the values of expressions. So the primary way to do if/else in scheme is with an expression, like

(if (>= temp 70) "nice" "chilly")

In general, the form is (if some_condition value_if_true value_if_false).

A few things to note here:

Quoting and Lists: Code as Data

A core idea in Lisp and Scheme is homoiconicity: code is represented by the language’s own data structures. To handle code as data, we must tell the interpreter not to evaluate an expression. This is the role of quote.

The quote special form takes one argument and returns it without evaluating it. The single quote character (') is syntactic sugar for quote.

Quoting Symbols: Normally, a symbol is treated as a variable to be evaluated. Quoting it gives you the symbol itself.

(define my-sym 'foo)
; my-sym is now bound to the symbol 'foo', not the value of a variable named foo.

Quoting Lists: This is the most common use. It allows us to create a list of data.

'(1 2 3) ; This is the list of numbers 1, 2, 3. It will not be evaluated.
'(a b c) ; A list of symbols. Without the quote, Scheme would try to call function 'a'.

Fundamental List Operations

In Scheme, a list is formally defined as either:

  1. The empty list: '()
  2. A pair (cons A B) where A is any value and B is a list.

This recursive structure leads to three crucial functions: cons, car, and cdr.

cons (Construct Pair) cons takes two arguments, an element and a list, and returns a new list with the element added to the front.

(cons 1 '(2 3 4))    ; Evaluates to '(1 2 3 4)
(cons 'a '())        ; Evaluates to '(a)
(cons '(1 2) '(3 4)) ; Evaluates to '((1 2) 3 4)

car (Contents of the Address Register) car returns the first element of a non-empty list.

(car '(a b c))     ; Evaluates to 'a
(car '((1 2) 3 4)) ; Evaluates to '(1 2)

cdr (Contents of the Decrement Register) cdr (pronounced “could-er”) returns the list containing everything after the first element.

(cdr '(a b c))  ; Evaluates to '(b c)
(cdr '(a))      ; Evaluates to '()

These are the fundamental building blocks for all list processing in Scheme. For example, to get the second element of a list, you combine car and cdr:

(car (cdr '(a b c))) ; (cdr '(a b c)) -> '(b c) -> (car '(b c)) -> 'b

(In fact, this operation is so common that it has a shortcut name, cadr.)