Due: before class on Monday 20 October
Looking back
There are plenty of review problems today to get you ready for the midterm next week.
-
Which of the following statements about language fundamentals are true? (Select all that apply)
- The rules for what code looks like are its semantics, while the rules for what it means are its syntax.
- An expression is a piece of code that evaluates to a single value.
- A statement is a piece of code that tells the program to perform an action but does not itself evaluate to a value.
- A variable declaration, such as
String creature = "axolotl";, is an example of an expression.
-
The following LLVM IR code is invalid. Why?
define i32 @calculate() { %result = add i32 10, 5 %result = mul i32 %result, 3 ret i32 %result }- The
mulinstruction is not a valid LLVM command. - It violates the Single Static Assignment (SSA) rule by reassigning a
value to
%result. - It does not use the Three-Address Code (3AC) format.
- The function is missing a
target tripledeclaration.
- The
-
A situation where a language syntax says a piece of code is valid but does not define exactly what should happen when it runs is called what?
- A compilation failure
- A syntax error
- Underspecified Behavior (UB)
- A runtime exception
-
Which of the following descriptions of compilers vs interpreters are correct? Select all that apply.
- A compiler reads source code and executes it directly, while an interpreter translates it into another language like machine code.
- An interpreter will translate code into an abstract syntax tree (AST), whereas a compiler instead uses a parse tree.
- Languages which have dynamic typing such as Scheme tend to be compiled rather than interpreted
- Low-level systems languages such as Rust tend to be interpreted rather than compiled.
- An interpreter typically includes more stages of optimizations than a compiler.
-
The typical structure of an LLVM instruction, such as
%dest = command %src1, %src2, is an example of which format?- Single Static Assignment (SSA)
- Abstract Syntax Tree (AST)
- Three-Address Code (3AC)
- Object-Oriented Programming (OOP)
-
What are the primary advantages of writing complex logic in a C helper function and calling it from a main program written in LLVM IR? (Select all that apply)
- It allows you to bypass the need to write complex, low-level LLVM IR for things like loops and conditionals by hand.
- The final executable will always run faster than one generated purely from handwritten LLVM IR.
- It simplifies the process by breaking down a hard problem into a more manageable one.
- It is the only way to link against standard libraries like
stdio.h.
-
What is the most common and reasonable way to determine the correct LLVM IR for a specific C language feature?
- Read the official LLVM language reference manual from start to finish.
- Write a small C program using the feature and compile it with
clang -S -emit-llvmto inspect the output. - Use the
lliinterpreter to experiment with different commands until one works correctly. - Modify an existing
.llfile and see if it still compiles.
-
When scanning for string literals, why is the regular expression
"[^"]*"a better choice than".*"?- The
.character does not match whitespace characters. - The
[^"]*pattern correctly handles escaped quotes (\") inside a string. - The
.*pattern is “greedy” and can incorrectly match everything between the first and last quote on a line, such as in"one" + "two". - The
*quantifier is more efficient when used with a character class.
- The
-
A scanner encounters the character sequence
if_stmt. Given thatifis a keyword and identifiers are defined as sequences of letters and underscores, how will this sequence be tokenized?- As
KEYWORD:ifandIDENTIFIER:_stmt, because keywords have higher precedence. - As a single token,
IDENTIFIER:if_stmt, because the scanner prefers the longest possible match. - It will cause a syntax error, as a keyword cannot be part of an identifier.
- As two separate tokens, because the parser needs to identify the
ifkeyword first to understand the program’s structure.
- As
-
Which of the following are characteristics of a declarative programming paradigm, such as the one favored in Scheme? (Select all that apply)
- The code explicitly describes the control flow and step-by-step state changes.
- The code focuses on defining the logic of what result should be computed.
- The programmer primarily uses mutable state and iterative loops to achieve results.
- Computation is treated as the evaluation of mathematical functions and expressions.
-
The Lisp family of languages, including Scheme, is known for its ability to represent its own code as a fundamental data structure (e.g., a list). What is the term for this property, and which feature is used to treat code as data by preventing its evaluation?
- Dynamic Typing;
define - Homoiconicity;
quote - Referential Transparency;
cons - First-Class Functions;
lambda
- Dynamic Typing;
-
Why are comments and whitespace typically handled by the scanner and discarded before the parsing phase begins?
- Context-free grammars are incapable of describing the syntax for comments or whitespace.
- To simplify the grammar for the parser, which would otherwise need to account for these elements in many different rules.
- Because scanners use regular expressions, which are inherently more powerful than the grammars used by parsers.
- Discarding whitespace is a necessary first step for all types of code optimization techniques.
-
Which statement accurately contrasts the core operations of top-down and bottom-up parsers?
- A top-down parser’s key decision is to shift or reduce; a bottom-up parser’s is to predict a grammar rule.
- A top-down parser starts from the grammar’s root and predicts which production to use to expand the tree downwards. A bottom-up parser starts from the tokens and reduces sequences to build the tree upwards.
- Top-down parsers build the tree from leaves to root, while bottom-up parsers build from root to leaves.
- Top-down parsers are only suitable for functional languages, while bottom-up parsers are used for imperative ones.
-
A scanner spec produces a single, combined DFA to help tokenize the input. Select all true statements about this final, combined scanner DFA.
- Each accepting state is labeled with the highest-priority token it recognizes.
- Transitions on whitespace characters often lead to a special state that discards the input rather than producing a token.
- To be a valid DFA, every state must have a transition for every possible character, often including transitions to a non-accepting “trap state” for invalid sequences.
- The total number of states in the combined DFA is always the sum of the states from the individual DFAs.
-
How would a Scheme interpreter process the expression
((lambda (x) (+ x 5)) 10)?- It multiplies
lambdabyxand5, then applies this result to10. - It identifies
(lambda (x) (+ x 5))as an expression that evaluates to a procedure, evaluates10to itself, and then applies the procedure to the argument10. - It reports an error because a
lambdaexpression cannot be the first element in a function call. - It treats the entire expression as a literal list of data due to the nested parentheses.
- It multiplies
-
Which principles are central to the functional programming style? (Select all that apply)
- Treating functions as “first-class citizens” that can be passed as arguments or returned as results.
- Preferring immutability, where operations create new data structures instead of modifying existing ones.
- Relying on iterative loops like
forandwhileas the primary means of repetition. - Focusing on direct, sequential modification of program state to achieve results.
-
What is the fundamental relationship between a context-free grammar and a valid parse tree?
- The leaves of the tree are nonterminals, while the internal nodes are tokens from the input stream.
- The root of the tree must be the grammar’s designated start symbol.
- Redundant or unnecessary tokens are removed from the parse tree for simplicity and readability.
- The left-to-right order of the leaf nodes (tokens) in the tree does not necessarily correspond to their order in the original input.
Looking ahead
-
Go through the first chapter in the tour of rust. It should only take a few minutes.
To check your understanding, consider the following function:
fn foo(a: [i32; 3]) -> bool { let x = a[0] < a[1]; let y = a[1] < a[2]; x and y }-
There is a syntax error in the function definition. What is the error? (Hint: you can copy-paste this into the rust playground) and try it!)
-
What does this function do?
-