Due: before class on Monday 22 September
Looking back
- Consider the following token specifications, in order of priority:

      IF: if
      ID: [a-z0-9]+

  The scanner is processing the input string `if9`. According to the principles of scanning described in the notes, what token(s) will be produced?

  - An `IF` token with the value `if`, followed by an error.
  - An `IF` token with the value `if`, followed by an `ID` token with the value `9`.
  - A single `ID` token with the value `if9`.
  - A single `IF` token with the value `if9`.
- Suppose you have an ANTLR-based interpreter for the calculator language and want to add a new logical `AND` operator (`&&`) to the language. Which of the following actions are necessary to fully implement this feature, from token definition to execution? (Select all that apply.)

  - Modify `tokenSpec.txt` to define the regex for the new operator token.
  - Modify `Interpreter.java` to handle the new grammar rule and perform the logical `AND` operation.
  - Modify the `pom.xml` file to include a new dependency for logical operations.
  - Modify `ParseRules.g4` to add a new production rule for a logical `expr`.
  - Modify `Tokenizer.java` so that it can handle the character `&`.
- Suppose a top-down parser uses the following grammar for a simple graphics language:

      prog  -> cmd prog
            -> ε
      cmd   -> DRAW shape STYLE ID
      shape -> CIRCLE
            -> RECTANGLE

  The top-down parser has successfully matched a `DRAW` token and now needs to expand the `shape` nonterminal. It uses look-ahead and sees that the next token in the input stream is `RECTANGLE`. What is the parser's correct action?

  - It should report a syntax error because `RECTANGLE` is ambiguous.
  - It should shift the `RECTANGLE` token onto the parse stack.
  - It must look ahead an additional token (to `STYLE`) to make a decision.
  - It should predict and expand the rule `shape -> RECTANGLE`.
- Why are comments and whitespace handled by the scanner instead of the parser?

  - Because the grammar rules for comments and whitespace are too complex for a CFG to handle.
  - To make the parsing stage simpler and more efficient by removing tokens that don't affect the program's structure.
  - Because the parser can't access the original text spelling, only the token types.
  - To allow an IDE to easily apply syntax highlighting to comments.
- Consider a simple language for controlling a robot on a 2D plane. Below is an informal description of the language and a short example program.

  Language Description: A program consists of one or more commands, each ending with a semicolon. The language is case-sensitive. There are two kinds of commands:

  - A `move` command, which takes a single positive integer (the distance to move forward).
  - A `turn` command, which takes a direction keyword (`left` or `right`) followed by a positive integer (the number of degrees to turn).

  Whitespace (spaces, tabs, newlines) and comments (from a `#` to the end of the line) should be ignored by the scanner.

  Example Program:

      # Draw two sides of a square
      move 100;
      turn right 90;
      move 100;

  Your Task: Based on the description and example, write a scanner spec (list of token types and regexes) and parser spec (context-free grammar) for this language.