Homework 6: Grammars and Parsing
- Print version, with cover page (pdf)
- Due Date: Monday, October 4
1 Fixing a grammar
The following grammar defines a language for (opening) HTML tags, like <input checked type="checkbox">
start →
LA
innerRA
inner →NAME
attrs
attrs → attrs single
attrs → \(\varepsilon\)
single →NAME
EQ
VAL
single →NAME
Fix this grammar (without modifying the language it accepts!) to make it LL(1). Write your new, updated grammar below.
2 Top-down parsing
Using your LL(1) grammar from the previous question, show the partial parse tree that would result from reading the following sequence of tokens, in a top-down parser.
LA
NAME
NAME
NAME
EQ
VAL
Note, this sequence does not parse all the way up to the start symbol. So it will be a partial parse tree, with some items un-expanded or un-matched.
3 Bottom-up parsing
Now use the original grammar from problem 1 to complete a partial parse of the same sequence of tokens, but this time using a bottom-up parsing strategy. Here are the grammar and tokens again:
start →
LA
innerRA
inner →NAME
attrs
attrs → attrs single
attrs → \(\varepsilon\)
single →NAME
EQ
VAL
single →NAME
LA
NAME
NAME
NAME
EQ
VAL
Again, this will be a partial parse. But for bottom-up parsing, that means we will have a “forest” of multiple trees, that haven’t yet combined all the way to get back to the start symbol.
Assume the next token of look-ahead is RA
, in order to combine the partial parse trees (reduce) as much as possible.