Lab 5: Parse Lab II

This lab contains both written and electronic parts, to be submitted separately. Each exercise is marked according to whether it should be part of the electronic or written submission. Both parts are due at the beginning of your lab period next week, as usual. And as always, you are advised to type up your written parts whenever possible so that they are legible.

You can (and should) work with a parter for this lab, and if so, only one of you should submit each part. And of course include both your names on everything you submit, electronically or on paper.

Your electronic submission should come from a folder called lab05 and include the following files. You can download all the starter code at once with the file lab05.tar.gz.

A grammar with a problem

We're going to consider a grammar for the language we use to define ... grammars. We have rules with symbols and arrows and bars for "or". These descriptions must themselves obey a grammar:

SS rule | rule
ruleSYM ARROW rlist
rlistrlist OR rhs | rhs
rhsrhs SYM | SYM

These rules are coded in a way Bison can understand in the file ex1.ypp. Take a look at this file, download it, and run bison on it.

Bison will produce an error message. The exercises below are about this error and what you can do to fix it. To help see the error, I recommend you consider the following valid input:

X -> a b   Y -> c d
where X,Y,a,b,c,d are all SYM tokens and -> is an ARROW token. Trace through the parse modelling the stack (which remember contains states as well as grammar symbols) at each step. Determine where you first run into a conflict, and try to figure out why it really happens.

Exercises:

  1. (written) What error message does bison give?
  2. (written) Run bison -v ex1.ypp, which will produce the file ex1.output. This file lists the states of the parser's CFSM along with the actions for each state, depending on the lookahead token.
    Draw (on paper) the part of the CFSM encompassing states 1,4,7,8,9,10,11, and 12. Draw this like we did in class, i.e., you don't have to worry about indicating where reduce actions should occur.
  3. (written) List the states that have conflicts, and explain why the ambiguity occurs. For each conflicted state, give an example valid string in the language that would cause this conflict to arise. (Come up with your own examples. I don't want to see the same example between two different submissions.)
  4. (written) Would more than one token of lookahead help? Explain.

Fix the problem grammar

You should be able to convince yourself that the conflicts aren't going to go away by simply refactoring the grammar. We have to change the language itself.

Exercises:

  1. (written) Modify the language so that the grammar will be SLR(1). Write out the token names, including any new tokens, and the grammar rules for your modified language. Try to make the change as minimal as possible.
  2. (electronic) Modify the ex1.lpp and ex1.ypp files to account for your change. Add an entry to features.txt for this exercise, including a short (plain language) description of how the language has been modified.

Another grammar with a problem

Consider the following grammar, which is supposed to be the basis for parsing expressions like 3*12 > 15+3 > 10;. The new token COMP is a comparison operator like > or <.
runres STOP run | res STOP
resexp | res COMP res
expexp OPA term | term
termterm OPM factor | factor
factorNUM | LP exp RP

This grammar is written for bison in ex2.ypp. Run bison -v ex2.ypp and you should see another error...

Exercises:

  1. (written) Which state has a conflict? Give an example of a valid string in the language that would make this conflict arise.
  2. (written) Modify the grammar only (not the language!) to make this conflict go away. You shouldn't need any new tokens. Just write the (fixed) grammar.
  3. (electronic) Modify ex2.ypp to incorporate your fix, and compile it to verify that bison no longer gives an error. Add an entry to features.txt indicating that you did this exercise. You shouldn't have to add any description there, since you didn't change the language!