You may want to use the GUI by running gst-blox
for
development. However, I will test your code by running the command-line
version gst
as described below.
Save your program in a file called proj.st
in folders called proj1
and proj2
respectively for phases 1 and 2 of your project.
I will test your code in the same environment as the lab
machines in MI 302, using the commands
/usr/bin/gst proj.st
For phase 2 of your project, you will write a program to generate and
then execute a bottom-up parser. You should submit your program in a folder
called proj2
, and be sure that is runs as described above.
Your program will take as input the output from a call
to bison -v
that we saw in Lab 5.
That is, we will let bison generate and describe the CFSM for bottom-up parsing
in some language, and then your program will actually create this CFSM and
attempt to parse a string of tokens in the language.
Specifically, your program should take as input a file called
spec.output
. The format of this file is the same as any other
.output
file produced when you run bison -v
. The
file contains, in order:
$end
.Your program must read in this spec.output
file and create the
CFSM that is described there. This CFSM will simply be an ordered array of
states. You should make a new object type for a CFSM state that will contain
information about the transitions and actions for that state.
After creating the CFSM, you should read a series of strings from standard in (which you may safely assume are all token names), and parse that string of tokens using the CFSM. This will involve maintaining a stack of symbol-state pairs as described in class, an indicator of the current (or next) state, and the saved value of the next "peeked at" token from the input. Your parser doesn't need to do any interpreting, but it should print out the symbols of the stack at every step in the process. These should be printed on a single line, separated by spaces, from the bottom of the stack to the top. So for instance a partial stack in parsing the Scheme language might be printed as
LP exprseq LP expr
And of course
your program should
also identify and report any parse errors along the way. After reading
the token $end
, your program should halt. That is, you only have
to parse a single stream of tokens.
The CFSM state object should contain code to actually perform the parser actions. I suggest you store a list of symbol-action pairs in each state, where each action is either a positive integer, meaning to shift and go to that numbered state, or a negative integer, meaning to reduce by that (negative) numbered rule in the grammar. After peeking at the next token, the CFSM transition function should either produce a parse error if there is nothing to do for that kind of token, or else perform the specified shift or reduce action. For a shift, this means adding the next token to the top of the stack and updating the current state. For a reduce, this means looking at the specified grammar rule, popping a certain number of symbols off the stack, adding a certain non-terminal to be the next "peeked at" symbol on the input stream, and finally updating the current state to be the saved state from the top of the stack.
I suggest you write your program in the following steps. Of course you are free to develop however you wish. As always, you are encouraged to submit every time you get some small step of the program working.
spec.output
file.
Instead, start by making your object definitions and the global storage
for the stack, next state, and peeked-at input symbol.
spec.output
file. I would suggest creating a
spec.output
CFSM description that exactly matches a
manually-created CFSM that you have working. Then, piece by piece, remove
the parts of your program that specify some part of the parser, and instead
read that input from the spec.output
file. Stop often for
testing!
.ypp
bison specification files
that we have used in class and in labs can generate a .output
file that your program should be able to read. Examine the stack contents
at each step to make sure the bottom-up parse is executing correctly.
You do not have to create your program in the exact way that I have suggested. However, for full credit your program should
The file spec.output is bison's output for a very simple grammar for adding and subtracting numbers. If this file is in the current directory, and I compile and run your program as above, then by typing the string
NUM OPA NUM $end
into standard in, your program should write
NUM exp exp OPA exp OPA NUM exp st st $end
to standard out, and exit with status 0.
If I typed the string
NUM OPA OPA $end
instead, your program should write
NUM exp exp OPA Parse error!
and exit with nonzero exit code.