Today we will see how to use automated tools to generate a scanner and parser based on a token spec (regexes) and grammar that we provide, and then how to use the resulting parse tree to write an interpreter for the calculator language.

We will use these same tools for the interpreters and compilers we create in labs for the rest of the semester.

Getting the code

Download calc.tgz.

Extract this tarball (in Linux) with the command:

tar -xzvf calc.tgz

That will create a folder calc with the following subfolders and files:

Note: Maven (and Java in general) is kind of annoying about which folders things go in, as you can see from the names above. These rules are strict, and are based on the package name si413 that we will use for our code this semester. Pay attention to the folder names! If you use an IDE, it will probably know about Maven and be able to help you.

Compiling and running

Go into the newly-created calc folder on the command line, and run

mvn package

That one command tells Maven to:

If the mvn command doesn’t work at all on your VM, you can run

sudo apt install maven

to install it. Otherwise, ask your instructor if you run into issues.

To run the interpreter on the example program, you can run mvn package to build the jar file, and then run

java -jar target/calc-1.0.jar calc-ex.prog

But this process (packaging the jar, then running it) is kind of painful to keep doing when you are developing code. I made a small bash script run.sh to help with that. By default, it compiles and then runs the main method of the Interpreter class (in Interpreter.java). So you can do

./run.sh calc-ex.prog

Understanding

The three crucial files which control the behavior of the interpreter are tokenSpec.txt, ParseRules.g4, and Interpreter.java. Look through these and make sure you understand how the pieces fit together.

When we run a program like

print(3*4)

what is happening in the tokenizer, the parser, and then in Interpreter.java? Where (in code) does the actual multiplication happen? Where does the actual printing occur?
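
To make the three stages concrete, here is a small hand-written sketch that traces print(3*4) from characters to tokens to a tree to a value. It is not the code that the tools generate from tokenSpec.txt and ParseRules.g4, and it does not reflect how Interpreter.java is organized; the class name PipelineSketch, the token names, and the tiny grammar (multiplication only) are all invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PipelineSketch {
    // Stage 1, tokenizer: each named group stands in for one regex line of a token spec.
    // The token names and patterns here are invented for this sketch.
    static final String[] NAMES = {"PRINT", "NUM", "MUL", "LP", "RP", "WS"};
    static final Pattern TOKENS = Pattern.compile(
        "(?<PRINT>print)|(?<NUM>[0-9]+)|(?<MUL>\\*)|(?<LP>\\()|(?<RP>\\))|(?<WS>\\s+)");

    static List<String[]> tokenize(String src) {
        List<String[]> out = new ArrayList<>();   // each entry is {type, text}
        Matcher m = TOKENS.matcher(src);
        int pos = 0;
        while (pos < src.length()) {
            m.region(pos, src.length());
            if (!m.lookingAt()) throw new IllegalArgumentException("bad input at " + pos);
            for (String name : NAMES) {
                // Whitespace is matched but never handed to the parser.
                if (m.group(name) != null && !name.equals("WS")) {
                    out.add(new String[] {name, m.group(name)});
                }
            }
            pos = m.end();
        }
        return out;
    }

    // Stage 2, parser: builds a tree for
    //   stmt: PRINT LP expr RP      expr: NUM (MUL NUM)*
    interface Expr { int eval(); }

    static List<String[]> toks;
    static int next;

    static String expect(String type) {
        if (next >= toks.size() || !toks.get(next)[0].equals(type))
            throw new IllegalStateException("expected " + type);
        return toks.get(next++)[1];
    }

    static Expr parseExpr() {
        final int first = Integer.parseInt(expect("NUM"));
        Expr left = () -> first;
        while (next < toks.size() && toks.get(next)[0].equals("MUL")) {
            next++;
            final int rhs = Integer.parseInt(expect("NUM"));
            final Expr lhs = left;
            // Stage 3, interpreter: the multiplication runs only when eval() is called.
            left = () -> lhs.eval() * rhs;
        }
        return left;
    }

    static String run(String src) {
        toks = tokenize(src);
        next = 0;
        expect("PRINT");
        expect("LP");
        Expr tree = parseExpr();
        expect("RP");
        return String.valueOf(tree.eval());   // walking the tree produces the value
    }

    public static void main(String[] args) {
        System.out.println(run("print(3*4)"));   // prints 12
    }
}
```

Notice that nothing is multiplied or printed while tokenizing or parsing; both happen only when the finished tree is walked. Keep that in mind when you look for the corresponding spots in Interpreter.java.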

Enhancements

Today in class you will get comfortable with this new build setup by making some changes to the language specs (tokens and grammar) and to the interpreter itself (the Java file), and then compiling, running, and debugging using Maven.

Try making the following enhancements to the calc language interpreter:

  1. The main statement commands are print and save. To avoid unnecessary typing, allow shortened versions of these commands with a single character like ! or $.

    (Should only require changing tokenSpec.txt)

  2. Right now, parentheses can’t be used for grouping sub-expressions, like

    print(2 * (3 + 4))
    

    Add support for parenthesized expressions.

    (Hint: you need to add a new grammar rule for expr, and then add a new method in the ExpressionVisitor subclass inside Interpreter.java.)

  3. The parentheses used for print and save statements aren’t actually necessary in this language. Get rid of them, so a program like this will work:

    print 1 + 2
    save 17 - 5
    print x * 2 + 20
    

    (Should require changing just one file - which one?)

  4. Add an exponentiation operator ^, so that we can write an expression like 3^4 and get 81.
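
For the last enhancement, the arithmetic itself can be sketched separately from the grammar change. The helper below is hypothetical (the name PowSketch and the choice of long are assumptions, since the value type used in Interpreter.java is not shown here). It computes integer powers by repeated squaring, which avoids the conversion to double that Math.pow would require. The handout does not say whether ^ should group to the left or to the right, so the example prints both readings of 2^3^2; the conventional mathematical choice is right-associative.

```java
public class PowSketch {
    // Integer exponentiation by repeated squaring.
    // Assumes a non-negative exponent and ignores overflow.
    static long power(long base, long exp) {
        if (exp < 0) throw new ArithmeticException("negative exponent");
        long result = 1;
        while (exp > 0) {
            if ((exp & 1) == 1) result *= base;   // low bit set: fold this power in
            base *= base;                         // square for the next bit
            exp >>= 1;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(power(3, 4));             // prints 81
        System.out.println(power(2, power(3, 2)));   // 2^(3^2), right-associative: prints 512
        System.out.println(power(power(2, 3), 2));   // (2^3)^2, left-associative: prints 64
    }
}
```

Which grouping your interpreter produces depends on how you write the new grammar rule, so test an expression like 2^3^2 after you add the operator.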