Grades and deadlines

Initial deadline: 23:59 on Tuesday 23 September
How to submit: Turn in your completed lab2.2.yml file, along with everything in the src/main folder of your Maven project.

You must submit via the command line in order to preserve the folder structure. Use either of these two commands to do it:
```
club -csi413 -plab2.2 -f lab2.2.yml src/main
```
or
```
submit -c=si413 -p=lab2.2 lab2.2.yml src/main
```
(you can download the club tool here)

If you used an AI tool to help with this lab (you should use Gemini), turn in a file aichat.md as well. Remember the course guidelines for the use of generative AI on labs.
Grading:

In this lab you will complete the following tasks:
1. Choose one of the winning languages invented by your classmates in last week’s lab
2. Write a working interpreter in Java using ANTLR and Apache Maven
3. Thoroughly test your interpreter yourself on a wide variety of statements and small programs in your language, so you are confident it works 100%.
If your submission meets the requirements for each task, you will receive 10 points towards your total lab grade.

Resubmissions:

We will follow the same resubmission policy for all labs this semester:

```
def points_earned(deadline, max_points, previous_submission=None):
    while current_time() < deadline:
        wait()
    submission = get_from_submit_system()
    if meets_all_requirements(submission):
        return max_points
    elif significant_progress(previous_submission, submission):
        return points_earned(deadline + one_week, max_points, submission)
    else:
        return 0
```

Getting started: files

Make a new directory for this lab.

Download the yaml file for today into your lab directory and be sure to fill it in with your language name.

Copy everything from the calc folder into your new folder. You should already have downloaded the calc language interpreter during our in-class exercise with ANTLR.

This will be your “starter code” for this week’s lab. But the language you are implementing will be vastly different from the calc language! So you will end up ripping out much of what’s there and replacing it with new token specs, grammar, and parse tree visit methods corresponding to your language for this lab.

Task 1: Choose a language

Here are specs for the winning languages invented by your classmates in last week’s lab. During lab, your section leader will run a “draft” to decide who works on what language.

Take note of who is working on the same language as you! This applies across all sections of SI413:

If someone is working on the same language as you, they are eligible to act as a “final reviewer” for this lab. That should involve only testing your complete, working interpreter, not sharing any source code or helping with debugging.
If someone is working on a different language than you, then it is OK for you to help each other with debugging your code for the lab, as long as this collaboration is clearly documented in the code and in your YAML file.

Example Program

I recommend you start by writing an example program in your language. It doesn’t need to be comprehensive and you don’t need to turn it in. If you write a small program that you have in mind as your first target for interpreting, you could tokenize it and build the parse tree by hand, which will help you think about how your initial interpreter should work to evaluate that program.

Task 2: Writing your interpreter

Write a complete, working interpreter for your programming language, in Java, using the Tokenizer class provided and the ANTLR parser generator.

I will compile and run your code like this:

```
mvn compile
./run.sh input-program.txt
```

Critical files for your implementation

There are really three files you should be focusing on for your implementation:

src/main/resources/si413/tokenSpec.txt contains the scanner specification, which is a prioritized list of token names and corresponding Java regexes. This file is read in by the Tokenizer class before processing the source code and producing a token stream.

src/main/antlr4/si413/ParseRules.g4 contains a specification for the ANTLR parser generator. That specification amounts to a listing of the tokens (which should match those in your tokenSpec.txt), and then a context-free grammar where each production rule is tagged with a #UniqueName. When you compile your project using Maven, it starts by running the ANTLR tool to read this .g4 file and produce Java classes for the custom parser it generates for your language.

src/main/java/si413/Interpreter.java is the actual Java code you will be writing to make the interpreter work. It has the main method to process the command-line filename argument, and the logic to call the tokenizer, then the parser, to produce a parse tree. Finally, the program is executed by “visiting” the parse tree nodes starting with the root node. The main logic here is in the inner classes that perform this “visit” action on the parse tree nodes. The existing methods from the calc language won’t make sense anymore, but you will replace them with methods corresponding to the rules in your grammar.
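To make the idea of a "prioritized list of token names and regexes" more concrete, here is a minimal, self-contained sketch of how such a list could drive a scanner. It is not the provided Tokenizer class: the token names, the regexes, and the rule that the longest match wins with ties going to the earlier entry are all assumptions made for illustration, and the real class may resolve priorities differently.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: NOT the course's Tokenizer class. Token names and
// regexes are made up for this example.
class PrioritizedTokenizerSketch {
    // Insertion order doubles as priority: earlier entries win ties.
    static final Map<String, Pattern> SPEC = new LinkedHashMap<>();
    static {
        SPEC.put("PRINT", Pattern.compile("print"));
        SPEC.put("ID", Pattern.compile("[a-z]+"));
        SPEC.put("NUM", Pattern.compile("[0-9]+"));
        SPEC.put("OP", Pattern.compile("[-+*=]"));
        SPEC.put("SKIP", Pattern.compile("\\s+"));
    }

    static List<String> tokenize(String source) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < source.length()) {
            String bestName = null;
            int bestEnd = pos;
            for (Map.Entry<String, Pattern> rule : SPEC.entrySet()) {
                Matcher m = rule.getValue().matcher(source);
                m.region(pos, source.length());
                // lookingAt() only matches starting exactly at pos.
                // Strict '>' keeps the earlier rule when lengths tie.
                if (m.lookingAt() && m.end() > bestEnd) {
                    bestName = rule.getKey();
                    bestEnd = m.end();
                }
            }
            if (bestName == null) {
                throw new IllegalArgumentException("no token matches at position " + pos);
            }
            if (!bestName.equals("SKIP")) {
                tokens.add(bestName + ":" + source.substring(pos, bestEnd));
            }
            pos = bestEnd;
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [PRINT:print, ID:x, OP:+, NUM:42]
        System.out.println(tokenize("print x + 42"));
    }
}
```

Under these assumed rules, the input "printer" comes out as a single ID token rather than PRINT followed by ID, because the longer match wins. This kind of keyword-versus-identifier interaction is worth checking when you order the entries in your own token spec.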

Tips on how to proceed

Start with the syntax specification (tokens and grammar). The format of the token spec file should be straightforward, but the format of an ANTLR grammar is slightly different from what is in the language spec, so translate that carefully.

The language designer has given you a token spec and grammar, so there shouldn’t be much work to do here other than reformatting the ANTLR grammar and giving a name for each production rule.

Next, I recommend removing all the parse tree node visiting methods from the inner classes in Interpreter.java, since the names of those methods will all need to change anyway compared to the calc language. Your goal is to get a minimal interpreter that compiles even though it won’t actually do anything beyond parsing the input file.

Once you get that to compile, your interpreter should be able to read in any valid program in the target language. But it won’t do anything yet, because you haven’t written the visit methods to actually go through and execute the parse tree.

So, the last thing to do is gradually fill in those methods, starting with just the few methods needed to execute the simplest possible program, and testing everything as you go.
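The "fill in a few visit methods at a time" approach can be sketched without ANTLR at all. The example below uses a tiny hand-built tree in place of the context classes that ANTLR generates, with one visit method per kind of node. Every class and method name here (Node, Num, Var, Add, Evaluator) is hypothetical and chosen only for illustration; in your project the method names come from the #UniqueName tags in your grammar.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a hand-built tree standing in for ANTLR's generated
// parse tree classes. All names here are hypothetical.
class VisitSketch {
    // One visit method per kind of node, like one per labeled grammar rule.
    interface Visitor<T> {
        T visitNum(Num node);
        T visitVar(Var node);
        T visitAdd(Add node);
    }

    static abstract class Node {
        abstract <T> T accept(Visitor<T> v);
    }

    static class Num extends Node {
        final int value;
        Num(int value) { this.value = value; }
        <T> T accept(Visitor<T> v) { return v.visitNum(this); }
    }

    static class Var extends Node {
        final String name;
        Var(String name) { this.name = name; }
        <T> T accept(Visitor<T> v) { return v.visitVar(this); }
    }

    static class Add extends Node {
        final Node left, right;
        Add(Node left, Node right) { this.left = left; this.right = right; }
        <T> T accept(Visitor<T> v) { return v.visitAdd(this); }
    }

    // Evaluating an expression means visiting its children first,
    // then combining their results.
    static class Evaluator implements Visitor<Integer> {
        final Map<String, Integer> memory = new HashMap<>();
        public Integer visitNum(Num node) { return node.value; }
        // No undefined-variable check here; this sketch assumes valid input.
        public Integer visitVar(Var node) { return memory.get(node.name); }
        public Integer visitAdd(Add node) {
            return node.left.accept(this) + node.right.accept(this);
        }
    }

    public static void main(String[] args) {
        Evaluator eval = new Evaluator();
        eval.memory.put("x", 40);
        Node tree = new Add(new Var("x"), new Num(2)); // represents x + 2
        System.out.println(tree.accept(eval)); // prints 42
    }
}
```

A reasonable first milestone in this style would be supporting only Num, then adding Add, then Var, running a small test program after each step.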

Error handling

Depending on the language, some programmer errors will be possible, for example, using a variable name before assigning it a value. The Errors class (in Errors.java) is there to help you with this: your interpreter should call Errors.error("some explanation") when it identifies a run-time error in the target language.

We won’t focus on the error handling too much in this unit, however. So it’s okay to mostly make sure your interpreter works correctly for valid programs.
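As a rough illustration of the undefined-variable case, the sketch below checks a variable table before returning a value. The error method here is a hypothetical stand-in so that the example runs on its own; it throws an exception, which may not be what the real Errors.error does. In your interpreter you would call Errors.error directly instead.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: the names memory and lookup are hypothetical.
class UndefinedVariableSketch {
    // Hypothetical stand-in for the course's Errors.error(...). The real
    // method's behavior (for example, printing and exiting) is assumed,
    // not known; here we simply throw.
    static void error(String explanation) {
        throw new IllegalStateException(explanation);
    }

    static final Map<String, Integer> memory = new HashMap<>();

    // Returns the variable's value, or reports a run-time error if the
    // program uses the name before assigning to it.
    static int lookup(String name) {
        Integer value = memory.get(name);
        if (value == null) {
            error("variable '" + name + "' used before assignment");
        }
        return value;
    }

    public static void main(String[] args) {
        memory.put("x", 5);
        System.out.println(lookup("x")); // prints 5
        try {
            lookup("y");
        } catch (IllegalStateException e) {
            System.out.println("error: " + e.getMessage());
        }
    }
}
```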

Task 3: Testing

Ideally you should be testing as you go, building up some example programs of your own that exercise each part of the language as you add the methods to support it. With that kind of approach, you should have confidence your interpreter works 100% at the end.

Moreover, as our languages get bigger, it will be harder to quickly check every possible language construct in a few lines. So you will want to stay organized and make sure you don’t break old features when you add the code for new ones.
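One way to stay organized is to keep each test program next to its expected output and rerun all of them whenever you add a feature. The sketch below shows the idea in miniature. The toyInterpreter method is a hypothetical stand-in that just adds integers separated by +; in practice you would run your real interpreter on each saved program file and compare what it prints.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Illustrative only: a tiny regression harness. All names are hypothetical.
class RegressionSketch {
    // Runs every program and returns the ones whose output differs
    // from what was expected.
    static List<String> failures(Function<String, String> interpreter,
                                 Map<String, String> expectedByProgram) {
        List<String> failed = new ArrayList<>();
        for (Map.Entry<String, String> testCase : expectedByProgram.entrySet()) {
            String actual = interpreter.apply(testCase.getKey());
            if (!actual.equals(testCase.getValue())) {
                failed.add(testCase.getKey());
            }
        }
        return failed;
    }

    // Hypothetical stand-in for a real interpreter: sums integers
    // separated by '+'.
    static String toyInterpreter(String program) {
        int sum = 0;
        for (String part : program.split("\\+")) {
            sum += Integer.parseInt(part.trim());
        }
        return Integer.toString(sum);
    }

    public static void main(String[] args) {
        Map<String, String> cases = new LinkedHashMap<>();
        cases.put("1 + 2", "3");
        cases.put("10+5", "15");
        List<String> failed = failures(RegressionSketch::toyInterpreter, cases);
        System.out.println(failed.isEmpty() ? "all tests pass" : "failed: " + failed);
    }
}
```

Because old cases stay in the table, a change that breaks an earlier feature shows up as soon as you rerun the harness.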