Grades and deadlines

Initial deadline: 23:59 on Tuesday 28 October
How to submit: Turn in your completed lab3.1.yml file, along with everything in the src/main folder of your maven project.

You must submit via the command line in order to preserve the folder structure. Use either of these two commands to do it:
```
club -csi413 -plab3.1 -f lab3.1.yml src/main
```
or
```
submit -c=si413 -p=lab3.1 lab3.1.yml src/main
```
(you can download the club tool here)

If you used an AI tool to help with this lab (you should use Gemini), turn in a file aichat.md as well. Remember the course guidelines for the use of generative AI on labs.
Grading:

In this lab you will complete the following tasks:
1. Design a language (or modify one from previous labs) that supports all of the string and bool features from the Unit 2 labs, plus if statements and while loops.
2. Write an example program demonstrating how your language works, along with sample input/output values to test your example program
3. Write another program in your language that exactly matches the behavior of the “dance program” example given below.
4. Implement an interpreter for your new (modified) language, by writing up the token and grammar spec for your language, then writing code to convert the parse tree to an abstract syntax, using AST node classes provided in the starter code
5. Thoroughly test, then submit your code along with a completed YAML file
If your submission meets the requirements for each task, you will receive 12 points towards your total lab grade.
Collaboration

The same rules apply as with your interpreter, across all sections of SI413:
- If someone is working on the same language as you, you should not be collaborating with them since you are doing the same task.
- If someone is working on a different language than you, then it is OK for you to help each other with debugging your code for the lab, as long as this collaboration is clearly documented in the code and in your YAML file.

Resubmissions:

We will follow the same resubmission policy for all labs this semester:

def points_earned(deadline, max_points, previous_submission=None):
    while current_time() < deadline:
        wait()
    submission = get_from_submit_system()
    if meets_all_requirements(submission):
        return max_points
    elif significant_progress(previous_submission, submission):
        return points_earned(deadline + one_week, max_points, submission)
    else:
        return 0

Getting started: files

Start by downloading the starter code. Running this command will create a new directory lab3.1:

git clone https://github.com/si413usna/startlab.git -b lab3.1 lab3.1

In there you will find:

lab3.1.yml: for you to fill in and submit
pom.xml: config file for maven
run.sh: script to run your interpreter
dance_program.py: A program you need to translate into your language from Python (and write your version in the YAML file).
src/main/resources/si413/tokenSpec.txt: Example token spec file for a small subset of Python. You will replace this with the tokens for your language.
src/main/antlr4/si413/ParseRules.g4: Example ANTLR grammar for a small subset of Python. You will replace this with the grammar for your language.
Java code under src/main/java/si413/:
- ASTGen.java: Most of your work will be in this file. Here should be the logic to visit the parse tree nodes and create AST nodes.
- Stmt.java and Expr.java: All of the AST node classes for statements and expressions. You need to look at and understand the code here, but shouldn’t need to make changes.
- Interpreter.java: The main class for your interpreter. You should look at this, but probably won’t need to make changes.
- Errors.java: Error handler for tokenizer, parser, and compiler (you shouldn’t need to edit this)
- Tokenizer.java: Tokenizer logic (don’t edit this either)

Background: Java record syntax

In the AST node classes Stmt.java and Expr.java we use a syntax in Java that might be unfamiliar to you: Java records.

There official documentation on this is pretty straightforward and I encourage you to check that out.

A record is simply a shorthand to make our code have less boilerplate. It’s really common to make a Java class where:

You have some fields (types and names of class variables)
The exact same fields show up as arguments to your constructor
Each field also has its own getter method

Rather than having to repeat yourself three times, the record syntax was introduced to save typing and make your code more readable to handle this common use case.

Concretely, take this example, which is one of the AST node types from Stmt.java:

record AssignString(String name, Expr<String> child) implements Stmt {
    @Override
    public void exec(Interpreter interp) {
        String val = child.eval(interp);
        interp.getStringVars().put(name, val);
    }
}

This short record declaration is equivalent to the following class definition:

public class AssignString implements Stmt {
    // private fields auto-filled in from record arguments
    private String name;
    private Expr<String> child;

    // public constructor auto-created, matching record arguments
    public AssignString(String name, Expr<String> child) {
        this.name = name;
        this.child = child;
    }

    // public getter methods auto-created from record arguments
    public String name() { return name; }
    public Expr<String> child() { return child; }

    // a few more common methods auto-filled in for you,
    // like toString() and equals()

    // and finally whatever methods are specified in the record definition
    @Override
    public void exec(Interpreter interp) {
        String val = child.eval(interp);
        interp.getStringVars().put(name, val);
    }
}

Because there are around a dozen different type of AST nodes, it would be a real pain to have to write out all the fields, constructors, getters, equalitiy check, etc. for every single one. The record syntax in Java just saves us the hassle and makes the code more readable for these straightforward kinds of classes.

Task 1: Design your language

You need a language that supports all of the string and boolean operations from Lab2.1, but now must additionally support if statements and while loops.

Your language can be derived from some language in prior weeks, but it still needs to be original, and have an original name.

Even though there is not too much to add with if statements and while loops, you still have some interesting choices:

Does your language have just if statements, or if/else statements, or both kinds?
How does the programmer indicate where the “body” of the if statement or while loop begins and ends?

Fill in the langauge_name and language_description fields in the YAML file.

(Note: Unlike in previous units, you do not need to write out the syntax of your language. That is because you will also be turning in a working interpreter that will have to include the token spec and grammar files anyway.)

Task 2: Write an example program.

Write a complete example program that demonstrates all the features of your language. Make good use of variable names, spacing, and comments so that your code is easily readable and understandable.

Be sure to especially demonstrate how your if statements and while loops work! Your example program must include at least one nested if statement and at least one nested while loop.

Fill in the example_program field, as well as example_input1, example_output1, example_input2, and example_output2, in the YAML file.

We will use your own code example and input/output to test your interpreter, so make sure it’s precise. Come back and check it after you get your interpreter working!

Task 3: Write the dance program in your language

Here is a simple Python program that demonstrates the various boolean, string, and control flow features your language should support:

print("enter two dance moves")
x = input() + "m" + input()
print("Let's do the " + x)
y = ("mb" in x)
if x < "mambo":
    print(x[::-1])
    y = False
print(y)
print("a" in x and not y)
timer = ""
while not ("..." in timer):
    print("cha")
    timer = timer + "."

You need to write a program in your language which is exactly equivalent to this one in terms of its run-time behavior. That is, running your program in your (eventually) working interpreter, should be the same as running the code above using the python3 interpreter.

Fill in the dance_program field in the YAML file. You don’t need to provide sample input/output pairs for this one.

Task 4: Write your interpreter

The starter code you downloaded has a working implementation of an interpreter for a very small subset of Python. For example, it would work on a two-line program like this:

print("one" + "two" + "three")
print(not not not True)

This is way less powerful than the interpreters you already wrote in the previous unit. But what’s different now is how this interpreter works!

Rather than directly executing the code while walking the parse tree in the various visitX() methods, this interpreter first creates an AST, and then calls the exec() method on the root node of the AST to run the program.

Here’s how it works specifically:

Interpreter.java has the main method, and it follows the steps of interpretation we’ve discussed in class fairly exactly: first tokenizing (scanning), then parsing (to create a parse tree), and then converting this parse tree to an AST, and finally executing the AST.

As before, the scanning and parsing are done based on the tokenSpec.txt and ParseRules.g4 files that live in their own special subdirectories of the maven project.
Stmt.java and Expr.java contain all of the code for the AST nodes.

These AST nodes are complete and should cover your entire program, even though just a couple of them are used for the small Python example that works right now. You should not need to make changes here.
ASTGen.java is where the crucial connection happens: converting the ANTLR-generated parse tree to a single AST. This is where you will do most of your coding work for this lab.

So what you will need to do is:

Look through the AST nodes in Stmt.java and Expr.java to get an idea of how they are connected and how they work. You shouldn’t need to change these classes themselves, but you will need to write the code that creates all of these AST nodes and connects them together.
Wipe out the existing tokenSpec.txt and parseRules.g4 files, and replace them with the tokens and grammar for your new language.
Get to work in ASTGen.java in writing all the visit methods for your grammar rules.

But unlike in the previous unit, the actual logic to run your program and compute the various strings and bools should not go here. That logic is now in the AST nodes, and it’s already been written for you!

Instead, the visitX() methods will now all return AST node objects from Stmt.java and Expr.java. Instead of executing the actual program, your parse tree visitor methods are just converting each part of the parse tree to an equivalent (but simpler!) AST.

As usual, starting small and testing along the way will be key to your success. I recommend starting with something super simple, like printing a single boolean literal. Write a tiny example program file that just does that, then try to fill in enough in your Java code and token/grammar specs to make that work. Then add a little more, and a little more, testing along the way so you don’t get stuck.

Task 5: Test and submit

Once you think your interpreter is complete, you need to thoroughly test it and make sure it works perfectly. Of course, you were already testing your code as you went along, so this shouldn’t be a huge step.

Be sure to test your code on the two example programs you wrote earlier. That is also what we will use to test your code when you submit!

Follow the submit commands at the top of this page to turn everything in.

Optional fun challenge: Numbers!

Your programming language now supports strings, booleans, variables, conditionals (if statements), and looping. But still it doesn’t have any kind of integer type or arithmetic.

We could easily add numbers to the tokens, grammar, and interpreter that you wrote. And in fact we will add numerical support to your languages in a later unit this semester.

But this challenge isn’t about modifying the language further, it’s about using the language as-is to emulate numerical computation using strings. Thanks to the features you have now added (variables, if statements, and loops), your existing language and interpreter are able to do math with strings. Yes, it’s really possible!

I hope this is a fun challenge, but it’s also a valuable exercise that’s important when dealing with new programming languages or restricted environments: Given the limited (computing) tools you have, what’s the most you can do? This is actually the spirit of a “minimalistic” language like Scheme. So let’s practice on the languages you all have created!

Challenge part 1: Unary math

For starters, we will represent numbers in unary, as just a series of x’s. So like the number 5 would be represented by this string:

xxxxx

See if you can write a program in your language that reads in two unary numbers like this, then performs some basic arithmetic on those “numbers” and prints out the results in unary.

Specifically, your program should:

Read in two lines of input, each containing a unary number. The first number should be greater than or equal to the second one.
Add those two numbers and print out their sum, also in unsary.
Subtract the second (smaller) number from the first one, and print out the result in unary.

So for example, xxxxxxxx minus xxxxx would equal xxx.
Multiply the two numbers, and print the product in unary.
Divide the two numbers, and print out the quotient and remainder in unary.

Here is a complete sample run of how your program should work. In this example, the input numbers are 17 and 5; their sum is 22, difference is 12, product is 85, quotient is 3, and remainder is 2.

enter two numbers with x's:
xxxxxxxxxxxxxxxxx
xxxxx
sum:
xxxxxxxxxxxxxxxxxxxxxx
difference:
xxxxxxxxxxxx
product:
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
quotient:
xxx
remainder:
xx

Challenge part 10: Binary

Now do the same, but make it more efficient with a binary representation of numbers.

Because these are just strings you could use any two characters you like for 0 and 1. I used o and x.

Try to get the same four operations working as before. It will be tough! But it’s definitely possible, and even trying to work through this will stretch your thinking muscles.