Grades and deadlines

Initial deadline: 23:59 on Wednesday 05 November
How to submit: Turn in your completed lab3.2.yml file, along with everything in the src/main folder of your maven project.

You must submit via the command line in order to preserve the folder structure. Use either of these two commands to do it:
```
club -csi413 -plab3.2 -f lab3.2.yml src/main
```
or
```
submit -c=si413 -p=lab3.2 lab3.2.yml src/main
```
(you can download the club tool here)

If you used an AI tool to help with this lab (you should use Gemini), turn in a file aichat.md as well. Remember the course guidelines for the use of generative AI on labs.

Grading:

In this lab you will complete the following tasks:
1. Starting with the code for your working interpreter from the previous lab, write a compiler that reads in source code in your chosen language and produces an equivalent program in LLVM IR.
2. Thoroughly test your compiler by running the produced .ll programs using lli or clang.

If your submission meets the requirements for each task, you will receive 3 points towards your total lab grade.

Collaboration

Because you are all working from the same AST nodes, everyone’s task is more or less the same for this lab.

Because of that, you may not share your Java code, or look at anyone else’s Java code, for this lab.

However, you should feel free to discuss the lab with classmates (without looking at Java code), diagram out your solution, etc., as long as that collaboration is clearly documented in your YAML file.

Resubmissions:

We will follow the same resubmission policy for all labs this semester:

def points_earned(deadline, max_points, previous_submission=None):
    while current_time() < deadline:
        wait()
    submission = get_from_submit_system()
    if meets_all_requirements(submission):
        return max_points
    elif significant_progress(previous_submission, submission):
        return points_earned(deadline + one_week, max_points, submission)
    else:
        return 0

The big idea

In the previous lab, you were given working code for a complete AST, and were asked to write the code which would generate the AST from a parse tree in your language. The interpreter just did scanning, parsing, AST generation (semantic analysis), and then called exec() on the root AST node.

In this lab, almost all of that will stay the same: scanning, parsing, and AST generation.

But what will be completely different is what happens with the AST itself. The Stmt and Expr interfaces both have a new method compile() which needs to be implemented in every AST node. The job of each compile method is to output LLVM IR code corresponding to whatever that AST node does, and to recursively compile any child nodes.
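
To make the recursive idea concrete, here is a minimal sketch of how a compile() method might emit code for its children first and then for itself. Everything in it is an assumption made for illustration: the Emitter helper, the Num and Add classes, the %r register names, and the use of i64 values are hypothetical and are not the starter code's actual classes or signatures.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names come from the starter code.
final class CompileSketch {
    /** Collects emitted LLVM IR lines and hands out fresh register names. */
    static final class Emitter {
        private final List<String> lines = new ArrayList<>();
        private int nextReg = 0;

        String freshReg() { return "%r" + (nextReg++); }
        void emit(String instr) { lines.add("  " + instr); }
        String text() { return String.join("\n", lines); }
    }

    /** Each expression emits its own code and returns the operand holding its value. */
    interface Expr {
        String compile(Emitter out);
    }

    static final class Num implements Expr {
        private final long value;
        Num(long value) { this.value = value; }

        @Override
        public String compile(Emitter out) {
            // A constant needs no instruction; it can be used directly as an operand.
            return Long.toString(value);
        }
    }

    static final class Add implements Expr {
        private final Expr lhs;
        private final Expr rhs;
        Add(Expr lhs, Expr rhs) { this.lhs = lhs; this.rhs = rhs; }

        @Override
        public String compile(Emitter out) {
            // Compile the children first, then emit one instruction for this node.
            String l = lhs.compile(out);
            String r = rhs.compile(out);
            String dest = out.freshReg();
            out.emit(dest + " = add i64 " + l + ", " + r);
            return dest;
        }
    }

    public static void main(String[] args) {
        Emitter out = new Emitter();
        Expr e = new Add(new Num(1), new Add(new Num(2), new Num(3)));
        String result = e.compile(out);
        System.out.println(out.text());
        System.out.println("  ; the value of the whole expression is in " + result);
    }
}
```

The design choice worth noticing is that every node returns the name of the operand holding its value, so a parent never needs to know how its children were compiled.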

A lot of the AST compile() methods have already been filled in for you, because they do the same things that you already got working in the previous two compiler labs.

Your focus in this lab will be on three tasks:

1. Understanding how your scanner, parser, and AST generator from the last lab fit into the AST-based compiler in the starter code that you are given, and how it all works together.
2. Implementing variable assignment and reference, using memory in LLVM. Remember, as we have learned in class, that once our code has branching, we can no longer just store everything in registers. The compiler will need to allocate memory to store a variable's value the first time that variable is assigned.
3. Implementing if statements and while loops, using branching instructions in LLVM.
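
As a rough illustration of the memory idea in the second task, the sketch below tracks which variables already have a stack slot and emits alloca only on the first assignment. It is a sketch under stated assumptions, not the starter code's design: the VariableSketch class, its assign() and reference() methods, the %var. and %t naming schemes, i64 values, and the opaque ptr type (LLVM 15 or newer) are all hypothetical choices. It also assumes source variable names use only characters that are legal in LLVM identifiers.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names come from the starter code.
final class VariableSketch {
    private final List<String> lines = new ArrayList<>();
    // Maps each source variable name to the register holding its stack address.
    private final Map<String, String> slots = new HashMap<>();
    private int nextReg = 0;

    private String freshReg() { return "%t" + (nextReg++); }

    /** Emits IR for `name = value`, allocating a stack slot on first assignment. */
    void assign(String name, String valueOperand) {
        String slot = slots.get(name);
        if (slot == null) {
            slot = "%var." + name;
            lines.add("  " + slot + " = alloca i64");
            slots.put(name, slot);
        }
        lines.add("  store i64 " + valueOperand + ", ptr " + slot);
    }

    /** Emits IR that reads `name` and returns the register holding the value. */
    String reference(String name) {
        String slot = slots.get(name);
        if (slot == null) {
            throw new IllegalStateException("variable used before assignment: " + name);
        }
        String dest = freshReg();
        lines.add("  " + dest + " = load i64, ptr " + slot);
        return dest;
    }

    String text() { return String.join("\n", lines); }

    public static void main(String[] args) {
        VariableSketch v = new VariableSketch();
        v.assign("x", "5");            // first assignment: alloca + store
        String x = v.reference("x");   // load the current value
        v.assign("x", x);              // later assignment: store only
        System.out.println(v.text());
    }
}
```

Where the alloca instruction should land when the first assignment sits inside a branch or a loop body is a design question this sketch deliberately leaves open.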

There is a lot less code to write in this lab compared to some previous labs, but the code is getting more intricate and tricky. So you might have to think and plan more than before. Fortunately, this is the most fun part of programming and the reason computer science is a thing!

Getting started: files

Start by downloading the starter files. Running this command will create a new directory lab3.2:

git clone https://github.com/si413usna/startlab.git -b lab3.2 lab3.2

As usual, you should see an empty lab3.2.yml file for you to fill in, plus a pom.xml file for maven, and a src/ directory with all the starter code.

Remember that we said the scanning, parsing, and AST generation will work exactly the same as in the previous lab?

Well, that means copying these three files from the previous lab into the new one:

src/main/resources/si413/tokenSpec.txt
src/main/antlr4/si413/ParseRules.g4
src/main/java/si413/ASTGen.java

After getting all the files set up, try running your compiler on a simple program (with no variables, if statements, or loops) and testing the resulting .ll code using lli or clang. It should work!

Remember, if you have some example program in example.prog, you can test your compiler like this on the command line:

./run.sh example.prog example.ll
lli example.ll

Task 1: Write your compiler

Most of your work will be in Compiler.java, Expr.java, and Stmt.java. Concretely, your goal is to fill in the missing compile() methods. Each missing method in the starter code has a // TODO comment to help you spot them.

You basically need to implement variables and control structures, and that’s it! Remember what we have been doing recently in class: allocating memory (for variable storage), and implementing branching (for if statements and while loops).

Now you will need to work at one more level of abstraction, writing Java code that produces LLVM code which deals with memory allocation and branching.
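
To illustrate that extra level of abstraction, here is one possible shape for Java code that lays out the labels and branch instructions of a while loop and an if/else. It is only a sketch: the BranchSketch class, its method names, the label naming scheme, and the use of Supplier and Runnable callbacks to stand in for compiling child nodes are all hypothetical, and the output is a fragment of a function body rather than a complete .ll program.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch; none of these names come from the starter code.
final class BranchSketch {
    private final List<String> lines = new ArrayList<>();
    private int nextId = 0;

    void emit(String instr) { lines.add("  " + instr); }
    private void label(String name) { lines.add(name + ":"); }
    String text() { return String.join("\n", lines); }

    /** cond emits the condition's code and returns its i1 operand; body emits the loop body. */
    void whileLoop(Supplier<String> cond, Runnable body) {
        int id = nextId++;  // a unique suffix keeps labels from clashing
        String condL = "cond" + id, bodyL = "body" + id, endL = "end" + id;
        emit("br label %" + condL);   // every basic block must end in a branch
        label(condL);
        String c = cond.get();
        emit("br i1 " + c + ", label %" + bodyL + ", label %" + endL);
        label(bodyL);
        body.run();
        emit("br label %" + condL);   // jump back and re-test the condition
        label(endL);
    }

    /** Same callback convention as whileLoop, with separate then and else parts. */
    void ifElse(Supplier<String> cond, Runnable thenPart, Runnable elsePart) {
        int id = nextId++;
        String thenL = "then" + id, elseL = "else" + id, doneL = "done" + id;
        String c = cond.get();
        emit("br i1 " + c + ", label %" + thenL + ", label %" + elseL);
        label(thenL);
        thenPart.run();
        emit("br label %" + doneL);
        label(elseL);
        elsePart.run();
        emit("br label %" + doneL);
        label(doneL);
    }

    public static void main(String[] args) {
        BranchSketch b = new BranchSketch();
        // %n and @step are placeholders for whatever the real child nodes would produce.
        b.whileLoop(
            () -> { b.emit("%c = icmp ne i64 %n, 0"); return "%c"; },
            () -> b.emit("call void @step()"));
        System.out.println(b.text());
    }
}
```

Note that the condition is compiled after the cond label is emitted, so it is re-evaluated on every trip around the loop.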

This is all the guidance we are going to give here! Look through the code that you are given in Compiler.java, Expr.java, and Stmt.java to help get you started.

You are ready for this challenge. Good luck!

Extra challenge

To make your compiler extra awesome, try the following two enhancements:

1. Implement “short circuit” evaluation for and and or operations. That is, when the first argument to the and/or operation already determines the result, the compiled code should not evaluate the second argument. (Hint: more branch instructions!)
2. Right now, all the existing string code that the compiler produces allocates new memory on the heap using malloc, and that memory is never freed. In other words, the compiled code is a giant memory leak. Fix this so that your compiler produces memory-safe code where all memory gets free’d when it’s no longer needed, before the program finishes.
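
For the short-circuit enhancement, one possible approach (certainly not the only one) is to keep the result in a stack slot and branch around the second operand. The sketch below assumes boolean values are represented as i1 and uses the opaque ptr type (LLVM 15 or newer); the ShortCircuitSketch class, its method, and the sc.* names are hypothetical and not part of the starter code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch; none of these names come from the starter code.
final class ShortCircuitSketch {
    private final List<String> lines = new ArrayList<>();
    private int nextId = 0;

    void emit(String instr) { lines.add("  " + instr); }
    String text() { return String.join("\n", lines); }

    /**
     * Emits IR for `lhs and rhs` (isAnd == true) or `lhs or rhs` (isAnd == false).
     * Each supplier emits the code for one operand and returns its i1 operand.
     */
    String shortCircuit(boolean isAnd, Supplier<String> lhs, Supplier<String> rhs) {
        int id = nextId++;
        String slot = "%sc.slot" + id;
        String rhsL = "sc.rhs" + id;
        String doneL = "sc.done" + id;
        emit(slot + " = alloca i1");
        String l = lhs.get();
        // Start by assuming the result is just the left operand.
        emit("store i1 " + l + ", ptr " + slot);
        if (isAnd) {
            // and: the right operand only matters when the left one is true
            emit("br i1 " + l + ", label %" + rhsL + ", label %" + doneL);
        } else {
            // or: the right operand only matters when the left one is false
            emit("br i1 " + l + ", label %" + doneL + ", label %" + rhsL);
        }
        lines.add(rhsL + ":");
        String r = rhs.get();
        emit("store i1 " + r + ", ptr " + slot);
        emit("br label %" + doneL);
        lines.add(doneL + ":");
        String result = "%sc.val" + id;
        emit(result + " = load i1, ptr " + slot);
        return result;
    }

    public static void main(String[] args) {
        ShortCircuitSketch s = new ShortCircuitSketch();
        // %a and @expensive are placeholders for real compiled operands.
        s.shortCircuit(true,
            () -> "%a",
            () -> { s.emit("%b = call i1 @expensive()"); return "%b"; });
        System.out.println(s.text());
    }
}
```

In the emitted fragment, the call to @expensive sits in its own block and is skipped whenever %a is false. An alloca executed inside a loop takes new stack space on each iteration, so a more careful version might allocate the slot once at the top of the function or use a phi instruction instead.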