Summary

If/else example

Consider the following simple C function:

int foo(int x, int y) {
  int z;
  if (x < y) {
    z = x + y;
  }
  else {
    z = 15;
  }
  return z;
}

Because LLVM is a three-address code (3AC), we shouldn’t expect to have exactly the same code structure when we compile this to LLVM IR. The 3AC language won’t have nested block statements and if/else/while; instead, there are just branch instructions.

Here is how the foo function above might be translated to LLVM IR:

define i32 @foo(i32 %x, i32 %y) {
top:
  %zaddr = alloca i32
  %cond = icmp slt i32 %x, %y
  br i1 %cond, label %ifblock, label %elseblock

ifblock:
  %sum = add i32 %x, %y
  store i32 %sum, ptr %zaddr
  br label %end

elseblock:
  store i32 15, ptr %zaddr
  br label %end

end:
  %z = load i32, ptr %zaddr
  ret i32 %z
}

Read carefully through the code and try to understand how it works.

There are three things here which are new to us in LLVM:

And that’s it! This is actually all that you need to implement if statements, if/else statements, while loops, for loops, and any of these nested within each other.

Basic blocks

In LLVM IR, all the instructions inside a function can be divided into one or more basic blocks, each of which starts with an ordinary instruction and ends with a special https://llvm.org/docs/LangRef.html#terminator-instructions, usually either a br or ret.

Before today, the functions we wrote in LLVM just consisted of a single basic block, since they didn’t contain any branches.

Every basic block in LLVM also has a label which is assigned as some register value. This actually happens even if you don’t write a label name yourself. For example, when we ask clang to compile the foo function example above, it starts like this:

define dso_local i32 @foo(i32 noundef %0, i32 noundef %1) #0 {
  %3 = alloca i32, align 4
  ; ...

Look carefully at the register names. We know the compiler starts with %0 and just counts up sequentially, so why does it skip register %2?

It’s because %2 is implicitly assigned to hold the label for the first block, that we called top: in the example above.

Also notice that each basic block is kind of self-contained. From the way the br function is designed, there is no “fall-through” or “keep going” branch option. If you have a br instruction, it is definitely going to jump to some label; the question is just (in a conditional branch) which label it goes to.

Control Flow Graphs (the other CFG)

The reason that LLVM, and indeed most IR languages, have this notion of basic blocks is for the compiler to be able to reason about the control flow of the code.

This is enabled by something called the control flow graph or CFG. (To be clear, this is a general concept in compiler design that is not just about LLVM IR; however we will look at it through the LLVM context since that is what we are using in this class.)

The CFG follows all possible execution paths within a function between the basic blocks. Specifically, each node in the CFG is a basic block, indicated by its label, and each directed edge represents a possible execution path from the end of that basic block to the beginning of another one.

(Note: CFG is now an overloaded term in this class, because it also stands for Context-Free Grammar in the context of parsing. Hopefully the context (haha) will make the distinction clear.)

Here is the CFG for the foo function above:

The need for memory

The CFG is very useful for the compiler (and us!) to understand and optimize the LLVM IR code.

One important thing the CFG tells us is which registers can be referred to in each part of the code.

Obviously you can’t refer to a register before it’s assigned. Consider this small LLVM program, where we mixed up the order of the first two lines:

define i32 @main() {
  %four = mul i32 %two, %two
  %two = add i32 1, 1
  ret i32 %four
}

If you try to run this small LLVM program using lli, it will give you an error “Instruction does not dominate all uses!”, telling you that the register %two is accessed before it is defined.

That one is pretty easy to catch, but what if in the above foo program, we change the last line to

ret i32 %sum

This looks okay: the %sum variable is assigned on line 8, and then this ret instruction is further down, on line 18.

But it’s actually not okay! We get the same “does not dominate all uses” if we run the modified version through lli, and for (essentially) the same reason. But now we have to look at the CFG to understand why:

This is actually the reason why the variable z is stored in memory using alloca and %zaddr, instead of just storing this value in registers. Because LLVM is an SSA language, we can’t assign the same register more than once in a function, and that means that it would be impossible to store the register value as x + y in the “if” case, and as 15 in the “else” case. Instead, we use a memory location for the variable z, do a store command in the if and else blocks to update memory in two different ways, and then load the resulting value at the end before returning it.

More complex example

Here is a function (written in Python syntax) that computes the smallest prime factor p of an integer n:

def leastprime(n):
    p = 2
    while p*p <= n:
        if n % p == 0:
            break
        p += 1
    if p*p > n:
        p = n
    return p

Challenge yourself! See if you can come up with some equivalent LLVM IR code for this function, and then draw the CFG for your code.

LLVM IR code for leastprime
define i64 @leastprime(i64 %n) {
top:
  %paddr = alloca i64
  store i64 2, ptr %paddr
  br label %whilecond

whilecond:
  %p1 = load i64, ptr %paddr
  %psqr1 = mul i64 %p1, %p1
  %whilecheck = icmp slt i64 %psqr1, %n
  br i1 %whilecheck, label %whilebody, label %whiledone

whilebody:
  %p2 = load i64, ptr %paddr
  %mod = srem i64 %n, %p2
  %zerocheck = icmp eq i64 %mod, 0
  br i1 %zerocheck, label %whiledone, label %increment

increment:
  %p3 = load i64, ptr %paddr
  %pplus = add i64 %p3, 1
  store i64 %pplus, ptr %paddr
  br label %whilecond

whiledone:
  %p4 = load i64, ptr %paddr
  %psqr4 = mul i64 %p4, %p4
  %bigcheck = icmp sgt i64 %psqr4, %n
  br i1 %bigcheck, label %updatep, label %end

updatep:
  store i64 %n, ptr %paddr
  br label %end

end:
  %p5 = load i64, ptr %paddr
  ret i64 %p5
}
Control Flow Graph for leastprime