Unit 2: Recursion
1 Recursion
The idea of writing and using recursive functions was introduced all the way back in your IC210 or SI204 class. At that time, recursion may have seemed like a “neat trick” and opportunity to stretch your understanding of code, but not really something useful. That’s because, at the level of IC210, everything you did with recursion could also be done almost just as easily with a loop.
Well, Data Structures (and most especially, trees) is where recursion becomes really, really useful. When we have more sophisticated data structures and algorithms, we need recursion to write simple code that works and we can understand.
The simple fact is this: with many, many data structures, recursion is WAY easier to cope with than iteration is. Embrace it, because it’s about to make your life easier. Today, then, is an IC210-level reminder of what recursion is, and how it works
Recursion is a function calling itself. So, if we understand how function calls work, we’ll understand recursion a lot better. Imagine the following set of functions (in pseudocode):
f1() {
print "inside f1"
}
f2() {
print "starting f2"
f1()
print "leaving f2"
}
main() {
print "starting main"
f2()
print "leaving main"
}
If we were to run this program, the output would be:
starting main
starting f2
inside f1
leaving f2
leaving main
To understand why, let’s think about the call stack. The call stack works like a stack of plates: plates can only be placed on the top, and can only be removed from the top. For this reason, they’re often referred to as LIFO (last-in-first-out). Instead of plates, though, the call stack has the functions that are running; when a function is called, it is placed on the top of the stack. A function ONLY runs when it is at the top of the stack; the rest of the time, it sits and waits for the one above it to return.
Side note: in computer architecture you learned about how this actually works internally: The CPU maintains a special stack pointer register, which gets moved every time a function call begins to “make room” for that function’s saved registers and other memory usage, and the stack pointer is “restored” when the function returns.
Here we won’t worry about those low-level details, but the concept is the same. One cosmetic difference is that, in this class (as in IC210 and elsewhere not focused on assembly details) our stacks will always grow up, with the bottom of the stack on the bottom and the top on the top.
So, we start running the program. Obviously, it prints “starting main.” At this point, our call stack looks like this:
\[ \array{ \boxed{\texttt{main()}} } \]
Our next line is to call the function f2. So, f2 gets put on the top of the stack:
\[ \array{ \boxed{\texttt{f2()}} \\ \boxed{\texttt{main()}} } \]
We’ve done enough programming now to know that now f2 is running, not main. The reason, though, is because now f2 is on the top of the call stack, so main just sits and waits until f2 returns.
So, f2 runs, printing “starting f2.” It then calls f1, leading to this stack:
\[ \array{ \boxed{\texttt{f1()}} \\ \boxed{\texttt{f2()}} \\ \boxed{\texttt{main()}} } \]
So, f1 runs, and f2 (and main) sits and waits. f1 prints “inside f1,” then returns, “popping” off the stack, taking us back to this: \[ \array{ \boxed{\texttt{f2()}} \\ \boxed{\texttt{main()}} } \]
f2, back on the top of the stack, gladly continues where it left off, printing “leaving f2,” before returning and popping off the stack itself:
\[ \array{ \boxed{\texttt{main()}} } \]
Back on the top of the stack, main now finishes, printing “leaving main,” and returning, ending the program.
OK! So, recursion. Let’s consider the standard first-recursion-example of finding a factorial using recursion:
int fac(int n) {
if (n == 0)
return 1;
else
return n * fac(n-1);
}
If fac(3)
is run, it runs through happily, until it gets to “fac(n-1)
”, and which point, fac(2)
is placed on top of the stack. fac(2)
begins to run, and fac(3)
sits and waits for it to return. fac(1)
is then placed on top, and fac(2)
and fac(3)
sit and wait. Finally, briefly, fac(0)
is placed on the stack, just long enough to return 1
.
The story so far looks like this:
\[ \array{ \phantom{\boxed{\texttt{f}}} \\ \phantom{\boxed{\texttt{f}}} \\ \phantom{\boxed{\texttt{f}}} \\ \boxed{\texttt{fac(3)}} } \longrightarrow \array{ \phantom{\boxed{\texttt{f}}} \\ \phantom{\boxed{\texttt{f}}} \\ \boxed{\texttt{fac(2)}} \\ \boxed{\texttt{fac(3)}} } \longrightarrow \array{ \phantom{\boxed{\texttt{f}}} \\ \boxed{\texttt{fac(1)}} \\ \boxed{\texttt{fac(2)}} \\ \boxed{\texttt{fac(3)}} } \longrightarrow \array{ \boxed{\texttt{fac(0)}} \\ \boxed{\texttt{fac(1)}} \\ \boxed{\texttt{fac(2)}} \\ \boxed{\texttt{fac(3)}} } \longrightarrow \array{ \texttt{return 1} \\ \boxed{\texttt{fac(1)}} \\ \boxed{\texttt{fac(2)}} \\ \boxed{\texttt{fac(3)}} } \]
At this point, fac(1)
picks up where it left off, multiplying its n = 1, which is what fac(0)
, and returning that solution. fac(2)
then multiplies that returned value of 1 by its n = 2, and returning the solution. fac(3)
then multiplies that 2 by its n = 3, returning a final answer of 6.
The second half of the story looks like this:
\[ \array{ \texttt{return 1} \\ \boxed{\texttt{fac(1)}} \\ \boxed{\texttt{fac(2)}} \\ \boxed{\texttt{fac(3)}} } \longrightarrow \array{ \phantom{\boxed{\texttt{f}}} \\ \texttt{return 1} \\ \boxed{\texttt{fac(2)}} \\ \boxed{\texttt{fac(3)}} } \longrightarrow \array{ \phantom{\boxed{\texttt{f}}} \\ \phantom{\boxed{\texttt{f}}} \\ \texttt{return 2} \\ \boxed{\texttt{fac(3)}} } \longrightarrow \array{ \phantom{\boxed{\texttt{f}}} \\ \phantom{\boxed{\texttt{f}}} \\ \phantom{\boxed{\texttt{f}}} \\ \texttt{return 6} \\ } \]
1.1 Tips for writing recursive methods
We’re going to need a base case (the thing that makes everything stop running; in the above factorial example, it’s if n == 0), and at least one recursive case. The base case is usually easier: what’s an input that makes this problem really easy? Write that first, then make the recursive case build towards it.
2 Linked List traversal
Let’s define a Linked List of int
’s as follows:
public class LinkedList {
private class Node {
int data;
Node next;
Node(int data, Node next) {
this.data = data;
this.next = next;
}
}
private Node head, tail; // first and last Node in the list
//... remaining methods
}
Using that definition, a traversal of the list iteratively would look something like this:
void traverse() {
Node cur = head;
while (cur != null) {
//... perform some action on cur
cur = cur.next;
}
}
and here is a similar process, this time using recursion:
void traverse() {
traverse(head);
}
void traverse(Node cur) {
if (cur == null)
return;
//... perform some action on cur
traverse(cur.next);
}
Calling traverse()
in the iterative version or the recursive version results in the same actions being performed, on the same Nodes, in the same order. Make sure you understand how this works. Soon, there will be data structures where the iterative version is pages long, and the recursive version is five lines. You want to know and understand this.
As an example of a recursive traversal and action that we can take is a length()
method.
int length(Node cur) {
if (cur == null)
// length of an empty list is zero
return 0;
// length of a non-empty list is one more than the length when you
// chop off the head
return 1 + length(cur.next);
}
Another example is to add to a sorted linked list. For example, if we have a linked list holding the integers 1, 2, and 4, after running addInOrder(3)
, we would have a linked list with the Integers
1, 2, 3, and 4.
public void addInOrder(int element) {
head = addInOrder(head, element);
}
//This is a helper method, so our user doesn't have to know or care
//about nodes.
private Node addInOrder(Node cur, int element) {
if (cur == null) {
// we've reached the end of the list, so make a new tail
tail = new Node(element, null);
return tail;
} else if (element < cur.data) {
// the new element comes before the current node, so insert it here.
// Note that this constructor sets the "next" of the new node to "cur".
return new Node(element, cur);
} else {
// otherwise the new element comes after cur, so use recursion!
cur.next = addInOrder(cur.next, element);
return cur;
}
}
Make some examples to try the above on, and understand why it works! Your homework will go better.
2.1 Reversal of traversal
Okay, let’s try to do something a little more complicated: go through the elements of a linked list from last to first.
Of course, this is easy if we have a “doubly-linked list”, where each node has both a next
and a previous
pointer — then you could just start from the tail
and go backwards to the head
.
But doubly-linked lists are harder to maintain (more pointers to keep track of!) and also consume up to 2x as much memory. Plus, we don’t always want to go changing everything about the data storage just to add one more function. Take a moment to think: how would you write a function to traverse a (singly-)linked list in reverse order?
With iteration, this is kind of hard! The trouble is, you can easily get to the last thing in the list, but then how do you get to the second-to-last, and the third-to-last, etc?
Here’s one way to do it, which isn’t necessarily the best way, but anyway the code is relatively short. The idea is to keep track of a stop
Node, which is the node of whatever was just printed out. Then we move forward the cur
node from head
until just before the stop
, and that effectively steps backwards through the list.
void traverse_reverse_iter() {
Node stop = null;
while (stop != head) {
Node cur = head;
while (cur.next != stop)
cur = cur.next;
//... perform some action on cur
stop = cur;
}
}
Yes, this really works — try it if you want! But I think you’ll agree it’s pretty tricky to think about and to get right.
And besides that, this method is slow. Can you use your big-O skills to analyze it? In the inner while
loop, we are moving the cur
node ahead one step at a time until the stop
node, which is at most \(O(n)\). For the outer while
loop, the stop
node starts at the end of the list and effectively moves backwards one step every iteration, so that’s another \(O(n)\). These loops are nested, so we get \(O(n^2)\) total — quadratic time.
Not great! Let’s check out the recursive version of the same thing:
void traverse_reverse_rec(Node cur) {
if (cur == null)
return;
traverse_reverse_rec(cur.next);
//... perform some action on cur
}
(Of course we would probably want to also add a no-argument version of the function which starts with cur=head
, like before.)
Yes, those two lines of glory really work! Think about it carefully, work through a small example, and compare to the recursive version of forward traversal from above.
What you should notice is that the only difference here is the order of the last two lines. For forward iteration, we do something with cur
, then make a recursive call on cur.next
. In reverse iteration, we just do the opposite! No extra code, no fancy nested loops, just wonderful recursion.
It seems like traverse_reverse_rec
is much faster than traverse_reverse_iter
. But is it really? You should probably have a good guess about the big-Oh running time of the recursive version already, but let’s take some time to think about how we analyze recursive functions in general…
Aside: The real reason the recursive version can be so much simpler is because of the function call stack. Remember how, for the iterative version, we struggled to think about how to go backwards once we’ve reached the end of the list? Well, with recursion, we know exactly where the prior node to
cur
is — that’s the value ofcur
in whatever function recursively called this one, i.e., the function right below the current one on the stack.We could get the same performance out of an iterative version, but we would have to explicitly recreate the function call stack ourselves to store each node as we go through the loop.
So you see, the function call stack is the real power behind recursion that lets us write much simpler, clearer, even faster code. But it’s also something to be wary of — the stack space is limited, and many languages including Java have a recursion limit built in. So recursing a few hundred times is going to be fine, but many thousands or millions of recursions probably won’t work. It’s one of many tradeoffs to consider when designing a data structure!
3 Big-O with Recursion
3.1 Determining the Big-O of a Recursive Function
We have some practice identifying the Big-O of iterative algorithms. However, now that we’re (again) experts on recursion, it’s important that we be able to identify the Big-O of recursive algorithms. We’re not going to spend a huge amount of time on this; that’s Algorithms’ job (note for IT people: consider taking Algorithms. It’s an Important Class). However, it is important that you can identify some of the most common patterns.
Let’s start with an example: linked-list traversal.
void traverse(Node node) {
if (node == null) //base case
return;
//... perform some action on node
traverse(node.next) //recursive case
}
Obviously, we have two cases; a base case, and a recursive case. We’re going to write a function \(T\) that describes the number of steps taken in each case as a function of the size of the input \(n\). The input for this example will be the length of the list node
is the head of (because every node is the head of some linked list, right?). The base case can then easily be defined as when \(n=0\), or the length of the list starting with the current node is 0, as would occur when node
is null. The base case is easily defined then as:
\[T(0) = 2\]
This is because when \(n=0\), there is a constant amount of work to do (namely, do the comparison in the if-statement, and perform the return statement). Ultimately, this constant 2 won’t really matter in the big-O, but when we’re dealing with recursion it’s safer not to do “let’s ignore all coefficients and numbers because it’s Big-O” until the end.
Now what about the other case - if \(n > 0\)?
\[T(n) = 2 + T(n-1)\]
That is, it is two constant steps plus doing the work of performing the work on the next node in the list. This is because when \(n > 0\), we do some constant amount of work (if statement comparison, function call to traverse
, etc), then run it all again on the remainder of the list.
The two of these functions together are known as a recurrence relation. So the recurrence relation of this algorithm is:
\[\begin{align*} T(0) &= 2 \\ T(n > 0) &= 2 + T(n-1) \end{align*} \]
The same exact thing can also be written as
\[ T(n) = \begin{cases} 2,& n=0 \\ 2 + T(n-1), & n > 0 \end{cases} \]
3.2 Solving Recurrence Relations
Our five-step process for solving a recurrence relation is:
- Write down the recurrence relation.
- Expand the recurrence relation, for several lines.
- Figure out what you would write for the \(i\)-th line.
- Figure out what value of \(i\) would result in the base case.
- In your equation from (3), replace \(i\) with the solution from (4).
OK, so we’ve written it. Let’s do step 2.
\[\begin{align*} T(n) &= 2 + T(n-1)\\ &= 2 + \underbrace{2 + T(n-2))}_{T(n-1)}\\ &= 2 + 2 + \underbrace{2 + T(n-3)}_{T(n-2)}\\ \end{align*}\]
And so on. But this is probably enough to recognize the pattern. Let’s do step 3. Hopefully you agree that on the i-th line, we’d have:
\[T(n) = 2i + T(n-i)\]
So when does this stop? Well, that’s where the other part of our recurrence relation comes in: our base case equation tells us that when \(n=i\), it’s just \(2\). So, we figure out what value of \(i\) makes the \(T(n-i)\) equal to our base case of \(T(0)\). In this case, that happens when \(i=n\) because when \(i=n\) then \(n-i=0\).
Finally, we do step five, and take our equation \(T(n) = 2i + T(n-i)\), and replace \(i\) with our solution of step 4 (\(i=n\)), to get \(T(n) = 2n + T(0)\). Because \(T(n-n) = T(0)=2\), the runtime of that equation is
\[2n+2\]
which is \(O(n)\).
3.3 A Simpler Example
Let’s do one more, and this time, let’s do the simple recursive sum function:
int sum(int n) {
if(n==0) return 0;
else return n+sum(n-1);
}
In the base case, we can say one step occurs, since all we are doing is returning 0 and this occurs when \(n=0\).
\[ T(0) = 1 \]
The number of steps in the recursive case (when \(n>0\)) is one sum, one subtraction, and the function call — so three steps plus the number of steps in the recursive call.
\[ T(n) = 3 + T(n-1) \]
We can now write the recursive relation:
\[ T(n) = \begin{cases} 1,& n=0 \\ 3 + T(n-1), & n > 0 \end{cases} \]
Let’s now identify the pattern:
\[\begin{array} TT(n) &= 3 + T(n-1) \\ &= 3 + 3 + T(n-2) \\ &= 3 + 3 + 3 + T(n-3) \\ & \ldots \\ T(n) & =3i + T(n-i) \end{array}\]Substituting for \(i\) the value \(n\), we get
\[ 3n + T(n-n) = 3n+T(0) = 3n+1 \]
And clearly, \(3n+1\) is \(O(n)\)
3.4 A Harder Example
Before you saw an iterative version of binary search. Here’s a slightly different recursive version, but the principle is the same. Let’s now use recurance relations to identify the Big-O.
boolean binarySearch(int[] A, int toFind) {
return binarySearch(A, 0, A.length, toFind);
}
boolean binarySearch(int[] A, int beg, int end, int toFind) {
if(end-beg == 1)
return toFind==A[beg];
int mid = (beg + end) / 2;
if (toFind < A[mid]) return binarySearch(A, beg, mid, toFind);
else return binarySearch(A, mid, end, toFind);
}
Let’s say that \(n\) is defined as the distances between beg
and end
, so the base case is when \(n == 1\). There are two recursive cases we might use, but we’ll only use one of them, and they analyze identically. So if we count steps
\[T(n) = \begin{cases} 7,& n = 1 \\ 6 + T\left(\tfrac{n}{2}\right),& n > 1 \end{cases}\]
(*If you disagree or come up with different values for 6 and 7, remember that those constants do not actually matter at all for the big-O, as long as they’re constant! If you were solving this for a HW or exam and put down 3 and 9 instead of 6 and 7, no problem!)
So let’s figure out the Big-O. Step 2 is to get the pattern for the \(i\)th step:
\[\begin{align*} T(n) &= 6 + T\left(\frac{n}{2}\right) \\ &= 6 + \underbrace{6 + T\left(\frac{n}{4}\right)}_{T\left(\frac{n}{2}\right)} \\ &= 6 + 6 + \underbrace{6 + T\left(\frac{n}{8}\right)}_{T\left(\frac{n}{4}\right)} \\ &= 6i + T\left(\frac{n}{2^i}\right)\\ \end{align*}\]
The next step is to determine how big \(i\) has to be to hit the base case:
\[\begin{align*} \frac{n}{2^i} & = 1 \\ 2^i & = n \\ i & = \log_2 n \end{align*}\]
Finally, we substitute our discovered value of \(i\) into the formula.
\[ T(n) = 6(\log_2 n) + T\left(\frac{n}{2^{\log_2 n}}\right) \\ \]
Notice that \(2^{\log_2 n}\) is \(n\) by the definition of logarithms, so we can now solve to get the non-recursive total cost:
\[\begin{align*} T(n) & =6(\log_2 n) + T\left(\frac{n}{2^{\log_2 n}}\right) \\ &=6\lg n + T(n/n) \\ &= 6\lg n + T(1) \\ &= 6\lg n + 7 \end{align*}\]
which of course is \(O(\log n)\) as we expected.
(* From now on, I’m going to write \(\lg n\) instead of \(\log_2 n\), because I’m lazy. You should feel free to do the same.)
Hopefully, before the last line, it was clear to you at a minimum that this was sublinear, that is, faster than linear time as we identified before.