This is the archived website of SI 204 from the Spring 2017 semester. Feel free to browse around; you may also find more recent offerings at my teaching page.

Unit 6: Arrays

Several weeks ago, the addition of loops to our programs had a huge impact on the power we wielded as programmers. Before loops, the computer never performed more steps than the poor programmer typed in. With loops, a short, simple program could make the computer do an almost limitless amount of work!

In this unit, we are introducing another game-changing idea: arrays. Until now, a program couldn’t store — couldn’t remember — more data than the number of variables that the poor programmer typed in. Arrays shatter that limitation: an array is a single variable with which a program can store and retrieve a nearly limitless amount of data. With loops and arrays you can, very literally, harness the full potential of the computer.

(If you take our Theory of Computing class, you’ll learn how there is actually a mathematically rigorous way to prove that, once you have functions and arrays, you can essentially write any possible program.)

1 Stack-based arrays

C has a number of built-in types such as int, double, and char. These each represent a single piece of information, with a fixed number of bytes (typically 4 bytes, 8 bytes, and 1 byte respectively for int, double, and char).

Arrays are about storing a whole bunch of objects with the same type, right next to each other in memory, and using just a single variable to refer to all of them.

The first kind of arrays we’ll learn about are stack-based arrays, which are declared with a line like

int myarrayname[14];

which declares an array called myarrayname that contains 14 ints.

To access each element, you write something like

printf("Here it is: %i\n", myarrayname[3]);

The myarrayname[3] says “give me the int from myarrayname at index 3”. Importantly, array indexes start from zero, so index 3 is actually the fourth element from the beginning of the array. You can use these as l-values or r-values; in other words, you can assign an array element with a line such as

myarrayname[7] = 7128;

1.1 Motivating example: The Price is Right

For example, imagine we are implementing the first part of the TV game show “The Price is Right”. In this, 4 contestants each guess the value of some item like a refrigerator, and then the true price is revealed, and the closest guess without going over is the winner.

Here’s how you might program that logic:

printf("Enter 4 guesses: ");
fflush(stdout);
double guess1;
double guess2;
double guess3;
double guess4;
scanf(" %lg %lg %lg %lg", &guess1, &guess2, &guess3, &guess4);

printf("True price: ");
fflush(stdout);
double actual;
scanf(" %lg", &actual);

double diff1 = actual - guess1;
double diff2 = actual - guess2;
double diff3 = actual - guess3;
double diff4 = actual - guess4;

// a guess that went over has a negative diff and must be skipped
if (diff1 >= 0 && (diff2 < 0 || diff1 <= diff2)
    && (diff3 < 0 || diff1 <= diff3) && (diff4 < 0 || diff1 <= diff4)) {
  printf("Contestant 1 wins\n");
} else if (diff2 >= 0 && (diff3 < 0 || diff2 <= diff3)
           && (diff4 < 0 || diff2 <= diff4)) {
  printf("Contestant 2 wins\n");
} else if (diff3 >= 0 && (diff4 < 0 || diff3 <= diff4)) {
  printf("Contestant 3 wins\n");
} else if (diff4 >= 0) {
  printf("Contestant 4 wins\n");
} else {
  printf("Everyone guessed over!\n");
}

I hope you’ll agree that’s just ugly. The solution would not scale or adapt well if the number of contestants changed, it requires a lot of repeated logic and typing, and there are way too many opportunities to mess something up, like mistyping a 3 for a 4 in one spot.

Now here’s that same program, using instead an array of 4 doubles to store the 4 guesses:

printf("Enter 4 guesses: ");
fflush(stdout);
double guess[4]; // single variable for all 4 guesses
scanf(" %lg %lg %lg %lg", &guess[0], &guess[1], &guess[2], &guess[3]);

printf("True price: ");
fflush(stdout);
double actual;
scanf(" %lg", &actual);

double diff[4];
diff[0] = actual - guess[0];
diff[1] = actual - guess[1];
diff[2] = actual - guess[2];
diff[3] = actual - guess[3];

// a guess that went over has a negative diff and must be skipped
if (diff[0] >= 0 && (diff[1] < 0 || diff[0] <= diff[1])
    && (diff[2] < 0 || diff[0] <= diff[2]) && (diff[3] < 0 || diff[0] <= diff[3])) {
  printf("Contestant 1 wins\n");
} else if (diff[1] >= 0 && (diff[2] < 0 || diff[1] <= diff[2])
           && (diff[3] < 0 || diff[1] <= diff[3])) {
  printf("Contestant 2 wins\n");
} else if (diff[2] >= 0 && (diff[3] < 0 || diff[2] <= diff[3])) {
  printf("Contestant 3 wins\n");
} else if (diff[3] >= 0) {
  printf("Contestant 4 wins\n");
} else {
  printf("Everyone guessed over!\n");
}

Before we critique this version, let’s observe what we see:

  • Two array declarations double guess[4] and double diff[4]. They are both size-4 arrays of doubles.
  • In a size-4 array, the valid array indexes are 0 up to 3.
  • You can take the address of an array element, like &guess[0] in the scanf above. This works the way it should because the index operator [] has higher precedence than the address-of operator &.

So great, the above code uses two arrays guess and diff, rather than declaring individual variables. That saves a little bit, but doesn’t really simplify the code at all or make it any less likely to be buggy.

For that, we have to harness the real power of arrays: the index can be a variable! This means that we can write loops (or functions) to go through the elements of an array, and write a vastly superior version:

int contestants = 4;
printf("Enter %i guesses: ", contestants);
fflush(stdout);

// read in guesses
double guess[contestants];
for (int i=0; i < contestants; ++i) {
  scanf(" %lg", &guess[i]);
}

printf("True price: ");
fflush(stdout);
double actual;
scanf(" %lg", &actual);

// find the closest guess
double closest = -1;
int winner = -1;
for (int i=0; i < contestants; ++i) {
  if (guess[i] <= actual && guess[i] > closest) {
    closest = guess[i];
    winner = i + 1; // add one because indexes start at 0
  }
}

// print who won
if (winner < 0) {
  printf("Everyone guessed over!\n");
} else {
  printf("Contestant %i wins\n", winner);
}

Make sure you take a moment to understand this solution! Why is it better? For one, it’s easily scalable and adaptable: changing the number of contestants just means changing the variable declaration on the first line. But beyond that, this solution is also much easier to debug and less likely to contain hidden errors, because it has just simple loops rather than complicated if/else cases.

Here is a complete working program if you’d like to play around with it.
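The winner-finding loop can also be packaged as a function, which makes it easy to try on different sets of guesses. The sketch below is only an illustration: the name find_winner and its parameters are made up for this example, and it takes the array as a pointer plus a length, a pattern covered later in these notes.

```c
// Hypothetical helper, not from the original program: returns the
// 1-based number of the winning contestant, or -1 if everyone guessed over.
int find_winner(double* guess, int contestants, double actual) {
  double closest = -1;  // best valid guess seen so far
  int winner = -1;      // -1 means "no valid guess yet"
  for (int i = 0; i < contestants; ++i) {
    if (guess[i] <= actual && guess[i] > closest) {
      closest = guess[i];
      winner = i + 1;   // add one because indexes start at 0
    }
  }
  return winner;
}
```

With guesses of 90, 110, 80, and 70 and a true price of 100, find_winner reports contestant 1: the guess of 110 went over, so it is skipped.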

1.2 Declaring and initializing at the same time

By default, when you just declare an array by itself, the data inside that array is uninitialized, just like declaring (but not assigning) a variable:

int x;
printf("%i\n", x); // x is uninitialized; could be anything!
int a[10];
printf("%i\n", a[3]); // a[3] is uninitialized too

Declaring an array like this is perfectly fine as long as you know that you will assign each data element of the array before you use it, so the problem in the code above is really in the printfs and not in the declarations.

But there is also a syntax in C to declare and assign an array all in one step:

int a[3] = {10, 20, 30};
printf("%i\n", a[0]); // prints 10
printf("%i\n", a[1]); // prints 20
printf("%i\n", a[2]); // prints 30

As you can see in the example above, to declare and assign an array all at once, you enclose the initial values in curly braces, separated by commas. That’s called an initializer list.

Interestingly, the number of elements in the initializer list does not have to match up exactly with the size of the array; if it’s smaller, any extra array elements are assigned to zero. For example:

double a[4] = {3,4,5,6};
printf("%g %g\n", a[0], a[3]); // prints 3 6
double b[4] = {3};
printf("%g %g\n", b[0], b[3]); // prints 3 0
double c[4];
c[0] = 3;
printf("%g %g\n", c[0], c[3]); // prints 3 and then ANYTHING
// c[3] is uninitialized!

Unfortunately, this kind of assignment only works with stack-based arrays (not with heap-based arrays which you’ll read about later), and it only works when you declare the array. If you want to change what’s in the array later, you have to just write multiple assignments for the individual elements.
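To make that last point concrete, here is a small sketch of changing an array after it has been declared, along with a check of the zero-filling rule. The function names refill_demo and count_zeros are invented for this illustration.

```c
// Illustrative only: these function names are not from the notes.

// Declares an array with an initializer list, then replaces its
// contents one element at a time, and returns the sum of the new values.
double refill_demo(void) {
  double a[4] = {3, 4, 5, 6};  // initializer list: allowed only at declaration
  // a = {1, 2, 3, 4};         // would NOT compile: no list assignment later
  for (int i = 0; i < 4; ++i) {
    a[i] = i + 1;              // assign the elements individually instead
  }
  return a[0] + a[1] + a[2] + a[3];
}

// Counts how many elements were zero-filled by a short initializer list.
int count_zeros(void) {
  int b[5] = {7};  // b[0] is 7; b[1] through b[4] are set to zero
  int count = 0;
  for (int i = 0; i < 5; ++i) {
    if (b[i] == 0) {
      ++count;
    }
  }
  return count;
}
```

Here refill_demo returns 10 (that is, 1 + 2 + 3 + 4) and count_zeros returns 4.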

1.3 How it works: pointers and memory

Understanding what’s actually happening with arrays is crucial to being able to use them effectively, so let’s look “under the hood” at what happens in this unbelievably simple program:

int arr[10];
arr[3] = 8;
printf("%i\n", arr[3] * 2);

In the initial array declaration, the program gets a “block” of 10 ints, all one after another in memory, alongside any other local variables in the same scope. Remember that each int is 4 bytes, so this array actually takes up 40 bytes total.

So what exactly is arr itself? We know that arr is an array of 10 ints, but what does that really mean? The most important thing to remember is this: an array is represented by a pointer to its first element. There is some distinction between an array of ints and a pointer to an int, but for most purposes you can think of an array as actually being a pointer.

What this means first of all is that dereferencing an array variable will give you the first element in the array, i.e., the element at index zero:

int arr[10];
arr[0] = 3;
printf("%i\n", arr[0]); // prints 3
printf("%i\n", *arr); // prints 3 again!

To be clear, it’s better and clearer to write arr[0], but this tells you something about how arrays really work.

You can also use pointer arithmetic to access other elements in the array. The rule is, if you add to or subtract from a pointer, it moves forward or backward the corresponding number of spots in the array, and returns that corresponding pointer. So, for example, arr + 5 is exactly the same as &arr[5], a pointer to the 6th element in the array.

In fact, array indexing is just a shortcut for pointer arithmetic and dereferencing:

int arr[10];
arr[5] = 123;     // this is how you normally do it
*(arr + 5) = 123; // equivalent to the previous, but uglier

When you think of indexing this way, it also explains why array indexes start at 0.
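Here is a small sketch that checks these equivalences directly. The function names are made up for this example; each one returns 1 when the claim holds.

```c
// Illustrative only: these function names are not from the notes.

// Three ways of reaching index 5 all land on the same int.
int same_spot(void) {
  int arr[10];
  arr[5] = 123;
  int* p = arr + 5;  // pointer arithmetic: 5 spots past the start
  return p == &arr[5] && *p == 123 && *(arr + 5) == arr[5];
}

// Index 0 means "zero spots past the start", i.e. the start itself.
int index_zero_is_start(void) {
  int arr[10];
  arr[0] = 3;
  return &arr[0] == arr && *(arr + 0) == *arr;
}
```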

http://xkcd.com/163/

2 Strings are arrays

Recall that we’ve been using the following typedef so far for our cstring type:

typedef char cstring[128];

That means that a cstring declaration like

cstring mystr;

is exactly equivalent to

char mystr[128];

And you now know that this means strings are really just arrays of chars of length 128. Why 128? No really good reason, just that everything we’ve wanted to do with strings so far didn’t require them to be much longer than, say, a single line of text.

2.1 A Simple Problem Made Difficult

Let’s start with a simple problem: Write a program that reads firstname lastname, and prints out lastname, firstname. For example, a typical run of the program might look like:

roche@ubuntu$ ./prog
Mickey Mouse
Mouse, Mickey

This should be no problem; something like this will do:

char first[128];
char last[128];
scanf(" %s %s", first, last);
printf("%s, %s\n", last, first);

Notice that we used the actual type for our strings first and last so there would be no need for the cstring typedef here.

Now let’s make things more difficult: Suppose I also want to capitalize all the letters in the names. No matter how hard you work with what we’ve learned in C before this unit, there’s no way to write this program! You could capitalize, you could switch first and last names, you just couldn’t do both together. The problem is that you need to access the characters within the strings first and last, and you need to know how many characters are in the strings.

The “how many characters are in the strings” part is easy: we already know that there’s a built-in function strlen that’s part of the <string.h> library. So in the previous example, calling strlen(first) would return the number of letters in "Mickey", which is 6.

As you might suspect, it is also possible to reference characters within a string - not by specific names, but by indices. So the initial character of the string first, for example, has index 0. To reference it, say for printing, you write first[0], which we usually read as “first sub zero”. Characters within a string are thus indexed from zero up to one less than the length of the string, like this:

--- --- --- --- --- ---
 M   i   c   k   e   y
 0   1   2   3   4   5
--- --- --- --- --- ---

Using indices, we can make sure that every character within the string first is printed in capitals:

for (int i=0; i < strlen(first); ++i) {
  if ('a' <= first[i] && first[i] <= 'z') {
    printf("%c", first[i] - ('a' - 'A'));
  } else {
    printf("%c", first[i]);
  }
}

In fact, it might be kind of nice to make a function to do the printing in capitals for us, and produce a program like this.
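One possible shape for such a function is sketched below. The names to_cap and print_caps are assumptions made for this example, and the char* parameter type is explained later in these notes.

```c
#include <stdio.h>
#include <string.h>

// Hypothetical helpers, named here only for illustration.

// Returns the capital version of c if it is a lowercase letter,
// and returns c unchanged otherwise.
char to_cap(char c) {
  if ('a' <= c && c <= 'z') {
    return c - ('a' - 'A');
  }
  return c;
}

// Prints every character of str in capitals.
void print_caps(char* str) {
  int len = strlen(str);
  for (int i = 0; i < len; ++i) {
    printf("%c", to_cap(str[i]));
  }
}
```

After reading first and last as before, calling print_caps(last), printing ", " and then calling print_caps(first) would produce MOUSE, MICKEY for the sample run above.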

2.2 Length and null bytes

There’s something surprising in the example above: the string "Mickey" has length 6, but you can see that we declared the array first to have size 128. So which is it?

The answer is both! The size of the char array is 128, but it is able to store any string with length from 0 up to 127. The trick is that a special character '\0' is used to mark the end of the string. This is called a “null” character or “null byte”, and it has ASCII value 0.

That means that the actual way "Mickey" is stored in the example above is the six letters of the string, followed by a null byte:

--- --- --- --- --- --- ---- --- --- ---
 M   i   c   k   e   y   \0   ?   ?  ...
 0   1   2   3   4   5    6   7   8  ...
--- --- --- --- --- --- ---- --- --- ---

So in any string, the length of the string is really the location of the first null byte. Based on this, we can imagine that the strlen function (which is built-in to the <string.h> standard library) might be written like:

int strlen(char* str) {
  int i=0;
  while (str[i] != '\0') {
    ++i;
  }
  return i;
}

We’ll get more into the syntax of this function definition later. For now, just notice that the while loop is searching for the first null byte, and then it returns that position as the length of the string.

Take a moment to think about what would happen in that loop if a string did not contain any null byte. The while loop would just keep going past the array into some other arbitrary memory, possibly causing a dreaded segfault! So remember kids, always put null bytes at the end of your strings or else the program has no way of knowing where the string ends and some other data begins.
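The role of the null byte can be seen in a small experiment. The function below is a made-up illustration (cut_length is not a library function): it stores "Mickey" in a 128-char array, writes a null byte at a chosen index, and reports what strlen then sees.

```c
#include <string.h>

// Hypothetical helper for illustration. `where` must be between 0 and 6.
int cut_length(int where) {
  char name[128];
  strcpy(name, "Mickey");  // stores 6 letters followed by a null byte
  name[where] = '\0';      // place a null byte earlier in the array
  return strlen(name);     // length = position of the FIRST null byte
}
```

Calling cut_length(3) returns 3: as far as strlen is concerned the string is now "Mic", even though the letters k, e, y are still sitting in the array.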

2.3 Declaring strings

Since strings are arrays, we can declare and assign them at the same time, just as you learned above for any other stack-based array. Conveniently, C also allows you to declare and assign a char array using the double-quote syntax that you’re used to.

Along with the strcpy function provided by <string.h> that you know about already, we really have four different ways to assign strings. Here’s a demonstration of all four.

One character at a time:

char s[128];
s[0] = 'c';
s[1] = 'o';
s[2] = 'o';
s[3] = 'l';
s[4] = '\0';
printf("%s\n", s);

Using regular array syntax:

char s[128] = {'c', 'o', 'o', 'l', '\0'};
printf("%s\n", s);

Using double-quote string syntax (the most convenient!!):

char s[128] = "cool";
printf("%s\n", s);

Using strcpy:

char s[128];
strcpy(s, "cool");
printf("%s\n", s);
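As a quick check that the four approaches really do produce the same string, here is a sketch that builds "cool" each way and compares the results with strcmp from <string.h>. The function name all_four_match is invented for this example.

```c
#include <string.h>

// Illustrative only. Returns 1 if all four strings are equal, 0 otherwise.
int all_four_match(void) {
  char a[128];  // one character at a time
  a[0] = 'c';
  a[1] = 'o';
  a[2] = 'o';
  a[3] = 'l';
  a[4] = '\0';

  char b[128] = {'c', 'o', 'o', 'l', '\0'};  // regular array syntax
  char c[128] = "cool";                      // double-quote syntax
  char d[128];                               // strcpy
  strcpy(d, "cool");

  // strcmp returns 0 when two strings are identical
  return strcmp(a, b) == 0 && strcmp(b, c) == 0 && strcmp(c, d) == 0;
}
```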

3 Heap-based arrays

As you’ll learn in much more detail if you take later courses such as Systems Programming, the memory assigned to a running program is (mostly) divided into two parts: the stack and the heap. The stack is much smaller (typically just a few megabytes), and it’s where all regular variables, including the stack-based arrays you’ve just learned about, are stored.

But this limitation in the size of the stack means that it’s frequently necessary to declare arrays in the heap space of your program. This has much greater power and flexibility, but (as you might expect) also entails more responsibility from the programmer.

3.1 Allocation

To allocate a heap-based array, you will use the calloc function, which is a built-in function from the <stdlib.h> library.

The way calloc works is that you give it two integers: the number of elements in your desired array, and the size (in bytes) of each array element. The calloc function then finds that amount of space in heap memory, clears it all to zeros, and returns a pointer to the beginning of the array. So you can use it like so:

int* arr = calloc(10, sizeof(int));
// now arr is an int array of size 10
arr[3] = 13; // works
printf("%i\n", arr[1]); // prints 0, since calloc clears the memory
printf("%i\n", arr[3]); // prints 13 of course!

Something new you should notice in the code above is the sizeof operator. Yes, even though this looks like a function, it’s actually an operator built in to the C language. You give it the name of a type, and it returns the number of bytes that an object of that type takes up.

So, for example, sizeof(int) is going to equal 4 on nearly any modern computer, and in fact the code above could have just used 4 in place of sizeof(int). But using sizeof is much better: it makes it more clear what that value means, and if you found yourself on an unusual computer with different-size integers, your code using sizeof(int) would still compile and run with no modifications.

(Note, calloc also has an older brother named malloc that does the same thing, but with only a single argument for the number of bytes, and without clearing the memory to all zeros. This makes malloc a little bit faster, but also more error prone, so I recommend you use calloc unless you have a good reason not to.)

3.2 Deallocation

The most important difference between stack-based arrays and heap-based arrays is that of scope. Stack-based arrays go away when the scope they were declared in ends, just like normal variables that you are used to.

But heap-based arrays are different; they live on forever (or at least until your program terminates). This can be a useful feature at times, but can also lead to memory leaks. That’s where your program allocates more and more memory, without ever giving it back, so that after running your program for a long time it eats up all the memory in your computer and crashes.

Fortunately, we can give calloc-allocated arrays back to the operating system for recycling using the free function. You just pass free the pointer to the beginning of the array you want to deallocate, like so:

int len;
printf("How long is your name? ");
fflush(stdout);
scanf(" %i", &len);

// note: len+1 to account for the null byte!
char* name = calloc(len+1, sizeof(char));
printf("Enter your name: ");
fflush(stdout);
scanf(" %s", name);

// now you might do something with the name...

// deallocate the space for name
free(name);

Think of calloc and free the same as fopen and fclose: every call to calloc to create an array should have a corresponding call to free that happens later to deallocate it.
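A minimal sketch of the calloc/free pairing inside a single function follows. The function average_up_to is hypothetical, and the check for a NULL return (which calloc gives when it cannot allocate the memory) is an extra precaution not shown in the examples above.

```c
#include <stdlib.h>

// Hypothetical helper: stores 1, 2, ..., n in a heap-based array and
// returns their average. Returns -1 if n is not positive or the
// allocation fails.
double average_up_to(int n) {
  if (n <= 0) {
    return -1;
  }
  int* arr = calloc(n, sizeof(int));
  if (arr == NULL) {
    return -1;  // calloc could not allocate the memory
  }

  int sum = 0;
  for (int i = 0; i < n; ++i) {
    arr[i] = i + 1;
    sum += arr[i];
  }

  free(arr);  // matching free for the calloc above
  return (double)sum / n;
}
```

For example, average_up_to(4) stores 1, 2, 3, 4 and returns 2.5, and the array is already deallocated by the time the function returns.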

3.3 Growing an array

Besides living forever and potentially being larger than stack-based arrays, heap-based arrays also have the advantage that you can resize them after they are allocated.

For example, let’s say we want to read in a bunch of prices (doubles), but we don’t know how many there will be. How will you decide how large to make the array?

The answer is, you just make the array bigger when you need to, by callocing a new one, copying over the data, and then freeing the old array. Here’s a complete example:

#include <stdio.h>
#include <stdlib.h>

int main() {
  int size = 5; // initial size of the array
  double* arr = calloc(size, sizeof(double));

  int num = 0; // number of prices stored in the array
  double temp;
  printf("Enter prices, followed by a negative number:\n");

  scanf(" %lg", &temp);
  while (temp >= 0) {
    if (num == size) {
      // array is full; must resize
      int oldsize = size;
      // increase the size
      size += 5;
      // declare new array with that size
      double* newarr = calloc(size, sizeof(double));
      // copy the old data to the new array
      for (int i=0; i<oldsize; ++i) {
        newarr[i] = arr[i];
      }
      // deallocate the old array
      free(arr);
      // copy the new array pointer
      arr = newarr;
      printf("(array size increased from %i to %i)\n", oldsize, size);
    }

    // copy the price that was read to the next slot in the array
    arr[num] = temp;
    ++num;

    // read the next value
    scanf(" %lg", &temp);
  }

  printf("You entered %i prices.\n", num);
  printf("The middle price was %g.\n", arr[num/2]);

  free(arr);
  return 0;
}

Since this kind of “grow an array” operation is so common, the <stdlib.h> library also provides a convenient realloc function that does the reallocation and copying all for you. The arguments to realloc are the old pointer and the new total size in bytes, and the return is the pointer to the newly-increased array.

So in the previous example, all of this:

int oldsize = size;
// increase the size
size += 5;
// declare new array with that size
double* newarr = calloc(size, sizeof(double));
// copy the old data to the new array
for (int i=0; i<oldsize; ++i) {
  newarr[i] = arr[i];
}
// deallocate the old array
free(arr);
// copy the new array pointer
arr = newarr;

could be replaced by just

size += 5;
arr = realloc(arr, size * sizeof(double));

Notice that (unlike with calloc) the number of elements has to be explicitly multiplied by sizeof(TYPE) to get the total size in bytes. Also unlike calloc, while the old contents are copied over to the beginning of the new array, the new part is not zeroed out when you call realloc.
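Two details are worth sketching in code: zeroing the new slots yourself, and keeping the old pointer until you know realloc succeeded (realloc returns NULL on failure and leaves the original array untouched). The helper grow below is an assumption made for this example, not part of <stdlib.h>.

```c
#include <stdlib.h>

// Hypothetical helper: resizes arr from oldsize to newsize doubles
// (newsize must be positive) and zeroes any newly added slots.
// Returns the new pointer, or NULL if realloc failed (in which case
// arr is still valid and still needs to be freed by the caller).
double* grow(double* arr, int oldsize, int newsize) {
  double* bigger = realloc(arr, newsize * sizeof(double));
  if (bigger == NULL) {
    return NULL;
  }
  for (int i = oldsize; i < newsize; ++i) {
    bigger[i] = 0;  // realloc does not clear the new part
  }
  return bigger;
}
```

In the resizing example, this could be called as double* bigger = grow(arr, size, size + 5); and, if bigger is not NULL, followed by arr = bigger; and size += 5;.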

4 Arrays & Functions

There are many situations in which it’s natural to ask for functions that have arrays as arguments or that return arrays … or both! Technically speaking, there’s nothing really new to teach here, i.e., there’s no special syntax or information that you don’t already know. Arrays are pointers, so functions that deal with arrays take and/or return pointers.

But actually writing and using functions with arrays can be tricky, so we’ll go through some examples to see how it works.

4.1 Arrays as arguments to functions

Suppose we have an array of game scores, each being either a positive number for how many points you won by, or a negative number for how many you lost by. For example, the Philadelphia Eagles’ 2016 season might be stored as

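int games = 16;
// the size must be a constant here: an array whose size is a
// variable cannot have an initializer list
int eagles[16] = {19, 15, 31, -1, -7, 11, -6, -5,
                  9, -11, -14, -18, -5, -1, 5, 14};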

(Not their best season.)

Now we may want to compute some information, such as the number of wins they had in the season. For that, we can write a function countpos that takes in an array and returns the number of positive integers in the array.

What should the prototype for countpos be? You might be tempted to say

int countpos(int scores[16]);

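and that would compile, but it is misleading. When an array type appears as a function parameter, the compiler treats it as a plain pointer and ignores the 16, so nothing stops a caller from passing an array of some other length, and the function itself has no reliable way of knowing how many elements it was given.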

So instead, we would rather pass in a pointer (to the first element in the array), as well as the length of the array as a second argument. It’s easy to forget this, so I’ll say it again: you usually need to pass the array as well as its length to a function, since you can’t tell the length of an array based on its pointer value.
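The reason can be seen with sizeof, which also accepts a variable or expression in place of a type name. In the sketch below (function names invented for illustration), sizeof sees the whole array where it was declared, but only a pointer inside a function that received it.

```c
#include <stddef.h>  // for size_t, the unsigned integer type sizeof produces

// Illustrative only: these function names are not from the notes.

// Inside a function, `scores` is just a pointer, so sizeof reports the
// size of a pointer, no matter how long the array really is.
size_t bytes_seen_inside(int* scores) {
  return sizeof(scores);
}

// Where the array is declared, sizeof reports the size of all 16 ints.
size_t bytes_seen_at_declaration(void) {
  int scores[16];
  return sizeof(scores);
}
```

On a typical 64-bit machine the first function returns 8 and the second returns 64, though the exact numbers depend on the platform.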

Once we know what the prototype should be, writing the actual function is similar to what we’ve already been doing:

int countpos(int* scores, int length) {
  int count = 0;
  for (int i=0; i < length; ++i) {
    if (scores[i] > 0) {
      ++count;
    }
  }
  return count;
}
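As a usage sketch, the function below stores the scores from the example and calls countpos on all or part of the array. The name eagles_wins is made up here, and countpos is repeated from above so that the block stands on its own.

```c
// countpos, repeated from the notes above.
int countpos(int* scores, int length) {
  int count = 0;
  for (int i = 0; i < length; ++i) {
    if (scores[i] > 0) {
      ++count;
    }
  }
  return count;
}

// Hypothetical helper: counts wins among `how_many` games,
// starting at game index `first_game` (0-based).
int eagles_wins(int first_game, int how_many) {
  int eagles[16] = {19, 15, 31, -1, -7, 11, -6, -5,
                    9, -11, -14, -18, -5, -1, 5, 14};
  // eagles + first_game points at the first game we care about
  return countpos(eagles + first_game, how_many);
}
```

Here eagles_wins(0, 16) counts the whole season and returns 7, while eagles_wins(8, 8) looks only at the last 8 games and returns 3. Because countpos takes a pointer and a length, the same function handles both cases.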

(Challenge: How would you write this function using recursion?)

4.2 Modifying arrays in functions

Remember from the previous unit that, if you want a function to modify a normal variable, you have to pass a pointer to that variable, since otherwise the function would in fact be modifying a copy.

What about when you want a function to modify the elements of an array? You still just pass a regular pointer! Since we pass arrays to functions by passing a pointer, only the pointer is copied; it still points to the original array. Therefore modifying the array elements in the function will actually modify the original array that was passed in.

For example, here is a function that replaces the contents of a string (that is, an array of chars) with k copies of the character c, followed by a null byte of course:

void kcopies(char* str, int k, char c) {
  for (int i=0; i < k; ++i) {
    str[i] = c;
  }
  str[k] = '\0';
}

You could use this in a main like:

int main() {
  char line[128];

  kcopies(line, 10, '*');
  printf("%s\n", line); // prints 10 *s

  kcopies(line, 7, '!');
  printf("%s\n", line); // prints 7 !s

  return 0;
}

Really there is only a single char array in this entire program; every time kcopies is called it gets a pointer to the same spot in memory.

4.3 Creating arrays in functions

Make sure you understand how the parameter passing works when we pass an array to a function, as in the examples above. The function gets a copy of the pointer to the original array, and so is able to modify the original contents.

But what about when we want to create an entirely new array and return it from a function? The typical thing to do here is have the function return a pointer to the new array.

For example, here’s a function that takes a string and makes a sentence out of it by capitalizing the first letter and adding an exclamation point at the end:

char* makesentence(char* original) {
  // first compute the new char array's length
  // add 1 for the exclamation point and 1 for the null byte
  int reslength = strlen(original) + 2;

  // allocate space for the new string
  char* result = calloc(reslength, sizeof(char));

  // capitalize the first letter
  result[0] = original[0] - ('a' - 'A');

  // copy the remaining letters from the original string
  int i = 1;
  while (original[i] != '\0') {
    result[i] = original[i];
    ++i;
  }

  // add the exclamation point and null byte
  result[i] = '!';
  result[i+1] = '\0';

  // return the new string
  return result;
}

That could be used in a main like this:

int main() {
  char word[128];
  printf("Enter a lowercase word: ");
  fflush(stdout);
  scanf(" %s", word);

  char* cap = makesentence(word);
  printf("%s\n", cap);

  free(cap);
  return 0;
}

Notice that the array must be a heap-based array declared with calloc or malloc! Otherwise, it would go out of scope when the function returned, which is not what we want here. The flip side is that the allocated array must be freed later on by whoever called the function — you can see this at the end of the main above.

4.4 Common functions on arrays

Some kinds of functions are so common that we’ll see the same pattern show up over and over again, so let’s take a few moments to look at these common array problems and their solutions.

For all these, you can start with the following program:

#include <stdio.h>
#include <stdlib.h>

// FUNCTION PROTOTYPE(S) HERE

int main() {
  // read size n
  int n;
  do {
    printf("Enter size: ");
    fflush(stdout);
  } while(scanf(" %i", &n) != 1 || n <= 0);

  // allocate array and read in contents
  int* data = calloc(n, sizeof(int));
  printf("Enter %i integers, space separated.\n", n);
  for (int i=0; i<n; ++i) {
    scanf(" %i", &data[i]);
  }

  // CALL YOUR FUNCTION(S) HERE
  // AND PRINT OUT THE RESULTS

  // clean-up time
  free(data);
  return 0;
}

// FUNCTION DEFINITION(S) HERE

You come up with the prototypes, the definitions, and how to call them in your main to get the problem solved!

  1. Compute the average of the \(n\) numbers, which is their sum divided by \(n\). Be sure to do double division and not integer division!

    Sample solution.

  2. Find the minimum and maximum values in the array.

    Note, to do this with a single function call, it’s a little challenging because you need to return two numbers from the function. Look back to the notes from last unit if you don’t remember how to do this. (Hint: pointers!)

    Sample solution.

  3. Print out the “partial differences” of consecutive elements in the array. Here’s an example of a sequence of four numbers along with the sequence of partial differences (note that there are only three partial differences).

    3   7   8   5
     \_/ \_/ \_/
      4   1  -3

    Sample solution.

  4. Create (and return a pointer to) a new array containing only the positive values in the given array. Then print out those positive values.

    Note, the array should be created to have exactly the correct size, and don’t forget to free it in your main to avoid a memory leak!

    Sample solution.

5 Nested arrays

5.1 Arrays of any type

We can store any type of object we like in an array. We’ve seen ints, doubles, and chars so far, but it really could be any type.

For example, imagine a situation where we’d like a program that reads a list of file names from the user, creates those files, and then takes a sentence typed in and writes it to each of those files. Can we do this?

What we’ll need is an array of FILE* objects. The type of this array would be a “pointer to a FILE*”, which would be written as FILE** (notice the two asterisks).

Remember, we create an array of type T and size N like this:

T* array = calloc(N, sizeof(T));

So we can make an array of FILE* objects like:

FILE** array = calloc(N, sizeof(FILE*));

Notice that we used sizeof(FILE*) in creating the array of FILE* objects. That is going to be the size of a pointer, which in the modern age means 8 bytes (64 bits).

With that in mind, here’s how the program might look:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main() {
  // get number of files
  int numfiles;
  printf("How many files? ");
  fflush(stdout);
  scanf(" %i", &numfiles);

  // create array of FILE* objects
  FILE** files = calloc(numfiles, sizeof(FILE*));

  // read in filenames and open files
  printf("Enter %i filenames to write to.\n", numfiles);
  for (int i=0; i < numfiles; ++i) {
    char filename[128];
    scanf(" %s", filename);
    files[i] = fopen(filename, "w");
  }

  // read in words and write them to each file
  printf("Enter words to write, followed by a semicolon ';'\n");
  char word[128];
  scanf(" %s", word);
  while (strcmp(word, ";") != 0) {
    // loop through each file and write the next word on a line
    for (int i=0; i < numfiles; ++i) {
      fprintf(files[i], "%s\n", word);
    }
    // read the next word
    scanf(" %s", word);
  }

  // close all the files
  for (int i=0; i < numfiles; ++i) {
    fclose(files[i]);
  }
  // deallocate the array
  free(files);

  return 0;
}

Another important thing to notice is how the “clean-up” is done at the end. We have to have a loop to call fclose on each file first, and then afterwards we can de-allocate the array of FILE*s. (If you called free to deallocate the array first, the information needed for all the fcloses would be lost!)

5.2 Multi-dimensional Arrays

Since we can have arrays of any type of object, why not an array of arrays? For example, suppose we have a class of students whose grades are stored in a file like this. I might want to read in the data and store it, so that I can then answer questions for the user - questions like how did student 6 do on homework 3?

In this case, I’d clearly like an array of 8 objects, each one representing all 10 homework grades for that student. But what type of object can I use to store the 10 homework grades that correspond to a given student? An array of 10 ints, of course! So, each object in my array of students is itself an array of ints, i.e. each object is an int*.

This is called a two-dimensional array because it has two levels of arrays:

  • The outer array has type int** and has size 8 (for the number of students). Each element of the outer array is…
  • An inner array with type int* and size 10, containing the grades for a single student.

Notice that there will be just 1 outer array, but multiple inner arrays. We can allocate space for the outer array with a line like

int** grades = calloc(8, sizeof(int*));

and then each inner array must be separately allocated in a loop like

for (int i=0; i < 8; ++i) {
  grades[i] = calloc(10, sizeof(int));
}

Notice that the 2D array grades has type int**, so each grades[i] has type int*, which helps explain how the loop of inner array allocations works.

Once the 2D array has been created, accessing an element on row i and column j can be done with a double indexing like

grades[i][j] = 95;

Remember, this is really just a bunch of arrays stuffed inside one outer array, which explains how the double indexing works: grades has type int**, the outer array, so grades[i] has type int*, a single inner array, and grades[i][j], or equivalently (grades[i])[j], has type int and represents a single grade entry.

Here’s a working solution.

As we’ve mentioned before, arrays allocated with calloc live on until the end of your program, or until they are deallocated with the free function. With multi-dimensional arrays, you have to remember to free each array you created. That means, if we refer back to the grades problem from above, that all of the arrays pointed to by the elements of the array grades must be freed before we can free the array grades itself.

for (int i=0; i < 8; ++i) {
  free(grades[i]);
}
free(grades);
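Putting allocation, double indexing, and clean-up together, here is a minimal sketch. The helper names make_grid and free_grid are assumptions for this example, and error checking of calloc is omitted to keep it short.

```c
#include <stdlib.h>

// Hypothetical helper: allocates a rows-by-cols 2D array of ints,
// with every entry starting at zero.
int** make_grid(int rows, int cols) {
  int** grid = calloc(rows, sizeof(int*));  // the one outer array
  for (int i = 0; i < rows; ++i) {
    grid[i] = calloc(cols, sizeof(int));    // one inner array per row
  }
  return grid;
}

// Hypothetical helper: frees every inner array, then the outer array.
void free_grid(int** grid, int rows) {
  for (int i = 0; i < rows; ++i) {
    free(grid[i]);
  }
  free(grid);
}
```

For the grades example, int** grades = make_grid(8, 10); creates the array, grades[6][3] = 95; records a grade, and free_grid(grades, 8); cleans up at the end.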

6 Problems

  1. histograms (arrays of counters)

  2. masks (i.e. an array of true/false), for example to implement a simple hangman game.

  3. standard deviation. The sample file numbers.txt has a standard deviation of 14.5756.

  4. Print in reverse (non-recursive)

  5. Print in reverse (recursive!)

  6. palindromes

  7. A predicate that takes a string s and a char c, and tests whether or not c appears in s.

  8. Print out the index of the minimum element in the array. So, for example:

    roche@ubuntu$ ./min
    4
    34 12 8 29
    minimum element is A[2] = 8.

    solution

  9. Print elements of the array, separated by commas. So, for example:

    roche@ubuntu$ ./commasep
    4
    34 12 8 29
    A = [34, 12, 8, 29]

    solution

  10. Write a program to compute dot-products of vectors. If \[v = [a_1, a_2, \ldots, a_m],\qquad w = [b_1, b_2, \ldots, b_m]\] are two vectors of dimension \(m\), the dot product of v and w is \[v \cdot w = a_1b_1 + a_2b_2 + \cdots + a_mb_m.\] Your program will get a dimension m from the user, read in two vectors of length m, and print out their dot product. Here’s a sample run:

    roche@ubuntu$ ./dotprod
    Enter dimension: 4
    Enter two vectors:
    [3,-1,0,2]
    [5,0,9,-7]
    Dot product = 1

    Try it for yourself and then take a look at my solution.

  11. Write a program that reads a list of banned words from a file, stores them in an array, and then simply reads words from the user and returns “banned” or “not banned” until the word “end” is encountered. The file starts with a number, which is the number of banned words, and then the words themselves are listed. The file banned.txt is a good example. Here’s my solution.

  12. Picking random teams. The file names.txt contains a list of names (the number of names is given on the first line of the file). Read those names in, and get from the user a number of teams to be made from those names. The number of teams must evenly divide the number of names. The program should randomly assign names to teams, and display the result. Here is a sample run of the program.

    roche@ubuntu$ ./pickteams
    There are 24 people.
    How many teams would you like? (make it evenly divide n) 3
    Team 1: Mike Dan Chris Joni Christy Seung-Geol Cathy Susan
    Team 2: Gavin Nate Paul Adina Jeff Carl Karen Eric
    Team 3: Phong Betty Madeline Marianne Don Shirley Tim Steve

    Here is one way to solve it.

  13. tic-tac-toe