SI 204 Spring 2017 / Notes


This is the archived website of SI 204 from the Spring 2017 semester. Feel free to browse around; you may also find more recent offerings at my teaching page.

Unit 10: C++

As the final unit of the semester, we will learn how to translate the concepts we learned in C to a new (but closely related) programming language called C++. Seeing some different ways to code and exposure to different languages is important to being a well-rounded programmer, and this unit will also serve as a good review of all the concepts we have learned this semester.

1 C++ Basics

1.1 History

In the late 1970s, a new and exciting idea called object-oriented programming was making its way through the world of computer science. You know that a struct in C lets you put together a bunch of variables and combine them in a nice little package. Well, the main idea of object-oriented programming is that you also put functions in with these little bundles of data, and call them objects or classes.

Bjarne Stroustrup created C++ originally as “C with classes”, to add on objects to the C language which was already pretty widely used. Eventually this grew and grew into what we now know as C++ (get the joke?). Part of the object-oriented impetus for C++ means that there is a much stronger emphasis on types, and consequently more interesting things we can do with types, in C++. The standard library for C++ is also more expansive than the one for C, meaning there are more “built-in” things that let you write bigger programs more easily.

There’s an entire class on object-oriented programming at USNA, and you’ll have plenty of time to learn more about that aspect later. For now, we’ll just see how C++ differs from C for the kinds of programs that we’ve written so far, and how the increased emphasis on types in C++ changes the way we might think about programming.

1.2 C is (mostly) a subset of C++

Originally at least, C++ was designed as a strict extension of the C language. That means that (for the most part) any C program is also a C++ program!

However, partly to protect against potential divergence between the two languages, officially C++ has different standard library header files than C. For all the C libraries you have used, you just add the letter c to the front of the name and drop the .h. So #include <stdio.h> becomes #include <cstdio>, <string.h> becomes <cstring>, and so on.

One other aspect is that C++ has namespaces to help you avoid, say, accidentally naming one of your variables with the same name as a built-in function. But since our programs are still pretty small, we’ll just avoid doing that and write using namespace std; after all the #includes to indicate that we can use built-in standard functions with their usual names.

With all that, here is a totally non-exciting C++ program:

#include <cstdio>
#include <cstring>
using namespace std;

int main() {
  char name[128];
  printf("What is your name? ");
  fflush(stdout);
  scanf(" %s", name);

  printf("Hi, %s!\n", name);

  return 0;
}

1.3 Naming and compiling C++ programs

Source code for C programs usually goes in a file with a .c extension such as myprogram.c. To help remind you (and your text editor!) that C++ programs are different, we’ll use the .cpp filename extension for C++ programs, like myprog.cpp. (Another popular choice that you might see sometimes is .cc.)

Since C++ is a different language, it also needs a different compiler. Fortunately, the same folks that make gcc also make a C++ compiler called g++ that should be installed on your VMs as well as the lab machines.

For example, if you saved the program above to a file called firstprog.cpp, you could compile and run it with:

roche@ubuntu$ g++ firstprog.cpp -o firstprog
roche@ubuntu$ ./firstprog
What is your name? Dan
Hi, Dan!

2 Bool

In C, we use 0 to mean “false” and any nonzero number to mean “true” in things like if statements and while loops. This works, but can be kind of unclear sometimes, like if a function returns an int - will that be a 0/1 int for true/false, or will it be a count or some other kind of actual number?

To help you out with this, C++ introduces a new basic type called bool that can be either true or false, and can be used for things like the conditions of if statements or while loops. bool is also what gets returned by comparison operators such as == and boolean operators such as &&.

For example:

bool stop = false;
while (! stop) {
  char word[128];
  scanf(" %s", word);
  if (strcmp(word, "END") == 0) {
    stop = true;
  }
  printf("The next word is %s\n", word);
}

Interestingly, a bool in C++ is really still stored as a 0 or a 1, and you can see that if you try to print one out:

printf("%i\n", true); // prints 1
printf("%i\n", false); // prints 0

3 Functions

Functions are probably the most important programming construct in C, and they’re very important in C++ too! As you may recall, one of the main reasons for the C++ language was to add functions to structs (as we will discuss below).

In order to make that possible, C++ lets you do a few things with functions that we couldn’t do in C. By themselves, these features are mostly about convenience and writing cooler “looking” programs. But together, they make the C++ language much more powerful due to the richer use of types.

3.1 Overloading

C won’t let you have two functions with the same name. If you try to do this:

void foo(int x) {
  printf("x is %i\n", x);
}

void foo(int x, int y) {
  printf("x is %i and y is %i\n", x, y);
}

any C compiler will give you an error message on the second function definition, telling you that foo has already been declared. You can’t have two different functions with the same name.

But in C++, you can do this! Function overloading is a feature added to C++ that lets you have multiple functions with the same name. The compiler will figure out “which one you meant” when you call an overloaded function by looking at the number and types of the arguments passed to the function.

So for the example above, if compiled in C++,

foo(13)     // will print "x is 13"
foo(14, 15) // will print "x is 14 and y is 15"

It’s important to emphasize that these functions are completely separate from each other and can do totally different and unrelated things. They just happen to share the same name.

Importantly, since the compiler uses the number and types of arguments to distinguish between different overloaded versions, every overloaded function must have a different number or types of parameters.

Even then, the compiler can still have a hard time deciding what to do. For example, here are a pair of overloaded functions to “raise” a character or an integer:

char raise(char c) {
  if (c >= 'a' && c <= 'z') {
    // convert to uppercase
    return c - ('a' - 'A');
  }
  else {
    return c;
  }
}

int raise(int c) {
  return c + 1;
}

That’s all great, and we can call raise(3) to get back the int value 4 or raise('d') to get back the char value 'D'.

But what happens if we call raise(2.5), passing a double rather than an int or a char? The compiler doesn’t know what to do here, because while it could convert that double to either a char or an int (by truncating), it’s not sure which one you want. You have to either write another overloaded version of the function that takes a double, or do the cast to int or char yourself before calling raise.

3.2 Pass by reference

Remember that one reason you might pass a pointer to a function in C is to allow that function to change the value of what you passed into it.

For example, here’s a super simple function to increase a number’s value by 3:

void add3(int* xptr) {
  *xptr = *xptr + 3;
}

That works, but can be a bit annoying because we have to use the dereference operator * all over the place in our function. Besides that, we have to remember to add the address-of operator wherever we might call this function, like:

int a = 10;
add3(&a);
add3(&a);
// now a equals 16

Since pointers can be tricky and sometimes “dangerous” to use, C++ tries to avoid some of the situations where we might have to use them. The answer in this case is to use a reference variable as the parameter to the add3 function, like so:

void add3(int& x) {
  x = x + 3;
}

Specifying a reference type is similar to specifying a pointer type, except you use an & with the type instead of a *. What that means is that the x passed to the function is a reference to whatever variable the function is called on. Changing x in the function, when x is a reference variable, also changes the original value!

int a = 10;
add3(a); // works now, no pointer needed!
add3(a);
// now a = 16

Pass by reference is also used in C++ to return multiple values from a function (by passing in multiple arguments by reference), or to avoid copying something that is large or otherwise wouldn’t make sense to copy.

4 Memory allocation

In C we used the functions calloc and free from the standard header <stdlib.h> in order to allocate and deallocate memory on the heap.

C++ builds that same functionality into the language itself with two special operators new and delete. These don’t require any special header files, and (conveniently!) new uses the type name automatically to reserve the right amount of space.

Here’s how you would use new to allocate an array of 19 ints:

int* arr = new int[19]; // allocate
delete [] arr;          // deallocate

Of course, the 19 in the program above could be replaced by any positive integer. Just like calloc, the new operator returns a pointer to the newly allocated space. (Unlike calloc, though, new does not set the ints to zero for you, so assign values before reading them.) Make sure you don’t miss the empty brackets [] that tell the delete operator that you’re deleting an entire array!

There is also a simpler syntax to allocate just one thing, like when you want to create a new node. Previously we would do something like:

node* temp = calloc(1, sizeof(node)); // old school
free(temp);

but in C++ that looks much simpler:

node* temp = new node; // allocate
delete temp;           // deallocate

Notice that you don’t use any [] brackets in the new or delete statements here, since it’s just a single item and not an array.

5 Structs and classes

5.1 Syntax

A struct definition in C++ looks exactly like it would in C:

struct length {
  int feet;
  double inches;
};

The difference in C++ is, you don’t need the typedef anymore in order to drop the keyword struct. That is, in the previous example, the name length is the type of the struct, and without any typedefs, you can do things like:

length len;
len.feet = 6;
len.inches = 0.9;

C++ also has something similar to a struct called class. This is really the same as a struct except that the fields by default in a class are “private”, meaning they can’t be accessed directly from outside the class:

class mathop {
  double num1;
  char op;
  double num2;
};

mathop a; // works
a.op = '+'; // error: op is "private"

What “private” means here and how such a thing would be useful is a topic for a later class when you learn about object-oriented programming. For now, just remember that classes and structs are equivalent concepts in C++, but with different defaults that make structs easier for us to deal with.

5.2 Operator overloading

And now for something truly cool in C++. Like many modern programming languages, C++ allows you to change how the language itself works so that any new types you create can look and behave just like built-in types.

Take the length example from above:

struct length {
  int feet;
  double inches;
};

In C or C++ we can always write some function that adds up two length structs, no problem. But in C++, you can actually make the + operator work for our own types by “overloading the operator”. To do this, we write a function with a special keyword operator, like so:

length operator + (length a, length b) {
  length result;
  result.feet = a.feet + b.feet;
  result.inches = a.inches + b.inches;
  while (result.inches >= 12) {
    result.feet++;
    result.inches -= 12;
  }
  return result;
}

Then in your main or anywhere else, you could really do something like this:

length meter = {3, 3.37};
length avgheight = {5, 10};
length x = meter + avgheight;
// now x.feet == 9 and x.inches == 1.37

The compiler turns that + operation into a call to the function that you wrote. That’s pretty awesome!

Note that you could really write anything inside that function above; it doesn’t have to do anything related to addition (although it probably should). All that really matters is that the parameter and return types are correct.

You can similarly define overloaded functions for any other operator in C++. And you can define different versions of each operator for different types, or even different combinations of types. It’s called operator overloading because it’s really two things: (1) treating operators like they’re function calls, and then (2) overloading that function call with different types. Both things that you definitely can’t do in C!

5.3 Functions inside structs (“methods”)

One of the “big deal” ideas with object-oriented programming (the basis for creating C++) is combining data and behavior into a single package in your code. In C++, you do this by adding functions inside a struct or class definition. The standard object-oriented thing is to call these functions “methods” of the object.

Here’s a simple example of a method inside our length struct from before:

struct length {
  int feet;
  double inches;

  int nearest_foot() {
    if (inches < 6) {
      return feet;
    } else {
      return feet + 1;
    }
  }
};

Notice that the method can access the fields of the struct it’s called on. Just like with the fields of a struct, you use the dot operator to call a method, like:

length ht = {6, 0.8};
printf("%i\n", ht.nearest_foot()); // prints 6
ht.inches = 11;
printf("%i\n", ht.nearest_foot()); // prints 7

Something that’s worth pointing out here is that these struct methods don’t really do anything we couldn’t do already, for example by writing an external function that takes the struct as a parameter such as:

int nearest_foot(length len);

So the main benefit of having methods in classes is organization: keeping all these related pieces of information (the fields like feet and inches) and the operations on that information (the methods like nearest_foot) in the same place in code. That kind of organization can be really helpful when you are designing larger and more complicated programs.

(Come back next semester to see a lot more of this and learn all about object-oriented programming!)

6 I/O

We’ve covered all of the main new features of the C++ language that you need to know about. Now we’ll see how those language features really shine and become useful for everyday programming, in how they are incorporated into the C++ standard libraries.

As we have seen so far, you can do I/O in C++ using printf, scanf, and friends by including the <cstdio> library. However, C++ also has an improved library for I/O called the iostream library.

The C <stdio.h> library relies on format strings, where you as the programmer have to explicitly specify (with a format string specifier) the type of every argument.

As we have seen already, C++ has much stronger support for types, and the compiler is willing and able to do more “work” for you. This is reflected in the C++ <iostream> library, which relies on types and operator overloading in order to do I/O. The result is that you as a programmer don’t (usually) have to specify the type of what’s being printed out, but can instead let the compiler figure that out for you.

6.1 cin and cout

In C-style I/O using <stdio.h> or <cstdio>, the basic I/O type is FILE*, and standard input and output are called stdin and stdout, respectively. Here’s a small example and reminder of how C-style I/O works:

int num;
char let;
printf("Enter a number and a letter.\n");

fscanf(stdin, " %i %c", &num, &let);
// could also use scanf(...) and drop stdin

fprintf(stdout, "You entered %i %c\n", num, let);
// could also use printf(...) and drop stdout

In the C++ <iostream> library, there are two basic I/O types: istream and ostream (for input and output, respectively), and the standard in and out streams are called cin and cout.

To perform I/O using <iostream>, the example above would be written as:

int num;
char let;
cout << "Enter a number and a letter.";
cout << endl;

cin >> num;
cin >> let;

cout << "You entered ";
cout << num;
cout << " ";
cout << let;
cout << endl;

First a comment about this endl thing. It stands for “end of line” and is a shortcut to end the line and flush the buffer. So it’s like printing out a newline "\n" and then calling fflush at once.

Now as you can see, the >> and << operators are used to perform input and output. These operators are actually “shift” operators from C, which are used to move around the bits of numbers. But in C++, they are repurposed to do input and output. Sometimes << is called the stream insertion operator and >> the stream extraction operator.

The important thing to notice is, we didn’t have to specify the type of what was being read or written! The commands to read num and let are exactly the same, just changing the variable. This is different from using scanf, where you would have to put %i for num and %c for let.

What’s actually happening is operator overloading! The C++ <iostream> library provides overloaded versions for the << and >> operators based on all the standard types, where the left-hand side is an istream or ostream object, and the right-hand side is whatever you are reading or writing. This allows the compiler to automatically change the way each thing is read or written based on the type of that variable.

In fact, that overloaded operation also returns the same stream object it was called on. For example, the expression cin >> num calls an overloaded operator, passing in arguments cin and num, and returning cin at the end. That means that we can “chain” these operations together, like:

(cin >> num) >> let;

What that line does is:

  1. Read in num from stream cin
  2. Return cin from the first part of the expression
  3. Read in let from stream cin

In fact, since the >> and << operators are left-associative, you can chain them without using parentheses. Here’s how the above example can be written using I/O operator chaining:

int num;
char let;
cout << "Enter a number and a letter." << endl;

cin >> num >> let;

cout << "You entered " << num << " " << let << endl;

6.2 Reading and writing files

OK, so cin has type istream and is the C++ equivalent of stdin, and cout has type ostream and is the C++ equivalent of stdout.

You should also know that cerr is the C++ equivalent of stderr, used to write to the “standard error” stream.

Of course we can also open our own files for reading and writing. The type of a file input stream is ifstream and to use it you have to #include <fstream> at the beginning of your program. Then you could use it like this:

ifstream fin;           // declare variable of type ifstream
fin.open("myfile.txt"); // open myfile.txt for reading

// read in an int from the file
int x;
fin >> x;

fin.close();

Notice: ifstream is a class, and it has methods open and close (among others). In fact, C++ will automatically call the close() method when the object is destroyed, and you can also declare and open a file all at once, so the above can be even more simply done as:

ifstream fin("myfile.txt");
int x;
fin >> x;

You can do the same thing for printing to a file by using the ofstream class, like so:

ofstream fout("created.txt");
fout << "Writing thirty to created.txt: " << (5*6) << endl;

Something subtle to notice here: fin above has type ifstream, but cin has type istream (no “f”). Yet the two objects are somehow compatible, and can be used in the same way. What gives?

This is part of the object-oriented features in C++ that we won’t get into much in this class. The gist is that C++ class types are hierarchical, meaning that one type can be a “sub-type” of another one. In this example, the type ifstream is a sub-type of istream, so anything of type ifstream also has type istream. Similarly, every ofstream is also type ostream. You’ll learn more about this type hierarchy if you take a future data structures or object-oriented programming class.

6.3 Making I/O work for your own types

Something really cool that we can do with C++ I/O is to tell the compiler how to read in any type, including types that we make up. You do this by writing an overloaded version of the << or >> operator that accepts the type you’re interested in.

For example, here’s how to tell C++ how to read and write lengths in a format like 5' 6.2" to mean “5 feet and 6.2 inches”:

struct length {
  int feet;
  double inches;
};

// tell C++ how to print out a length struct
ostream& operator<< (ostream& out, length len) {
  out << len.feet << "' " << len.inches << '"';
  return out;
}

// tell C++ how to read in a length struct
// notice: len is passed BY REFERENCE so it can be changed!
istream& operator>> (istream& in, length& len) {
  char junk; // used to store the ' and " characters
  in >> len.feet >> junk >> len.inches >> junk;
  return in;
}

After providing those definitions, we can read and write lengths just like any other type:

int main() {
  length height;
  cout << "How tall are you? ";
  cin >> height;
  cout << "You are " << height << " tall." << endl;
}

In this way, the types that we create in C++ can really be made to work and act like any of the built-in language types. This makes the C++ language more easily customizable and extendible as a programming language.

7 Standard template library

A big part of what C++ gives us is an extended set of libraries. The <iostream> library we just learned about really provides similar features to the C stdio.h library, but with some extra convenience with C++ types.

The standard template library for containers in C++ is something entirely new. It provides types (really, they’re classes or structs) to store and access data efficiently. We’ll just look at two basic container types, both of which are based on arrays but with extra features and convenience for programmers.

7.1 String

In C, you store a string of letters as an array of chars, type char*. This is called a “C-style string”, and the <string.h> library (called <cstring> in C++) provides a number of convenient functions to deal with char* strings such as strlen, strcmp, and strcpy.

In C++, the header <string> provides a new type called (un-creatively) string. The string class in C++ provides a number of methods and overloaded operators to:

  • read in a string from a stream using the >> operator
  • write out a string to a stream using the << operator
  • create or copy a string’s value using the assignment = operator
  • concatenate two strings using the + operator
  • get the length of a string using the .length() method
  • get an individual char by using the index operator [] as if the string were a char array.
  • compare strings alphabetically using the normal comparison operators like <, ==, and >=.

What this all means is that you can treat strings like any other regular type in C++. With C-style strings of type char*, we have to think always about how this differs from the normal C types such as int or double. You don’t have to think about this as much when programming with strings in C++, because the compiler is doing more of that work for you.

A big part of this work is keeping track of the size of the string and automatically re-sizing it as necessary. In C, we usually just chose 128 as the default size of our char arrays, hoping that no one would enter a string longer than 127 characters (leaving one spot for the terminating null character). We could write more complicated programs that only read in a few characters at a time, then re-allocate arrays as necessary, but we didn’t do that. Fortunately in C++, the developers of the standard library have already done this for you with the string class.

Here’s an example program that reads in and then prints out a name using C++ strings:

#include <iostream>
#include <string>
using namespace std;

int main() {
  string first;
  string last;

  cout << "Enter your name (first and last): ";
  cin >> first >> last;

  string complete = first + " " + last;
  cout << "Your complete name is " << complete << endl;
  cout << "This name has " << complete.length() << " characters." << endl;

  return 0;
}

7.2 A BRIEF note on templates

A very powerful mechanism that was added to C++ around 1990 is that of templates. In short, templates allow you to write generic structs, classes, or functions without saying exactly what all of the types are. Then, based on how that struct or function gets used, the compiler automatically generates copies of that function or struct with the types specified.

There is so much to learn about templates that we could spend an entire course on it! But just to give you the idea, consider a generic linked list node struct:

template <typename T>
struct node {
  T data;
  node* next;
};

First, notice what is the same as always: we have a struct node declaration, with two fields for data and next. The type of next is node* like usual, but the type of data is weird: it’s listed as T. So what is T?

Well, T is declared just before the struct definition begins as a template that can stand for any type such as int or string or anything else. Think of this definition as a recipe for the compiler to follow: “If someone asks for a node, make a struct just like this except replace the type T with the type that they want”.

How you specify the “type that you want” for a template is using angle brackets with the template type name. So for example, to make a linked list of two strings using the template definition above, we might do something like this:

node<string>* L = NULL;
L = new node<string>;
L->data = "first";
L->next = new node<string>;
L->next->data = "second";
L->next->next = NULL;

If on the other hand we wanted a linked list of doubles, we would start with

node<double>* L = NULL;

and go from there. In other words, each of node<int>, node<double>, or node<any_type_here> is its very own type and causes the compiler to generate a new copy of that struct definition above, replacing the type T in the definition with whatever you put in the angle brackets.

7.3 Vector

The C++ standard template library is so called because it has many classes which are used to store data, which are templatized based on the type of data you want to store.

We’ll just look at one of the most useful of these which is the vector class, which you can get by including the header named, unsurprisingly, <vector>.

Think of C++’s <vector> as an array that can do a whole lot more:

  • Automatically resizes itself as items are added or removed
  • You can use the [] operator to look up any element, just like with an array.
  • Better yet, use the at(index) method to look up any element and check for array index out of bounds.
  • The size() method will tell you the size of the vector at any point.
  • The push_back method adds one more element to the end of the vector and is very efficient.
  • Similarly, the pop_back method removes the last element of the vector and re-sizes it accordingly.

Now all of these things you could program yourself in C, by making a struct to hold a pointer (for the array) and an int to keep track of the size, and writing some useful functions there. And that understanding of how vector is implemented will make you a more effective user of it.

But the joy of using vector is that you don’t have to worry about these details every time you want to use an array in your program, which lets you write bigger, more useful programs in less time. Of course there is a tradeoff, and each of the features of vector makes it a little bit slower than using regular C arrays in your program. But for most use cases, that’s a tradeoff you’re probably willing to make.

Here’s an exciting example program that reads in numbers until the user enters a negative number, and then prints them out in reverse order:

#include <iostream>
#include <vector>
using namespace std;

int main () {
  cout << "Enter some numbers followed by a negative number:" << endl;
  vector<int> nums;

  int next;
  cin >> next;
  while (next >= 0) {
    nums.push_back(next);
    cin >> next;
  }

  cout << "You entered " << nums.size() << " numbers." << endl;

  cout << "In reverse order:";
  for (int i = nums.size() - 1; i >= 0; --i) {
    cout << " " << nums.at(i);
  }
  cout << endl;

  return 0;
}

Notice that vector is a templated type, so we use vector<int> to make a vector of ints, but you could do vector<string> for a vector of strings, or even something crazy like vector< vector<double> >.

8 Going further

That’s it for our semester! I hope that you’ve enjoyed this class, that you’ve been challenged, and that you have some new ways of thinking about problem solving with computers.

You should feel great about what you’ve accomplished this semester. With the skills you’ve learned, you can write any program. Given enough time and patience, you could write a program to do absolutely anything that a computer can do.

Of course, you will need much more experience and knowledge to be proficient at that, or to write bigger programs as efficiently as possible, and we have courses where you can learn those things. But you have the power of the computer in your hands, and the freedom to explore and solve new problems is just a text editor away. Good luck, and have fun!