Summary
- Definition of polymorphism
- Ad-hoc polymorphism
- Subtype polymorphism
- Vtables and dynamic dispatch
- Duck typing
What is polymorphism, and how does it relate to types?
Polymorphism (from Greek, “many forms”) is the ability to write a single piece of code (a function, a class) that can operate on values of different types.
In the previous class, we talked about static and dynamic typing, and argued that the main purpose of type systems is to identify type errors, making sure that various operations or functions only operate on objects of the correct type.
Polymorphism is another benefit we get from having types. It lets us write code once, and then have it do the correct thing (which might be different!) depending on the types of objects involved.
In other words, with polymorphism, types don’t just help you find bugs, but actually help you write simpler, cleaner, more flexible code.
Ad-hoc Polymorphism (Function overloading)
A function or operator has different implementations depending on the types of its arguments. The compiler/interpreter chooses the correct one based on the types.
C++ Example:
// Two different functions with the same name
void print(int i) {
std::cout << "Int: " << i << std::endl;
}
void print(std::string s) {
std::cout << "String: " << s << std::endl;
}
print(10); // Calls the (int) version
print("hello"); // Calls the (std::string) version
Notice that this completely depends on having declared types in function parameters! Otherwise there would be no way for the compiler to figure out which version of the function to call.
In case you’ve ever wondered why function overloading is not supported in Python, this is it! Any dynamically-typed language such as Python or Scheme, which have no declared types for function parameters, cannot possibly support function overloading because there’s no way for the interpreter to figure out which version of a function should be called.
Calling methods on an object
Here is some simple Java code like what you might have seen in your object-oriented programming class:
abstract class Animal {
abstract void speak();
void eat() { System.out.println("scarf"); }
}
class Dog extends Animal {
void speak() { System.out.println("Woof"); }
}
class Cat extends Animal {
void speak() { System.out.println("Meow"); }
void eat() { System.out.println("nibble"); }
}
// This function works with ANY Animal subtype
void makeNoise(Animal animal) {
animal.speak(); // Which speak() is called?
}
Dog fido = new Dog();
Cat skimbleshanks = new Cat();
// which eat() is called in each case?
fido.eat(); // prints "scarf"
skimbleshanks.eat(); // prints "nibble"
makeNoise(fido); // prints "Woof"
makeNoise(skimbleshanks); // prints "Meow"
Think about how the compiler or interpreter “knows” which version of the different methods to call in this example. We call this the “dispatch” — connecting the actual method call to some version of that method which has been defined.
This is actually a really sophisticated operation! There are at least three different ways it can be implemented, depending on the language and the way the method is defined
First way (least flexible, fastest): Static dispatch
As with the concept of static typing itself, static dispatch for a function or method call refers to the ability to know which version of a function is called prior to execution, e.g. at compile time.
In a language like C++ or Java, method calls can be performed by static dispatch when the compiler can figure it out based only on the declared type of the variable.
In fact, this is just an example of ad hoc polymorphism, i.e. function overloading, like we covered above! In C++ it explicitly works this way: the compiler translates a call like
fido.eat()
into something like
__eat_method(fido)
at compile-time, where the object the method is called on becomes (implicitly) the first argument to the function call.
And then, for static dispatch, the compiler can use the type of
that argument (like Dog) to decide which version of the method to
call. This is exactly the same thing as with normal function
overloading! In fact it’s one of the reasons why C++ has function
overloading even though C does not — overloading was a necessary
feature of the language in order to support objects and inheritance.
Second way (more flexible, but a little slower): Dynamic dispatch
But static dispatch only gets us so far. The challenge is that the declared type might not match the actual run-time type of the object.
This happens above in the makeNoise() function, when it calls
animal.speak(). Here the declared type of animal is Animal, but
that is not enough to figure out which version of speak() to call. In
fact, the makeNoise() function will call different functions for
speak() depending only on the type of what animal is passed into it.
This is actually a different thing called subtype polymorphism: when
the behavior of the code (in this case, the call to .speak()) depends
on which sub-class an object belongs to at run-time.
Note that the compiler has no way to resolve this method at compile-time, which means there has to be some run-time mechanism to do it. That gives us the full “power” of object-oriented programming, but comes at a cost of slightly slower performance as the precise method has to be somehow looked up at runtime each time the code is executed.
Details: virtual method table (vtable)
Let’s briefly get into the weeds on how this lookup works in order to implement subtype polymorphism.
For each class, the compiler forms a
virtual method table (or vtable for short), which will contain
pointers to all the methods that can be called on objects of that class.
Each base class like Animal doesn’t know which specific version of
each function to call, but it knows what index to look for that
function in the object’s vtable.
Let’s walk through the above example to understand this better.
The compiler first processes the Animal base class, which has two
declared methods. So it will decide speak should be index 0 in the
vtable, and eat should be index 1.
Now when the actual objects fido and skimbleshanks are created, they
each have a vtable, like this:
vtable for fido vtable for skimbleshanks
--------------- ------------------------
0: Dog::speak 0: Cat::speak
1: Animal::eat 1: Cat::eat
Now because these vtable exist at runtime, when we get to the call
animal.speak(), three things have to happen:
- The vtable for the object
animalis looked up - Go to index 0 in that table (for the
speakmethod) - Follow that function pointer to the actual code to run
All in all, we get more flexibility than static dispatch, but at the cost of one more table lookup every time a method is called.
Third way to do a method call (most flexible, but slowest): Duck typing
How does a dynamically-typed language like Python deal with subtype polymorphism? The answer is actually much simpler than above.
Each object, at run-time, stores a map of all the methods you can call on that object, by name. So essentially any object can have any method defined on it, at run-time. Class definitions are essentially a convenience for the programmer to define a bunch of methods on an object at the same time.
Implementing the method call works like the vtable method, except that instead of an indexed table we have a map with strings (method names) as keys, and with actual functions (technically closures) as values.
For example, you know that in Python you can print a string to a file like this:
fout = open("myfile.txt", "w")
print("hello", file=fout)
fout.close()
In this case, the fout variable will have type TextIOWrapper, which
is how Python represents actual open files.
But the types are not actually enforced anywhere in Python. All the
print() method really needs from the “file” you pass in, is that it
supports a write method. So this code works:
class FakeFile:
def write(self, text):
print("I'm not really writing '" + text + "' to a file!")
f = FakeFile()
print("hello", file=f)
Of course, this doesn’t actually result in the string "hello" going
into any actual file!
This idea — that we can use any object of any type, as long as it happens to have methods with the correct names defined — is called duck typing. It’s based on the saying
If it walks like a duck, and talks like a duck, it’s probably a duck.
Duck typing is a concept that fits naturally into many dynamically-typed languages with objects, such as Python. It gives ultimate flexibility for the programmer to create new classes and objects and have them work just like any built-in object in the language. The downside is the run-time cost of doing the Map lookups, which is slower than just indexing into a vtable.