This is the archived website of SY 301 from the Fall 2016 semester. Feel free to browse around; you may also find more recent offerings at my teaching page.

Lab 7: Maps

Due before 23:59 Wednesday, October 12
Submit using the submit program as 301 lab 07.

1 Intro and Background

Maps and Python

When we store information in a Map, our data consists of two things, a key and a value. Suppose we're wanting to use our BST from last week. We have two choices, we could either modify our BST Nodes to store both a key and a value like this:

class Node:
  def __init__(self, key, value):
    self.key=key
    self.value=value
    self.left=None
    self.right=None

or, we could keep our Nodes as they were, and make a single class called KVPair which can be stored as the data field:

class KVPair:
  def __init__(self, key, value):
    self.key = key
    self.value = value

Now, we can keep our BST Nodes as they were, and just add KVPairs as data. Which is better? Well, it depends. You could argue the first approach is less convoluted. However, keeping your BST general and applicable for lots of purposes, without lots of extra fields hard-coded in is a nice thing, especially for if you're passing your code off to someone else, or coming back to it later after a few weeks away.

For this lab, we're going to make this KVPair object, and build three different kinds of maps to map alphas to midshipman names. And we're going to implement all of the functions with operator overloading so that our maps look and act just like Python's built in dict objects.

Operator Overloading

Some of you may not be happy with the above paragraphs, and might be saying, "but my BST depends upon using the < and > signs on data, and those won't work with my new, invented class!" And you'd be right. But, it turns out that if you have your own class, you can define those operators to do what you want. For example:

class KVPair:
    def __init__(self, key, value):
        self.key = key
        self.value = value
 
    def __lt__(self, other):
        return (self.key < other.key)

Now, given two KVPairs pair1 and pair2, you can run pair1 < pair2, and the right thing will happen! Cool! So, now you can use your BST as written, without changing a thing. Overloading the less-than operator even lets you sort a list full of your objects using the .sort() command.

These special operator-overloading methods in Python are called "magic methods" and there are a lot of them! The ones you will need to worry about for this lab are the comparison operators and the container type operators.

2 Part 1: KVPair class

The first thing you need to do is create a file kvpair.py that completes the KVPair class above so that all of the comparison methods work properly. Specifically, your KVPairs should be able to:

Compare by keys, meaning paira1 < pair2 should work, as should >, <=, and >=, based the key fields only;
Compare for equality, meaning pair1 == pair2 should work, based on the equality of the keys only; and
Compare for inequality, meaning pair1 != pair2 should work. (NB: If you overload ==, you should always also overload !=, or else you get weird behavior.)

After this, code like the following (which is definitely not an exhaustive testing suite) should work:

from kvpair import KVPair
 
aaron = KVPair(169998, 'Aaron Aardvark')
zeke = KVPair(160006, 'Zeke Zebra')
print(zeke < aaron)   # prints True
print(aaron < zeke)   # prints False
print(aaron == zeke)  # prints False
print(aaron >= aaron) # prints True
print(zeke != KVPair(160006, 'Same Alpha')) # prints False

3 Parts 2-4: Three versions of a Map

You're going to create three different map classes, in three different files:

sortedarraymap.py will contain the SortedArrayMap class.
unsortedllmap.py will contain the UnsortedLLMap class.
bstmap.py will contain the BSTMap class.

All of these Maps should have the same methods defined so that they appear to work exactly the same, even through we know that "under the hood" they are implemented completely differently!

Specifically, your maps have to:

Insert key-value pairs. The way this is normally done in Python is by using the [] operator, meaning for some key k and some value v, aMap[k] = v adds that pair to the map. This involves overloading the __setitem__(self, k, v) method.

This method should start by turning k and v into a single KVPair before inserting that single object into the Map.

You may assume (for now) each insert is done with a new key that is not currently in the map. If this assumption makes you uncomfortable (as it should), the correct behavior when someone inserts the same key for a second time is to overwrite the old KVPair that matches with that key.
Given a key, get the value. This is again done using the [] operator: v=aMap[k]. The way this works "under the hood" is by overloading the __getitem__(self,k) method.

You may assume (for now) that the user a competent user, who will only try to get valid keys. If this makes you uncomfortable, as it probably should, the correct behavior is to raise an Exception (a KeyError, to be precise) to notify the user when they try to lookup a key that doesn't exist in the map.
Tell you if a key appears in the Map using someKey in aMap. The "in" operator is overloaded by implementing the __contains__(self, k) operator, just like we did in last week's lab.

Your SortedArrayMap will have a list as a field, which will contain your KVPairs. Obviously, it should maintain them in sorted order so that your __getitem__ and __contains__ methods can use binary search and run in \(O(\log n)\) time.

Your UnsortedLLMap will have a head field, and each Node will be a regular linked list node with a KVPair as the data field. Your __setitem__ method should be \(O(1)\) in this case.

Your BSTMap will be very similar to the TreeSet class from last week's lab, except that each Node's data field will hold a KVPair this time. All three of the map methods should have a running time of \(O(\textrm{height})\). Note, you definitely can (and should!) use your code from last week's lab, or my posted sample solutions, as a starting point to save yourself time and effort.

All methods should run the fastest we can make them run. Keep in mind that methods that come with Pythonic lists do not know or assume that the list is sorted, and so using them may be inappropriately slow.

To be explicit, code like the following (which is in no way an exhaustive testing suite) should work:

from sortedarraymap import SortedArrayMap
 
arrMap = SortedArrayMap()
arrMap[160006] = 'Zeke Zebra'
print(arrMap[160006]) #prints 'Zeke Zebra'
print(160006 in arrMap) #prints True
print(169998 in arrMap) #prints False
 
# ... and the same for UnsortedLLMap and BSTMap, of course!

4 Going further: Overloading len()

The rest of this lab is optional for your enrichment and enjoyment. Be sure to work on this only after getting everything else perfect. You should still submit this work, but make sure it doesn't break any of the parts you already had working correctly!

In Python, the way you find out how big something is is using the len() function. Unsurprisingly, you can overload the behavior of this function for a class you write by implementing a method __len__(self): within your class.

First, make sure your Map classes can handle duplicate insertions properly. If your __getitem__ is called more than once with identical key values, then the next time __setitem__ is called with that same key, it should return the most recent value that was assigned. This should not change the running time of any of the methods, of course!

Now I want you to overload the __len__ function for all three of your Map classes from this lab. Keep in mind, the size of a Map is defined to be the number of distinct keys that have been inserted so far. For arrays and BSTs, it should be \(O(1)\) running time, and for arrays it will be especially easy!

The trickier one is with UnsortedLLMap. You could maintain a current count field in your class, but that would require checking for duplicates every time there is an insertion, which would make that method become \(O(n)\), which defeats the whole purpose of using unsorted linked lists in the first place! Instead, you should do all the work in your __len__ method, which will have a horrible running time of \(O(n^2)\).

After this, something like the following should work for any of your Maps:

from unsortedllmap import UnsortedLLMap
 
m = UnsortedLLMap()
m[160006] = 'Zeke Zebra'
print(m[160006]) #prints 'Zeke Zebra'
m[160006] = 'Susan Swan'
m[160006] = 'Mary Moose'
print(m[160006]) # prints 'Mary Moose'
print(len(m)) # prints 1