Lab 7: Maps
- Due before 23:59 Wednesday, October 12
- Submit using the submit program
as
301 lab 07
.
1 Intro and Background
Maps and Python
When we store information in a Map, our data consists of two things, a key and a value. Suppose we're wanting to use our BST from last week. We have two choices, we could either modify our BST Nodes to store both a key and a value like this:
1 2 3 4 5 6 | class Node: def __init__(self, key, value): self.key=key self.value=value self.left=None self.right=None |
or, we could keep our Nodes as they were, and make a single class called KVPair which can be stored as the data field:
1 2 3 4 | class KVPair: def __init__(self, key, value): self.key = key self.value = value |
Now, we can keep our BST Nodes as they were, and just add KVPairs as data. Which is better? Well, it depends. You could argue the first approach is less convoluted. However, keeping your BST general and applicable for lots of purposes, without lots of extra fields hard-coded in is a nice thing, especially for if you're passing your code off to someone else, or coming back to it later after a few weeks away.
For this lab, we're going to make this
KVPair object, and build three different kinds of maps to map alphas to
midshipman names. And we're going to implement all of the functions with
operator overloading so that our maps look and act just like Python's
built in dict
objects.
Operator Overloading
Some of you may not be happy with the above paragraphs, and might be saying, "but my BST depends upon using the < and > signs on data, and those won't work with my new, invented class!" And you'd be right. But, it turns out that if you have your own class, you can define those operators to do what you want. For example:
1 2 3 4 5 6 7 | class KVPair: def __init__(self, key, value): self.key = key self.value = value def __lt__(self, other): return (self.key < other.key) |
Now, given two KVPairs pair1
and pair2
, you can run
pair1 < pair2
, and the right thing will happen! Cool! So, now you
can use your BST as written, without changing a thing. Overloading the
less-than operator even lets you sort a list full of your objects using the
.sort()
command.
These special operator-overloading methods in Python are called "magic methods" and there are a lot of them! The ones you will need to worry about for this lab are the comparison operators and the container type operators.
2 Part 1: KVPair class
The first thing you need to do is create a file kvpair.py
that completes the KVPair
class above so that all of the comparison methods work properly. Specifically,
your KVPairs should be able to:
- Compare by keys, meaning
paira1 < pair2
should work, as should >, <=, and >=, based the key fields only; - Compare for equality, meaning
pair1 == pair2
should work, based on the equality of the keys only; and - Compare for inequality, meaning
pair1 != pair2
should work. (NB: If you overload ==, you should always also overload !=, or else you get weird behavior.)
After this, code like the following (which is definitely not an exhaustive testing suite) should work:
1 2 3 4 5 6 7 8 9 | from kvpair import KVPair aaron = KVPair(169998, 'Aaron Aardvark') zeke = KVPair(160006, 'Zeke Zebra') print(zeke < aaron) # prints True print(aaron < zeke) # prints False print(aaron == zeke) # prints False print(aaron >= aaron) # prints True print(zeke != KVPair(160006, 'Same Alpha')) # prints False |
3 Parts 2-4: Three versions of a Map
You're going to create three different map classes, in three different files:
sortedarraymap.py
will contain theSortedArrayMap
class.unsortedllmap.py
will contain theUnsortedLLMap
class.bstmap.py
will contain theBSTMap
class.
All of these Maps should have the same methods defined so that they appear to work exactly the same, even through we know that "under the hood" they are implemented completely differently!
Specifically, your maps have to:
Insert key-value pairs. The way this is normally done in Python is by using the [] operator, meaning for some key
k
and some valuev
,aMap[k] = v
adds that pair to the map. This involves overloading the__setitem__(self, k, v)
method.This method should start by turning k and v into a single KVPair before inserting that single object into the Map.
You may assume (for now) each insert is done with a new key that is not currently in the map. If this assumption makes you uncomfortable (as it should), the correct behavior when someone inserts the same key for a second time is to overwrite the old KVPair that matches with that key.
Given a key, get the value. This is again done using the [] operator:
v=aMap[k]
. The way this works "under the hood" is by overloading the__getitem__(self,k)
method.You may assume (for now) that the user a competent user, who will only try to get valid keys. If this makes you uncomfortable, as it probably should, the correct behavior is to raise an Exception (a
KeyError
, to be precise) to notify the user when they try to lookup a key that doesn't exist in the map.Tell you if a key appears in the Map using
someKey in aMap
. The "in" operator is overloaded by implementing the__contains__(self, k)
operator, just like we did in last week's lab.
Your SortedArrayMap will have a list as a field, which will contain your
KVPairs. Obviously, it should maintain them in sorted order so that your
__getitem__
and __contains__
methods can use binary search
and run in \(O(\log n)\) time.
Your UnsortedLLMap will have a head field, and each Node will be a regular linked list node
with a KVPair as the data field. Your __setitem__
method should be \(O(1)\) in this
case.
Your BSTMap will be very similar to the TreeSet class from last week's lab, except that each Node's data field will hold a KVPair this time. All three of the map methods should have a running time of \(O(\textrm{height})\). Note, you definitely can (and should!) use your code from last week's lab, or my posted sample solutions, as a starting point to save yourself time and effort.
All methods should run the fastest we can make them run. Keep in mind that methods that come with Pythonic lists do not know or assume that the list is sorted, and so using them may be inappropriately slow.
To be explicit, code like the following (which is in no way an exhaustive testing suite) should work:
1 2 3 4 5 6 7 8 9 | from sortedarraymap import SortedArrayMap arrMap = SortedArrayMap() arrMap[160006] = 'Zeke Zebra' print(arrMap[160006]) #prints 'Zeke Zebra' print(160006 in arrMap) #prints True print(169998 in arrMap) #prints False # ... and the same for UnsortedLLMap and BSTMap, of course! |
4 Going further: Overloading len()
The rest of this lab is optional for your enrichment and enjoyment. Be sure to work on this only after getting everything else perfect. You should still submit this work, but make sure it doesn't break any of the parts you already had working correctly!
In Python, the way you find out how big something is is using the
len()
function. Unsurprisingly, you can overload the behavior
of this function for a class you write by implementing a method
__len__(self):
within your class.
First, make sure your Map classes can handle duplicate insertions properly.
If your __getitem__
is called more than once with identical key
values,
then the next time __setitem__
is called with that same key, it should
return the most recent value that was assigned.
This should not change the running time of any of the methods, of course!
Now I want you to overload the __len__
function for all three of your
Map classes from this lab. Keep in mind, the size of a Map is defined to be the number of
distinct keys that have been inserted so far.
For arrays and BSTs, it should be \(O(1)\) running time, and
for arrays it will be especially easy!
The trickier one is with UnsortedLLMap. You could maintain a current count field in your
class, but that would require checking for duplicates every time there is an insertion,
which would make that method become \(O(n)\), which defeats the whole purpose of using
unsorted linked lists in the first place! Instead, you should do all the work in your
__len__
method, which will have a horrible running time of \(O(n^2)\).
After this, something like the following should work for any of your Maps:
1 2 3 4 5 6 7 8 9 | from unsortedllmap import UnsortedLLMap m = UnsortedLLMap() m[160006] = 'Zeke Zebra' print(m[160006]) #prints 'Zeke Zebra' m[160006] = 'Susan Swan' m[160006] = 'Mary Moose' print(m[160006]) # prints 'Mary Moose' print(len(m)) # prints 1 |