Problem Set 2

This is the archived website of SI 335 from the Spring 2014 semester. Feel free to browse around; you may also find more recent offerings at my teaching page.

Print version, with cover page (pdf)
Due Date: Monday, February 24

1 Popularity contest

An ice cream store recently sampled many flavors for its top customers. Each customer then voted for their favorite. Now the ice cream store wants to know which flavors were really popular and received more than one-third of the votes.

So here's your task: given a list of numbers A of length \(n\), find all numbers x that occur more than \(n/3\) times in \(A\). These are the popular numbers.

Note that there can be 0, 1, or 2 popular numbers. (It's impossible to have three things that all occur more than one-third of the time.)

Here's an algorithm that computes the popular numbers in an array:

def popular(A):
    '''Returns all elements that occur more than n/3 times'''
    if len(A) <= 1:
        return A
    else:
        mid = len(A) // 2
        B = A[0 : mid]
        C = A[mid: len(A)]
        apop = []
        for x in popular(B):
            if count(A,x) > len(A)/3:
                apop.append(x)
        for x in popular(C):
            if x not in apop and count(A,x) > len(A)/3:
                apop.append(x)
        return apop

You probably don't need me to write out the helper function count, but here it is just for completeness sake:

def count(A, x):
    '''Returns the number of times x occurs in A'''
    c = 0
    for a in A:
        if a == x:
            c += 1
    return c

Convince me that this algorithm is correct, i.e., it always returns all the popular elements in A.
Analyze the worst-case running time of popular(A), in terms of \(n\), the length of A. I would like a big-\(\Theta\) bound.
What programming paradigm does my algorithm employ?
Describe a faster algorithm to find the popular numbers, under the assumption that the input array A is already sorted. Your faster algorithm should run in \(\Theta(n)\) time.
CHALLENGE: Plus 1% to everyone in your section if you can come up with a \(\Theta(n)\)-time algorithm for this problem, even when A is not sorted. I promise that it's possible!

2 Tug of War

There are n Mids trying out for the varsity tug-of-war team. Some of them are strong and some of them are weak. For simplicity, assume that all the strong Mids have exactly the same strength, as do all the weak Mids. You can also assume that there is at least one strong Mid and at least one weak Mid. (In particular, n must be at least two.)

Your task is to determine who the strong Mids are, to decide who will be on the team. And the only tool you have to determine who is strong and weak is running contests. A contest involves pitting some Mids against some others in a tug-of-war, and the outcome can be either that one side wins, or the other side wins, or they tie. This pseudocode might help clarify:

# Calling this function represents a single contest.
def winner(group1, group2):
    if group1 wins:
        return group1
    elif group2 wins:
        return group2
    else:
        return 'tie'

There can be any number of Mids in any contest, but it should always be the same number on each side of the contest. The side with more strong Mids (or, equivalently, with fewer weak Mids) wins.

For example, here is an algorithm for \(n=3\):

def strongOf3(M0, M1, M2):
    w1 = winner({M0}, {M1})
    if w1 == 'tie':
        if winner({M1}, {M2}) == {M1}:
            return {M0, M1}
        else:
            return {M2}
    else:
        if winner(w1, {M2}) == 'tie':
            return w1 + {M2}
        else:
            return w1

An here's an algorithm for \(n=4\):

def strongOf4(M0, M1, M2, M3):
    w1 = winner({M0}, {M1})
    w2 = winner({M2}, {M3})
    if w1 == 'tie' and w2 == 'tie':
        return winner({M0,M1}, {M2,M3})
    elif w1 == 'tie':
        # w1 is a tie, but w2 is not
        if winner({M0}, w2) == 'tie':
            return {M0, M1} + w2
        else:
            return w2
    elif w2 == 'tie':
        # the opposite here; w1 is not a tie
        if winner({M2}, w1) == 'tie':
            return w1 + {M2, M3}
        else:
            return w1
    else:
        return w1 + w2

State the number of contests performed by strongOf3 and by strongOf4 in each of their worst cases.
Give a lower bound on the number of contests required to determine who the weak Mids are for an input of size n, in the worst case.

State your exact lower bound as a function of n, showing all your work. Then state what the asymptotic big-\(\Omega\) bound is that results, simplified as much as possible.

For example of what I'm asking for, in class we showed that sorting requires at least \(\lg (n!)\) comparisons (the exact bound), which is \(\Omega(n\log n)\).
Give an algorithm for any n that determines who the strong Mids are. In describing your algorithms, you can call the Mids by their number like an array: M[0], M[1], ..., M[n-1].

Analyze the number of contests that your algorithm performs in the worst case (NOT the number of primitive operations, just the number of contests).
Is your algorithm asymptotically optimal? Say why or why not. (Hint: there is an asymptotically optimal algorithm for this problem!)
Ultra bonus: An algorithm is exactly optimal (not just asymptotically optimal) if the number of contests is exactly equal to the lower bound, for all values of \(n\). You need to either develop an exactly optimal algorithm for the tug-of-war problem, or prove that no such algorithm could possibly exist.

3 Ethical Dilemma

There is a website called secretbook.com that offers free accounts to anyone in the world, and allows its members to send secret (encrypted) messages to each other. The way it works is, every user gets assigned a public/private key pair when they sign up for their free account, and the website publishes everybody's public key along with their username and public profile. Then any member can post encrypted messages and files with anyone else's public key, so only that person can read the message.

This website is used by all sorts of people, including political dissidents, underage kids, hackers, professors, and illegal file-sharers. The U.S. government has requested the founders of secretbook.com to let them read all the messages, presumably so they can prosecute people sharing illegal things, but the website owners have refused the request.

Here's the problem: you notice that every public key issued by the website has exactly the same modulus n. The website owners thought this was OK, since they only issue the public key \((e,n)\) and private key \((d,n)\) to its users, never the prime factors p and q of n or any other information. But you notice there is a vulnerability there...

What is the vulnerability? Describe how an attacker could do something that shouldn't be allowed, more easily than he would be able to do otherwise. Say exactly what can be done, and how easy it would be to do it.

Note: For full credit, there is an attack that can be performed in polynomial-time against this system, if you are clever enough! But you will still get most of the credit for any vulnerability you find, even if it would still require a great deal of computing power.
Regardless of whether you figure out the vulnerability for part (a), take my word that there is one that would allow you to read anyone's encrypted messages on the site. And you seem to have been the first person to notice this!

Now the question is what to do with this information. You have a few options:
- Tell everybody. Publish the vulnerability on the web so that anybody can exploit it. Hopefully the website owners are quick!
- Tell the website owners. Send a private message to the owners of the website that you have noticed this vulnerability. Maybe you give them an ultimatum (like "fix this or I'll tell more people"), or maybe you kindly request they pay you, or maybe you let them decide.
- Don't tell anybody. Keep the vulnerability to yourself, hoping no one else notices. Maybe you read some messages yourself, or maybe you don't. But no one else will know unless they figure it out too.
- Tell the government. This leaves the U.S. Government to decide which of the above options to take.
Keep in mind that if you tell anyone, it leaves the possibility that they will use this information in a way that you find unethical. For example, some of the scientists that worked on the Manhattan project were not happy that their work was used to develop a weapon of mass destruction.

Assignment: Decide, as a group, what you think would be the best option to take. Then you need to convince me, using good arguments that draw on your technical knowledge, that your option is the most ethical choice.

I do not care which option you pick, only that you come up with some solid arguments to defend your option. The only thing you can't do is weasel out of making a commitment, like "if such-and-such then I would do X, but otherwise Y, or maybe Z". Based on the information provided in this made-up scenario, make a decision and justify it.

You might want to refer to some current or historical events or laws. If so, make sure you have the proper references written down.