SD 212 Spring 2024 / Homeworks


hw25: Fortune counting

  • Due before the beginning of class on Friday, March 29

The Data

fortune is a classic Unix command-line program to display a random short message, joke, quote, or riddle from some collection of text files.

For this homework, you will be using the fortune data files that we have mirrored here:

Notice that each file contains multiple “fortunes”. A single fortune can span several lines, and consecutive fortunes are separated by a line which contains just a percent character %.
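To make the format concrete, here is a minimal sketch that counts the separator lines in a small piece of text. The sample text and the function name count_separators are made up for illustration and do not come from the real data files. The sketch assumes every fortune, including the last one, is followed by a % line.

```python
# Hypothetical sample text in the fortune format (not from the real data files).
SAMPLE = """\
A short fortune.
%
A fortune that spans
two lines.
%
100% of this line is text, so it is not a separator.
%
"""

def count_separators(text):
    """Count the lines that consist of just a % character."""
    return sum(1 for line in text.splitlines() if line.rstrip() == '%')

print(count_separators(SAMPLE))
```

Note that a line which merely contains a % somewhere inside it does not count; only a line that is exactly % (ignoring trailing whitespace) is treated as a separator.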

Starter code to download and loop through one file

Here is a program which looks at one fortune file, mario.arteascii, and displays the first fortune in that file. It works by reading lines and printing them out until the first time it sees a line with just a % character.

import requests

# suppress warning about insecure https
requests.packages.urllib3.disable_warnings()

resp = requests.get('http://faculty.cs.usna.edu/~roche/212/fortunes/mario.arteascii', verify=False)
resp.encoding = 'utf-8'
for line in resp.iter_lines(decode_unicode=True):
    if line.rstrip() == '%':
        # stop at the end of the first fortune
        break
    print(line)

If you download and run this as-is, you should see a nice family.

Your task

Write a multi-threaded Python program count-fortunes.py which downloads fortune files, prints how many fortunes are in each file, and then prints a grand TOTAL for how many fortunes are in all the files put together.

Your program just needs to look at these 11 fortune files, all available from the URL above:

fnames = ['love', 'magic', 'news', 'education', 'art', 'food',
          'riddles', 'science', 'cookie', 'songs-poems', 'work']

The output should look like this:

food 198
magic 30
news 53
love 150
art 465
science 625
cookie 1133
work 630
education 203
riddles 128
songs-poems 720
TOTAL 4335

When we say “look like”, we mean that the lines for the individual files may print in a different order each time you run the program because of multi-threading, but the number for each file and the TOTAL at the end should always be the same.

Suggestions

Here is one way to approach solving this:

  • Start with the starter code that downloads one fortune file.
  • Change the file name to one of the smaller input files listed, like magic.
  • Now modify the loop so that instead of printing the first fortune, it counts how many lines have a % only. Run it and check the answer!
  • Next, put this logic into a function that takes a fortune file name and prints out a line with the name and the count for that file.
  • Now make a loop to run the function for all the file names given.
  • Add in the logic to get a running total by using a global variable which is updated in each function call. Print this out at the end of the whole program.
  • Now your program should work and get the correct output, but it will be slow and single-threaded. So make it multi-threaded! Put each function call in its own Thread, start() them running, and print out the sum after join()ing all of the running threads.
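The last step can be sketched roughly as follows. This is only an illustration of the Thread / start() / join() pattern, not a full solution: it uses made-up in-memory text (FAKE_FILES) in place of downloaded files, and the names count_file and total_lock are hypothetical. The lock around the update of the global total is an extra precaution that the steps above do not mention.

```python
import threading

# Made-up stand-in data so the sketch runs without any network access.
FAKE_FILES = {
    'alpha': 'one\n%\ntwo\n%\n',
    'beta': 'only\n%\n',
    'gamma': 'a\n%\nb\n%\nc\n%\n',
}

total = 0                      # global running total, shared by all threads
total_lock = threading.Lock()  # assumption: guard the shared update with a lock

def count_file(name):
    """Count the %-only lines for one file, print the result, update the total."""
    global total
    text = FAKE_FILES[name]
    count = sum(1 for line in text.splitlines() if line.rstrip() == '%')
    print(name, count)
    with total_lock:
        total += count

# One thread per file name.
threads = [threading.Thread(target=count_file, args=(name,))
           for name in FAKE_FILES]

for t in threads:
    t.start()

# Wait for every thread to finish before reading the total.
for t in threads:
    t.join()

print('TOTAL', total)
```

Because the TOTAL line is printed only after every join() has returned, it always comes last, even though the per-file lines may appear in any order.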

Submit command

To submit files for this homework, run one of these commands:

submit -c=sd212 -p=hw25 count-fortunes.py
club -csd212 -phw25 count-fortunes.py