This is the archived website of SI 486I from the Spring 2022 semester. Feel free to browse around; you may also find more recent offerings at my teaching page.

Lab 2: Centralized chat-chain

In last week’s lab, you wrote programs to start interacting with a blockchain that holds chat messages, running on a central server. This week we are going to extend that functionality in a few ways, most notably by exploring the entire chain instead of just the head node, and in turn verifying the hashes on the chain are correct.

Specifically, in this lab you will:

  • Expand the functionality of your show_chat.py program:
    • Recursively retrieve all blocks in the chain back to the genesis block
    • Verify correct hashes and block structure, displaying a useful error message if a discrepancy is found
    • If the blockchain verifies, print out the sequence of chat messages from the beginning
    • Add command-line options to connect to an arbitrary host and port number
  • Use your beefed-up show_chat.py to find errors in blockchains running on some hosts
  • Write two static blockchain servers (where the head never gets updated) that respond to head and fetch GET requests, and run these servers on your VM.
    • One of your servers should present a valid blockchain with at least 2 blocks
    • The other one should have some kind of error in there, challenging your peers to identify it with their validation code!

Reminders

If you didn’t finish Lab 1

Today’s lab is directly building on last week’s lab. If you didn’t finish that lab, you need to work on that first; there’s no point in moving on until you have the complete functionality of show_chat.py and send_chat.py from lab 1.

Good news: you can get plenty of help! Don’t let yourself fall behind. See that person next to you? They probably got it working. Bother them right now and make them help you get up to speed. And if that doesn’t work, there’s a person nearby who is paid handsomely to help you succeed in this class - ask your instructor!

What you need to do

To complete this lab, you need to submit the completed Google form and push a tag lab02 to your gitlab repo that is shared with your instructor.

Remember, you can submit (part of) the form and come back to it as many times as you like before the deadline, and of course you can and should push many small updates to your git repo as you make progress.

Part 1: Reading and verifying the entire chain

Validity requirements

Recall from lab 1 that a valid node in our blockchain, for now at least, must:

  • Contain a field version with integer value 0
  • Contain a field prev_hash with the (SHA3-256 hex string) hash of the previous block
  • Contain a field payload that is a dictionary.
  • If the payload contains a chat key, that must map to a string.

Note that we don’t specify what the payload needs to be besides that it has to be a dictionary — this is to allow for later enhancements. In particular, a payload that is an empty dictionary would technically be OK.

There are two more specifications which weren’t mentioned last week, but which are quite important:

  • A block whose prev_hash is an empty string is called the “genesis block” for that chain. This is valid (but only for one block in the chain!)
  • Every block (as a JSON encoded string) must be at most 1KB. That is, the string length can be at most 1024.
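The structural rules above can be sketched as a single checking function. This is only one possible shape, not the required design: the name check_block, the constant MAX_BLOCK_LEN, and the wording of the messages are made up for illustration. It checks one block in isolation, so it does not verify that prev_hash matches the previous block's hash or that only one genesis block exists.

```python
import json

MAX_BLOCK_LEN = 1024  # "at most 1KB", measured as string length per the rules above


def check_block(raw):
    """Return None if the JSON string `raw` is a structurally valid block,
    otherwise a short message saying which rule it breaks."""
    if len(raw) > MAX_BLOCK_LEN:
        return "block is longer than 1024 characters"
    try:
        block = json.loads(raw)
    except (ValueError, RecursionError):
        return "block is not valid JSON"
    if not isinstance(block, dict):
        return "block is not a JSON object"
    # bool is a subclass of int in Python, so rule it out explicitly
    version = block.get("version")
    if isinstance(version, bool) or not isinstance(version, int) or version != 0:
        return "version must be the integer 0"
    if not isinstance(block.get("prev_hash"), str):
        return "prev_hash must be a string"
    payload = block.get("payload")
    if not isinstance(payload, dict):
        return "payload must be a dictionary"
    if "chat" in payload and not isinstance(payload["chat"], str):
        return "chat must be a string"
    return None
```

Returning a message instead of True/False makes it easy to print a useful error that says what failed.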

Your tasks

Improve the show_chat.py program that you started last week so that it does the following.

(Note, don’t be fooled by the brevity of these steps. Each one will be some significant work, especially the first two! Work slowly and carefully, test repeatedly, and get in the habit of doing git commit/git push whenever you get something working.)

  1. Instead of only showing the most recent chat, have your program show the entire history of chat messages back to the genesis block, in order. So your program should print, on the terminal, the chat message associated with the genesis block, then the message for the next block, and so on until the most recent (head) message.

    (This will involve repeatedly making GET requests to fetch the previous node based on its hash value. Note that you will necessarily get the nodes in reverse order from how you eventually want to print them out. Figure out how to make it work!)

  2. Now add block verification as you fetch each block. You need to check the properties above are satisfied for every block in the chain.

    Whenever something fails to verify, on any block, your program should not print any chat messages; instead, it should print an error message saying what failed to verify.

  3. Add command-line arguments to your program so that it is possible to specify a hostname and port to connect to. So far, you have probably been connecting to cat on port 5000 - which is (I hope) a correct server implementation that should always verify. But try server horse on port 5001; this one is serving a blockchain that has an error and should fail to verify.

    I highly recommend using the built-in Python library argparse for this - it’s easy to use and auto-generates nice “help” messages.
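As a rough sketch of steps 1 and 3, the code below walks a chain backwards and then reverses it, and parses a host and port with argparse. The function names, the --host and --port flag names, and the defaults are assumptions chosen for illustration. The fetch argument stands in for whatever code you write to make the GET request to /fetch/HASH. Verification (step 2) is deliberately left out, so a misbehaving server could make this loop run for a long time.

```python
import argparse
import json


def parse_args(argv=None):
    """Parse the host and port; flag names and defaults are illustrative."""
    parser = argparse.ArgumentParser(description="Show the chat history on a chain")
    parser.add_argument("--host", default="cat", help="server hostname")
    parser.add_argument("--port", type=int, default=5000, help="server port")
    return parser.parse_args(argv)


def collect_chain(head_hash, fetch):
    """Walk from the head block back to the genesis block.

    `fetch` is any callable that maps a hash to that block's JSON string.
    Returns the decoded blocks ordered from genesis to head.
    """
    blocks = []
    current = head_hash
    while current != "":  # the genesis block has an empty prev_hash
        block = json.loads(fetch(current))
        blocks.append(block)
        current = block["prev_hash"]
    blocks.reverse()  # fetched newest-first, but we want oldest-first
    return blocks
```

Passing fetch in as a parameter means the same loop can be tried out against a plain dictionary of blocks before pointing it at a real server.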

Part 2: Checking on the neighbors

(Note, if you are very fast and think you have beat your peers to getting this far, maybe skip it for now and do part 3 first.)

The file hosts.txt contains the hostnames of all the student VMs (and possibly a few instructor VMs) for the class.

Your goal for this part is to write a program that checks for valid blockchain servers on each of these.

Error handling

Note, a lot of things can go wrong when you try to connect to these hosts on various ports and make a GET request to /head to get the hash of the head block. Here are a few things that can happen:

  • There might not be any server listening for connections at all
  • The server might return some HTTP error like a 404 not found. (Note, the success code for an HTTP request is 200.)
  • The string returned might not be a hex string
  • …or it might not be the correct length
  • …or a subsequent call to /fetch with that hash fails in some other way

In all these cases, your program should not crash; it should just calmly report that the server is not running, or that it is running and the verification failed for some reason.

To do this, you will need to use try/except blocks in Python. Here’s an example of making a GET request and printing a nice message if it doesn’t work.

import requests

try:
    r = requests.get('https://ascii.co.uk/art/money', timeout=5)
except requests.Timeout:
    print("Took more than 5 seconds and I got tired of waiting")
    exit(1)
except requests.RequestException:
    print("The connection didn't work for some other reason")
    exit(1)

print("It worked! Here's what came back:")
print(r.text)
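The try/except above only catches connection problems. The other failure cases (an HTTP error code, or a response that is not a hex string of the right length) need ordinary checks on the response. Here is a minimal sketch, assuming the head hash is a SHA3-256 hex digest (64 hex characters) and that surrounding whitespace such as a trailing newline should be tolerated; both helper names are invented for illustration. With the requests library you would pass in r.status_code and r.text.

```python
import string

DIGEST_LEN = 64  # a SHA3-256 digest is 32 bytes, i.e. 64 hex characters


def looks_like_digest(text):
    """True if `text` has the shape of a SHA3-256 hex digest."""
    return len(text) == DIGEST_LEN and all(c in string.hexdigits for c in text)


def usable_head(status_code, body):
    """True if a /head response succeeded and its body looks like a digest.

    Stripping whitespace is an assumption; drop it to be stricter.
    """
    return status_code == 200 and looks_like_digest(body.strip())
```

Note that string.hexdigits accepts upper-case letters too; compare against "0123456789abcdef" instead if you want lower-case only.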

Suggestion: separating common functionality

You will probably have a lot of the verification code in common between check_peers.py for this step and show_chat.py from the last step. Rather than copying code between these files and potentially making a debugging/maintenance nightmare for yourself later if you need to fix something, put all those functions and class definitions in a common shared module like goats.py. Then you can just write something like

from goats import *

at the top of your check_peers.py and show_chat.py to be able to use those common functions and avoid code duplication. Yay!

Your task

Create a new program check_peers.py that does the following:

  1. Open the hosts file you downloaded, and repeat the following steps for each hostname in this file, and for ports 5000 and 5001 on each host

  2. Try to connect to the current hostname and port and download and verify the blockchain hosted there with a series of GET requests to /head and /fetch/HASH.

    (This is similar to your Part 1 except that you don’t need to print out the chat messages.)

  3. No matter what, your program should not crash and should print out a simple message on the terminal indicating whether:

    • There doesn’t appear to be any server running on that host/port (i.e., the GET request to /head didn’t work)
    • Some GET requests work, but the entire blockchain couldn’t be fetched (like maybe a later request to /fetch/HASH failed)
    • You fetched all the blocks, but they failed verification
    • Everything worked and was verified to be a valid block chain

    In the last case, your program should also print out how many blocks are in the chain.
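One way to organize the four outcomes above is to have your own fetching and verifying code raise distinct exceptions, and to turn those into messages in one place. The sketch below is only an outline under that assumption: FetchFailed, VerifyFailed, targets, check_one, and the get_head and get_chain callables are all hypothetical names, and the real network and verification logic would live in the functions you pass in.

```python
PORTS = (5000, 5001)


class FetchFailed(Exception):
    """Hypothetical: raised by your own code when a GET request fails."""


class VerifyFailed(Exception):
    """Hypothetical: raised by your own code when a block fails to verify."""


def targets(lines, ports=PORTS):
    """Yield (host, port) for every non-blank hostname line and every port."""
    for line in lines:
        host = line.strip()
        if host:
            for port in ports:
                yield host, port


def check_one(host, port, get_head, get_chain):
    """Return a one-line status message for a single host and port.

    get_head(host, port) returns the head hash or raises FetchFailed.
    get_chain(host, port, head) returns the list of verified blocks, or
    raises FetchFailed or VerifyFailed.
    """
    try:
        head = get_head(host, port)
    except FetchFailed:
        return "no server running"
    try:
        blocks = get_chain(host, port, head)
    except FetchFailed:
        return "could not fetch the whole chain"
    except VerifyFailed as err:
        return f"verification failed: {err}"
    return f"valid chain with {len(blocks)} blocks"
```

A main loop could then open hosts.txt, iterate over targets(f), and print the host, port, and the message from check_one for each pair.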

Part 3: Static blockchain servers using Flask

Note: It is OK if you complete this part during next week’s lab.

Now you will host two static blockchains on your VM on ports 5000 and 5001. Both should have at least two blocks; one of them should be completely valid, and the other should have some error in the blockchain.

Flask

We will use Flask to create simple webservers in our series of labs. Flask is a Python implementation of a simple webserver that makes it very easy to turn a small Python program into a running webserver that handles GET and POST requests dynamically.

To help jump-start you, here is complete code for a static webserver that handles GET requests to /head and /fetch/HASH, where HASH can be any string in the URL:

#!/usr/bin/env python3

import flask

app = flask.Flask(__name__)

@app.route('/head')
def head():
    return 'ffeda54e5de93487b29fa44b253ecbac7c3a295578557d44dcb520d1794339fd'

@app.route('/fetch/<digest>')
def fetch(digest):
    if digest == 'ffeda54e5de93487b29fa44b253ecbac7c3a295578557d44dcb520d1794339fd':
        return '{"prev_hash": "", "payload": {"chat": "First!!!!1111 - Dr Roche"}, "version": 0}'
    else:
        # HTTP code 400 indicates a bad request error
        return 'hash digest not found', 400

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=1234)

In fact, this is a complete and working static blockchain server with just one genesis block. Start here and expand! As you can see, the special @app.route indication before a function tells Flask that function will be used to answer GET queries at the specified URL on the server.

The last part says that, if we execute this python program from the command line, it will actually run a webserver on port 1234, accepting connections from anywhere.
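To expand beyond one block, it may help to generate the blocks and their hashes in Python rather than by hand. The following is a sketch under one important assumption: that a block's hash is the SHA3-256 hex digest of the exact JSON string the server returns, encoded as UTF-8. The helper names block_hash and build_chain are made up for illustration.

```python
import hashlib
import json


def block_hash(raw):
    """SHA3-256 hex digest of a block's JSON string (assumed UTF-8 encoded)."""
    return hashlib.sha3_256(raw.encode("utf-8")).hexdigest()


def build_chain(messages):
    """Build a chain with one chat block per message, oldest first.

    Returns (head_hash, blocks), where blocks maps each hash to the JSON
    string of the block with that hash.
    """
    blocks = {}
    prev = ""  # an empty prev_hash marks the genesis block
    for text in messages:
        raw = json.dumps({"prev_hash": prev, "payload": {"chat": text}, "version": 0})
        prev = block_hash(raw)
        blocks[prev] = raw
    return prev, blocks
```

With head_hash and blocks computed once at startup, the head function could return head_hash, and the fetch function could return blocks[digest] when digest is in blocks and the 400 error otherwise. The server must return the stored string unchanged, since re-encoding the JSON could change the hash.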

Running the webserver

Let’s say you store the example above in a file called genesis.py. Then you can actually run the webserver (on port 1234) by typing

python3 genesis.py

(Or chmod to make the file executable and just run it.)

While running, it will show you when someone connects and what page they are trying to GET. Cool!

You should be able to use your show_chat.py program from before, in a separate terminal window, specifying host localhost and port 1234. That program should connect, get the head hash, fetch that block, and display it. Try it!

Back in the original terminal window, type Ctrl-C to kill the webserver.

Keeping the webserver running

It would be awfully nice for you to keep your server running even after you log out for the day. But it’s not obvious how to make this happen, since you will obviously have to close the terminal window where you are working!

There are at least two simple ways to do this:

  • Use tmux. This is a “terminal multiplexer” program that is really useful to make a virtual terminal session that keeps going even after you log out. With this program, you can start work on one computer (logging in via SSH or x2go), start a tmux terminal, and then log out. Later, you can log in from a totally different computer, restart tmux, and pick up right where you left off.

    There are many many tmux commands but here are a few of the most useful:

    • tmux new: Start a new tmux terminal session
    • tmux neww: Open up a new window within that session (like a terminal tab). You should be able to see all the open windows at the bottom of the screen.
    • Ctrl-B n: type this sequence to move to the next window within a session
    • Ctrl-B p: move to the previous window within a session
    • Ctrl-B d: “detach” the session but leave it running
    • tmux ls: List currently running sessions
    • tmux attach: Re-open a previously “detached” session
    • exit: You can exit the session windows as normal to kill it off.

    Once you are in a tmux session window, you can run the webserver as above, then detach from that session, and the webserver will keep on running until the computer is powered down. You can re-attach to the session any time to see how it’s going, stop the server, whatever.

  • Another way to keep your server running is to use nohup. Instead of running your webserver like normal, preface the command with nohup, like

    nohup python3 genesis.py >genesis_log.txt &

    Notice, we also redirected the output to a log file and used & to tell the process to go into the background. If you close the terminal, this program will keep running - great!

    The trouble is that at some point you want to stop the program! Then you have to first find the process PID and then call kill on that. Here’s an example doing this:

    $ nohup python3 genesis.py >genesis_log.txt &
    [1] 1144028
    
    $ ps -ef | grep "python3 genesis.py"
    roche    1144028 1060969  0 00:51 pts/42   00:00:00 python3 genesis.py
    roche    1144193 1060969  0 00:53 pts/42   00:00:00 grep --color=auto python3 genesis.py
    
    $ kill 1144028
    [1]+  Terminated              nohup python3 genesis.py > genesis_log.txt

    Notice that the first command actually tells us the PID. But, since you are likely to close the terminal window and forget it, you can find it again with the ps command as shown.

Your task

Create two programs static_good.py and static_bad.py which host a verified-correct and an incorrect blockchain, respectively. You should set these up to run on ports 5000 and 5001, but in either order. That is, you should decide randomly whether the “good” one runs on port 5000 or 5001.

Run your servers and leave them going until at least next week. Use your show_chat.py program to check that both servers are operating correctly.