SI 486I Spring 2022 / Notes


This is the archived website of SI 486I from the Spring 2022 semester. Feel free to browse around; you may also find more recent offerings at my teaching page.

Unit 3: Cryptographic hashing

1 Summary and Goals

We will look more closely at the most important cryptographic tool to make blockchains work: hashing. Hashing is easy to describe at a very high level, but carries with it many seeming paradoxes. (Hashes are unique, yet collisions must exist? Hash digests are random, yet predictable?)

You probably have a basic idea of what Bitcoin is supposed to be, a kind of digital cash that can be bought, sold, and exchanged entirely online outside any one government’s control. But what does that actually look like at a low level, and what were the big challenges that needed to be solved?

We will spend many weeks of this class understanding these questions in detail, but the goal of this unit is to give you a high-level picture of how the Bitcoin network operates and what a blockchain is supposed to do.

2 Required reading

  1. Chapter 1 intro and section 1.1 of Narayanan et al, which is pages 23-30 of this pdf. You can skip the parts about SHA256 at the end.
  2. Keccak technical description

The book chapter provides a good overview of what a hash function is and (more importantly) what properties we need for cryptocurrencies in particular. So even if you already know about hash functions, this is still important to get into.

The second reading is about Keccak, a family of hash functions that “won” the NIST competition to define SHA3. In our labs, we will use the SHA3-256 hash function, which is a version of Keccak. The link is to a web page describing the different phases of the hash algorithm with some pseudocode.

3 Take-aways

Here are some questions you should be able to answer after doing this reading:

  • What is the difference between a cryptographic hash function, and the “regular” hash functions you might use in something like Java’s HashMap?
  • What are the three main security properties needed of a hash function for blockchains?
  • Suppose someone proves that the SHA3-256 hash function has a collision (but doesn’t say what the collision is). Does this mean that SHA3-256 should no longer be considered secure?
  • How big are the “lanes” used in SHA3-256 (which is Keccak with b=1600)? How many lanes are there?
  • What kinds of bitwise operations are used inside the Keccak algorithm to mix up the data?