This is the archived website of SD 212 from the Spring 2023 semester. Feel free to browse around; you may also find more recent offerings at my teaching page.

6-Week Exam Information

Format

The exam is designed as a 50-minute exam but you may use as much of the 2-hour lab period as needed.

This will be a written exam with no computers, calculators, etc. allowed.

The only allowed aid is one study sheet. Here are the restrictions/requirements on this study sheet:

  • One side of a single letter-sized piece of paper
  • Hand-written and prepared by you.
  • Write your name clearly on the top of the sheet. It will be handed in along with your exam (and returned to you later).

The time you take to look over your notes, think about what to write down, and create your study sheet is very valuable in studying and preparing for the exam, probably more valuable than the actual info will be to you during the exam time. For this reason, simply copying what someone else has on their sheet is probably a waste of your time.

Coverage

The 6-week exam will cover units 1–5.

The best things to review are:

  • Your own notes from class
  • The readings and examples from each unit
  • The homework assignments so far. (These are probably the best guide for the kinds of problems you can expect to see on the exam!)

A brief summary (not exhaustive!) of topics and concepts that may appear on the exam is as follows. I have intentionally grouped the concepts here slightly differently than in the notes, to make sure that you have multiple ways to review and think about the material without missing anything important.

  • Python concepts (some of this is a review from SD211!)

    • Variables
    • If and if/else statements
    • While and for loops
    • Reading files
    • Functions
    • Classes
    • Importing libraries
    • Using csv.DictReader to read a csv file line by line
    • Using Pandas to read a csv file into a dataframe
    • Filtering rows of a Pandas dataframe according to some conditions
    • Selecting and creating columns of a Pandas dataframe
    • Using the re library, particularly the findall, search, and sub methods
    • Using the r'literal' syntax to create “raw” strings where backslashes are treated literally
    • Exceptions and the try/except syntax
  • Bash concepts

    • The structure of a command (command name, options, arguments)
    • Navigating files and directories with cd, ls, pwd
    • Whole-file operations mv, cp, touch, mkdir, and rm
    • Inspecting files with cat, head, tail, and wc
    • Piping between commands with |
    • Redirecting input and output with <, >, and >>
    • Filtering row-wise and column-wise with grep and cut
    • Organizing row-wise with sort and uniq
    • Modifying files with sed and tr
    • Searching and performing bulk operations with find
    • Globbing with wildcards * to operate on multiple files
    • Variables
    • for loops to run some commands on multiple files
    • Exit status
    • if statements, especially the common “if/grep” construct
  • Concepts not tied to a particular programming language

    • Statistical data types: categorical vs numerical, continuous vs discrete, and ordinal vs nominal

    • Regular expressions to match lines of text

      • Literal characters
      • Universal character .
      • Anchors and boundaries ^, $, \b
      • Repeating modifiers like *, +, ?, {4}, {3,6}
      • Character classes like [1-9abc]
      • Negated character classes like [^A-Z .]
      • Alternation with |
      • Grouping with ()
    • Regular expressions to modify text with backreferences like \1

    • What it means to “handle” an error in a program or script
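
As one possible way to review several of the topics above together, here is a small Python sketch that combines csv.DictReader, raw strings, the findall, search, and sub methods of the re library, a backreference, and try/except. It is only an illustration and not taken from the course notes or any exam problem: the sample data and the function names (area_codes, reformat_phone, last_four) are made up, and an in-memory string stands in for a real csv file.

```python
import csv
import io
import re

# Hypothetical data standing in for a csv file on disk.
CSV_TEXT = "name,phone\nAda,410-555-0101\nGrace,not available\n"

def area_codes(text):
    """Return the 3-digit area code of every phone number found in text."""
    # The raw string keeps the backslashes in \b and \d literal.
    # With one group in the pattern, findall returns just that group.
    return re.findall(r'\b(\d{3})-\d{3}-\d{4}\b', text)

def reformat_phone(phone):
    """Rewrite 410-555-0101 as (410) 555-0101 using backreferences."""
    # \1, \2, \3 refer back to the three parenthesized groups.
    # If the pattern does not match, sub returns the string unchanged.
    return re.sub(r'^(\d{3})-(\d{3})-(\d{4})$', r'(\1) \2-\3', phone)

def last_four(phone):
    """Return the last four digits as a string, or None if there are none."""
    try:
        # search returns None when nothing matches, so calling .group
        # on that result raises AttributeError.
        return re.search(r'-(\d{4})$', phone).group(1)
    except AttributeError:
        # "Handling" the error: recover with a sensible value
        # instead of letting the program crash.
        return None

# DictReader yields one dictionary per row, keyed by the header line.
rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
for row in rows:
    print(row['name'], reformat_phone(row['phone']), last_four(row['phone']))
```

A good study exercise is to predict what each function returns for both rows before running the code, and to explain why the second row does not crash the program.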