Lab 4: Info Challenge
1 Overview
We will be participating in the Info Challenge hosted at UMD, in groups of 3 or 4.
Each team will focus on a single data set from a real provider, and spend the week understanding, cleaning, and analyzing that data, then building an effective presentation from it.
1.1 Learning goals
- Practice the full data science pipeline: acquisition, storage,
processing/cleaning, analysis, and visualization/communication
- Work on a real data set towards the goals of industrial or
government organizations
- Work within a team
2 Schedule and structure
(Submission deadlines in bold)
Early February: Choose teams
Friday February 24: Comp day (no class)
Saturday February 25: Kickoff day, at USNA
Meet mentors, learn about datasets, and get started
Monday February 27: Work on IC during class
Tuesday February 28: Work on IC during lab
Wednesday March 1: Work on IC during class
Thursday March 2 at noon: 250-word abstracts due to IC judges
Friday March 3: Comp day (no class)
Saturday March 4 at 9am: Project files and presentations uploaded to GitHub and link submitted: instructions and URL submission form
Saturday March 4: Travel to UMD for final presentations and awards
Monday March 6 at 2359: SD212 submission due
Tuesday March 7: Presentations and recap during lab
3 Grading
Your work will be judged by the IC judges for prize consideration. It will also count as a lab grade for SD212, independently of the IC contest judging.
Your SD212 grade is made up of two parts:
80%: Info Challenge judging rubric, as scored by your instructor based on:
- Your answers to the questions in the markdown file (below)
- Your code uploaded to GitHub
- Your presentation during lab time
20%: Individual teamwork score based on teamwork rubric completed by all group members.
Your grade may be reduced by up to 25% for failure to follow instructions or meet required deadlines.
4 Questions
Please answer these questions and have one team member (only) submit them prior to the SD212 submission deadline.
Who are your team’s members?
Enter the URL of the GitHub repository that contains your code and presentation materials.
Briefly describe the file organization in your GitHub repository. Where is your presentation? Where is your code and what does it do?
Say a few words about how your team worked together. Who took on the role of “team manager”? How did you organize and share your work?
For the Info Challenge project, what outside data source(s) did you incorporate?
What did you have to do to clean and process the data?
(Include the provided datasets as well as any outside data that you found in your discussion. Just a few sentences giving the overall idea is fine.)
What did you do to analyze the data?
(Again, just a few sentences with an overall description is good.)
How did you create visualizations of your analysis?
What concrete recommendations or conclusions did you make?
What tips and suggestions do you have for next year’s Info Challenge participants?
4.1 Markdown file to fill in
Here is the file with questions to fill in and submit for today’s lab: lab04.md
You can run this wget command to download the blank md file directly from the command line:
wget "https://roche.work/courses/s23sd212/lab/md/lab04.md"
4.2 Submit your work
Submit the markdown file (with the GitHub link to all of your work):
submit -c=sd212 -p=lab04 lab04.md
or
club -csd212 -plab04 lab04.md
or use the web interface