Lab 4: Info Challenge
1 Overview
We will be participating in the UMD Info Challenge, in groups of 3 or 4.
Each team will focus on a single dataset from a real provider, and spend the week understanding, cleaning, and analyzing that data and building an effective presentation from it.
1.1 Learning goals
- Practice the full data science pipeline: acquisition, storage,
processing/cleaning, analysis, and visualization/communication
- Work on a real data set towards the goals of industrial or
government organizations
- Work within a team
2 Schedule and structure
(Dates and deadlines in bold)
Early February: Choose teams
Wednesday February 22: choose dataset
Friday February 23: Comp day (no class)
Saturday February 24 at 10am in Hopper Hall: Kickoff day, at USNA
Meet mentors, learn about datasets, and get started
Monday February 26: Work on IC during class
Tuesday February 27: Work on IC during lab
Wednesday February 28: Work on IC during class
Thursday February 29 at noon: 250-word abstracts due to IC judges. Submit your abstract here
Friday March 1: Comp day (no class)
Saturday March 2 at 9am: Project files and presentations uploaded to GitHub and link submitted: URL submission form
Saturday March 2: Travel to UMD for final presentations and awards
Monday March 4 at 2359: SD212 submission due
Tuesday March 5: Presentations and recap during lab
3 Helpful documents
- Information for IC participants
- Discord documentation
- Kickoff day schedule
- Judging day schedule
- Judging criteria
4 Discord
A lot of useful information on the event will be available via Discord.
You should sign up for a (free) account if you don’t have one already, and then follow this link to join the Discord server for IC24.
You can download the Discord app on any of your devices, or use the web browser interface.
When you first log in, you have to give your USNA email address to a bot, which should then give you access to the IC24 rooms as a participant.
5 GitHub
You should definitely make a GitHub repo to do your work on the Info Challenge! Look back at your notes from our recent unit in class and related homeworks if you need a reminder of how to do that.
You should have just one GitHub repo per team. Once a single team member creates their GitHub repo for the info challenge, they can invite their teammates to it (as well as their instructor).
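As a reminder of how sharing one repo between teammates tends to work, here is a minimal sketch that runs entirely offline. It assumes git is installed, and every name in it is a placeholder: the bare repository shared.git stands in for your team's GitHub repo, and the alice and bob directories stand in for two teammates' laptops. With a real GitHub repo you would clone its URL instead of a local path.

```shell
#!/bin/sh
set -e

# Work in a scratch directory so nothing touches your real files.
work=$(mktemp -d)
cd "$work"

# Hypothetical stand-in for the team's single GitHub repo.
git init --quiet --bare shared.git

# Teammate A clones the (empty) repo, adds a README, and pushes it.
git clone --quiet shared.git alice 2>/dev/null
cd alice
git config user.name "Teammate A"
git config user.email "teammate.a@example.com"
echo "# Info Challenge project" > README.md
git add README.md
git commit --quiet -m "Add README"
git push --quiet -u origin HEAD
cd ..

# Teammate B clones the same repo, adds a script, and pushes it.
git clone --quiet shared.git bob
cd bob
git config user.name "Teammate B"
git config user.email "teammate.b@example.com"
echo 'print("analysis goes here")' > analysis.py
git add analysis.py
git commit --quiet -m "Add analysis script"
git push --quiet
cd ..

# Teammate A pulls to pick up B's work before continuing.
cd alice
git pull --quiet --ff-only
git log --oneline
```

Pulling before you start working and pushing small commits often is one way to keep teammates from overwriting each other's changes during the week.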
6 Grading
Your work will be judged by the IC judges for prize consideration. It will also count as a lab grade for SD212, independently of the IC contest judging.
Your SD212 grade has two components:
80%: Info Challenge judging rubric, as scored by your instructor based on:
- Your answers to the questions in the markdown file (below)
- Your code uploaded to GitHub
- Your presentation during lab time
20%: Individual teamwork score based on teamwork rubric completed by all group members.
Your grade may be reduced by up to 25% for failure to follow instructions and meet required deadlines.
7 Questions
Please answer these questions and have one team member (only) submit them prior to the SD212 submission deadline.
Who are your team’s members?
Enter the URL of the GitHub repository that contains your code and presentation materials.
Briefly describe the file organization in your GitHub repository. Where is your presentation? Where is your code and what does it do?
Say a few words about how your team worked together. Who took on the role of “team manager”? How did you organize and share your work?
For the Info Challenge project, what outside data source(s) did you incorporate?
What did you have to do to clean and process the data?
(Include the provided datasets as well as any outside data that you found in your discussion. Just a few sentences giving the overall idea is fine.)
What did you do to analyze the data?
(Again, just a few sentences with an overall description is good.)
How did you create visualizations of your analysis?
What concrete recommendations or conclusions did you make?
What tips and suggestions do you have for next year’s Info Challenge participants?
7.1 Markdown file to fill in
Here is the file with questions to fill in and submit for today’s lab: lab04.md
You can run this wget command to download the blank md file directly from the command line:
wget "https://roche.work/courses/s24sd212/lab/md/lab04.md"
7.2 Submit your work
Submit the Markdown file (with the GitHub link to all of your work):
submit -c=sd212 -p=lab04 lab04.md
or
club -csd212 -plab04 lab04.md
or use the web interface