- Outline the steps of a data science analysis
- Understand the software requirements for the course
8 January 2025
This content draws upon material developed by Jeff Leek & Roger Peng for their Advanced Data Science Course at the Johns Hopkins Bloomberg School of Public Health.
What makes a good question?
Data sources (in order of confidence):
This is usually the most time-intensive step
It’s also the most important & the hardest to teach
Do the data meet your expectations?
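One way to make this question concrete is to write your expectations down as explicit checks before exploring further. The sketch below is only an illustration: the records, the field names (`age`, `height_cm`), the plausible ranges, and the helper `check_expectations` are all hypothetical, and Python is used here purely as an example language.

```python
# Hypothetical records; field names and values are illustrative assumptions.
records = [
    {"id": 1, "age": 34, "height_cm": 172.0},
    {"id": 2, "age": None, "height_cm": 165.5},
    {"id": 3, "age": 212, "height_cm": 180.2},
]


def check_expectations(rows, expected_rows, bounds):
    """Return a list of human-readable problems found in rows.

    bounds maps a field name to an inclusive (low, high) plausible range.
    """
    problems = []
    # Expectation 1: we received as many rows as we thought we would.
    if len(rows) != expected_rows:
        problems.append(f"expected {expected_rows} rows, found {len(rows)}")
    # Expectation 2: every bounded field is present and within its range.
    for row in rows:
        for field, (low, high) in bounds.items():
            value = row.get(field)
            if value is None:
                problems.append(f"id {row['id']}: {field} is missing")
            elif not low <= value <= high:
                problems.append(
                    f"id {row['id']}: {field}={value} outside [{low}, {high}]"
                )
    return problems


problems = check_expectations(
    records,
    expected_rows=3,
    bounds={"age": (0, 120), "height_cm": (50, 250)},
)
for problem in problems:
    print(problem)
```

Each reported problem is a prompt to go back to the data source rather than an automatic reason to drop a row.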
Start with first principles
Who’s your audience? What’s your medium?
Consider these real scenarios
The goal is to explore the data
This is your “sandbox”
Soften the edges of the raw results
A refined analysis and presentation
Here are some useful questions to ask yourself
We’ll learn about research compendia and style conventions for coding and naming