What is data science?
6 January 2025
If at first you don’t succeed,
call it version 1.0!
Math and statistics
Computer science
Knowledge about the system
Communication
Leek & Peng (2015) Science
Descriptive analysis: summarizes the information in a single data set without further interpretation
ex: the US Census
Exploratory analysis: searches for trends, correlations, and other relationships among multiple variables
ex: “data mining”
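Searching for correlations among multiple variables can be sketched in a few lines. The example below is a minimal illustration, not a prescribed workflow: the variable names and values in `observations` are made up for demonstration, and `pearson` and `pairwise_correlations` are hypothetical helper names.

```python
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def pairwise_correlations(table):
    """Correlation for every pair of columns in a dict of equal-length lists."""
    return {
        (a, b): pearson(table[a], table[b])
        for a, b in combinations(table, 2)
    }


# Made-up values for illustration only (not real measurements).
observations = {
    "temperature": [10.0, 12.0, 15.0, 18.0, 20.0, 22.0],
    "rainfall": [8.0, 7.5, 5.0, 4.0, 2.5, 3.0],
    "visitors": [30.0, 34.0, 41.0, 50.0, 55.0, 58.0],
}

for pair, r in pairwise_correlations(observations).items():
    print(pair, round(r, 2))
```

A scan like this only flags candidate relationships; it does not say whether any of them generalize or are causal.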
Inferential analysis: generalizes to the population from a sample
ex: wearing a mask reduces the spread of COVID
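One common way to generalize from a sample to a population is to report an interval estimate rather than a single number. The sketch below computes an approximate 95% confidence interval for a population mean. It assumes a normal approximation (for small samples a t-distribution would be the more usual choice), the sample values are made up, and `mean_confidence_interval` is a hypothetical helper name.

```python
import statistics


def mean_confidence_interval(sample, level=0.95):
    """Approximate confidence interval for the population mean.

    Assumption: uses the normal approximation, which is reasonable
    for moderately large samples.
    """
    n = len(sample)
    mean = statistics.mean(sample)
    # Standard error of the mean from the sample standard deviation.
    standard_error = statistics.stdev(sample) / n ** 0.5
    # Two-sided critical value from the standard normal distribution.
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    return mean - z * standard_error, mean + z * standard_error


# Made-up sample for illustration only (not real measurements).
sample = [4.1, 3.8, 5.0, 4.6, 4.3, 3.9, 4.8, 4.4, 4.0, 4.7]
low, high = mean_confidence_interval(sample)
print(f"sample mean = {statistics.mean(sample):.2f}")
print(f"approximate 95% interval for the population mean: ({low:.2f}, {high:.2f})")
```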
Predictive analysis: uses a subset of the data to parameterize a model and predicts out-of-sample data
ex: using weather to predict recreational fishing effort
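The idea of fitting on a subset and predicting out-of-sample data can be sketched as follows. This is a minimal illustration under stated assumptions: a straight-line model fitted by ordinary least squares, made-up temperature and fishing-trip values, an arbitrary choice of which observations to hold out, and hypothetical helper names (`fit_line`, `predict`, `rmse`).

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Predicted responses for new predictor values."""
    return [intercept + slope * xi for xi in x]


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    n = len(observed)
    return (sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n) ** 0.5


# Made-up values for illustration only (not real data).
temperature = [10.0, 12.0, 15.0, 18.0, 20.0, 22.0, 25.0, 28.0]
trips = [22.0, 24.0, 32.0, 36.0, 42.0, 44.0, 52.0, 56.0]

# Hold out two observations; the model never sees them during fitting.
held_out = {2, 5}
train_x = [t for i, t in enumerate(temperature) if i not in held_out]
train_y = [f for i, f in enumerate(trips) if i not in held_out]
test_x = [t for i, t in enumerate(temperature) if i in held_out]
test_y = [f for i, f in enumerate(trips) if i in held_out]

slope, intercept = fit_line(train_x, train_y)
test_error = rmse(test_y, predict(test_x, slope, intercept))
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"out-of-sample RMSE = {test_error:.2f}")
```

The error is measured only on the held-out observations, which is what distinguishes a predictive assessment from a description of how well the model fits the data it was trained on.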
Causal analysis: seeks to understand the average magnitude and direction of a response
ex: experimental analysis of temperature effects on oyster growth
Mechanistic analysis: seeks to understand the specific magnitude and direction of relationships
ex: engineering studies
Although you may have spent weeks or months on an analysis, most people want only the bottom line
The key is weaving everything into a nice story to achieve both trust and belief
Do you accept the analysis per se?
Were the data analyzed properly and thoroughly?
Trust is specific to the analysis/analyst (internal)
Do you believe the analysis?
Belief is more broadly related to previous work and other factors (external)
The things you did and presented
The things you did but did not present
The things you did not do
The trick is to strike a balance to achieve trust and belief