The Statistics You Actually Need for Data Science
You don't need a graduate degree in statistics to be useful with data. You need a tight set of concepts you truly understand and can apply.
The 12 essentials
- Mean, median, mode — and when each one lies to you.
- Variance and standard deviation — measuring spread.
- Distributions: normal, log-normal, Poisson, power-law.
- Central Limit Theorem — why sample means behave nicely.
- Sampling and sampling bias — the silent killer of analyses.
- Confidence intervals — uncertainty made explicit.
- Hypothesis testing — t-tests, chi-square, and how to read a p-value.
- A/B testing — the applied version of hypothesis testing.
- Correlation vs causation — the eternal warning.
- Linear regression — the foundation of most predictive modeling.
- Logistic regression — and the meaning of odds ratios.
- Bayes' theorem — updating beliefs as new data arrives.
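To make the first item on the list concrete, here is a minimal sketch of how a single outlier pulls the mean away from the median. The salary figures are made up for illustration, and the example assumes nothing beyond Python's standard `statistics` module.

```python
import statistics

# Hypothetical annual salaries in thousands; the last value is an outlier.
salaries = [40, 42, 45, 47, 50, 52, 55, 400]

mean_salary = statistics.mean(salaries)      # pulled upward by the outlier
median_salary = statistics.median(salaries)  # midpoint of the sorted values

print(f"mean:   {mean_salary}")
print(f"median: {median_salary}")
```

Seven of the eight values sit below the mean, so for skewed data like this the median is the more honest summary of a "typical" salary.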
A common pitfall
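A p-value of 0.04 doesn't mean 'there's a 96% chance the effect is real'. It means: 'if there were no effect, we'd see data at least this extreme 4% of the time'. It is a statement about the data under the assumption of no effect, not about the probability that the effect exists. Misreading p-values is one of the most common statistical mistakes behind bad business decisions.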
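One way to internalize that definition is to simulate a world where there really is no effect and count how often the data looks extreme anyway. The sketch below is an illustration under assumed numbers, not a general testing recipe: it takes a hypothetical result of 60 heads in 100 coin flips, computes the exact two-sided p-value under a fair coin, and checks it against a simulation. The names `two_sided_p_value`, `N_FLIPS`, and `OBSERVED` are invented for this example.

```python
import math
import random

N_FLIPS = 100   # flips per experiment (assumed for illustration)
OBSERVED = 60   # hypothetical observed number of heads

def two_sided_p_value(heads, n):
    """Probability, under a fair coin, of a head count at least
    as far from n/2 as the observed count."""
    distance = abs(heads - n / 2)
    extreme = sum(
        math.comb(n, k) for k in range(n + 1) if abs(k - n / 2) >= distance
    )
    return extreme / 2 ** n

p_exact = two_sided_p_value(OBSERVED, N_FLIPS)

# Simulate many experiments in which the null hypothesis is true (fair coin)
# and count how often the result is at least as extreme as the observed one.
rng = random.Random(0)
n_sims = 10_000
hits = 0
for _ in range(n_sims):
    heads = sum(rng.random() < 0.5 for _ in range(N_FLIPS))
    if abs(heads - N_FLIPS / 2) >= abs(OBSERVED - N_FLIPS / 2):
        hits += 1
p_simulated = hits / n_sims

print(f"exact p-value:     {p_exact:.4f}")
print(f"simulated p-value: {p_simulated:.4f}")
```

The two numbers should roughly agree. The p-value is the long-run share of no-effect experiments that look at least this extreme; it says nothing directly about the chance that the coin is biased.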
How to study these
Pick one per week. Write a 5-line notebook that demonstrates it on real or simulated data. That's it. Twelve weeks, twelve notebooks, and you'll have a working statistical intuition most engineers never develop.
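As an idea of what one of those weekly notebooks might look like, here is a short sketch for the Central Limit Theorem on simulated data. It assumes an exponential distribution with rate 1 (so mean 1 and standard deviation 1) as the skewed source, and the sample size and number of samples are arbitrary choices. It uses only the standard library.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable
SAMPLE_SIZE = 50         # observations per sample (arbitrary choice)
N_SAMPLES = 2_000        # number of samples drawn (arbitrary choice)

# Draw many samples from a skewed distribution and keep each sample's mean.
sample_means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(SAMPLE_SIZE)])
    for _ in range(N_SAMPLES)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
predicted_sd = 1.0 / SAMPLE_SIZE ** 0.5  # sigma / sqrt(n), with sigma = 1

print(f"mean of sample means: {mean_of_means:.3f}")
print(f"sd of sample means:   {sd_of_means:.3f}")
print(f"predicted sd:         {predicted_sd:.3f}")
```

Even though individual draws are heavily skewed, the sample means cluster around 1 with a spread close to the predicted value. Plotting a histogram of `sample_means` would be a natural next cell.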
Recommended reading
Python for Data Analysis
Wes McKinney (3rd Edition, O'Reilly)
The definitive guide to pandas, NumPy, and the modern Python data stack — written by the creator of pandas himself.
Hands-On Machine Learning
Aurélien Géron (3rd Edition, O'Reilly)
From linear regression to deep neural nets with Scikit-Learn, Keras and TensorFlow. A widely recommended, hands-on introduction to ML.