Getting started with Statistics

“I keep saying the sexy job in the next ten years will be statisticians. People think I’m joking, but who would’ve guessed that computer engineers would’ve been the sexy job of the 1990s?” said Hal Varian, chief economist at Google, in 2009. These days, machine learning and artificial intelligence are the sexiest fields, but their practitioners should be undercover statisticians. If you are looking for an intro to stats, this is a must-read post for you.

Warm-up readings

Charles Wheelan’s Naked Statistics is the most entertaining book about statistics. It doesn’t use any equations, but explains the main concepts through real-world examples. It is absolutely beginner-friendly and takes you through the first steps of your journey towards mastering statistics.

David Salsburg’s The Lady Tasting Tea is the best book on the history of statistics. Salsburg tells the story of how the field and modern scientific thinking developed, without using heavy math. If you learn stats, you will soon come across the names of Pearson, Spearman and others. You’ll wonder who Student was, why he developed his t-test, and how computers took over from statistical tables and calculations on paper.

Learning by doing

The Head First series by O’Reilly uses a unique approach to teaching that is based on the cognitive science of learning. This learning method involves lots of activities, pictures, and explanations of the same concept several times from different angles. We love the series, especially Head First Statistics by Dawn Griffiths. If you do the exercises in the book rather than just reading it, you will have a solid foundation in the very basics of statistics.

Do you want to get a feel for how data analysts work? Milton’s Head First Data Analysis is the best resource for you! You’ll learn how to use a spreadsheet to analyze data, how to clean messy real-world data, and how to put your statistical knowledge into practice.

Think Python and Stats

Allen B. Downey publishes high-quality open books on computer science, statistics and complexity. Think Stats is an excellent book written for programmers. You can get the most from it if you’re a confident intermediate Pythonista and you’ve already mastered the basics of statistics. Once you have worked through the book, you will be ready to use advanced statistical Python modules.

Although Python has a built-in statistics module, it is convenient only for the most basic tasks. If you are into classical statistics, the statsmodels package is made for you.
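To give a feel for what "the most basic tasks" means, here is a minimal sketch that uses only the built-in statistics module. The sample values and the variable names are made up for illustration; they do not come from any of the books mentioned above.

```python
import statistics

# Hypothetical sample, chosen only for illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Descriptive statistics available in the standard library.
summary = {
    "mean": statistics.mean(scores),
    "median": statistics.median(scores),
    "stdev": statistics.stdev(scores),    # sample standard deviation (divides by n - 1)
    "pstdev": statistics.pstdev(scores),  # population standard deviation (divides by n)
}

for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Descriptive summaries like these are roughly where the built-in module is most comfortable. For things like regression models with full diagnostic output, you would typically reach for a dedicated package instead.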

SciPy and scikit-learn provide you with a plethora of statistical and machine learning algorithms.

Advanced topics

Sources

  • The header image was downloaded from xkcd. Its source can be found here.
  • The Think Stats cover image was downloaded from this link.
