Good data books are hard to find. Here is a list of recommendations for some great data books that I have enjoyed and found to be very thoughtful. I am in no way a book critic and this is not where you get a detailed breakdown of each chapter. These books are entertaining to read and help you develop a better sense of how to think about our world filled with data:


The Data Detective: Ten Easy Rules to Make Sense of Statistics

Tim Harford presents ten ways to approach a new piece of information. For example, we should focus on the actual numbers and facts rather than on how they make us feel. We are surrounded by information, and facts that surprise us or contradict what we believe are more likely to grip our attention or get published. This book is as much about psychology as it is about numbers and will therefore be a good fit for a wide audience. It is full of entertaining anecdotes that help the reader digest and understand the concepts presented.


Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are

This book is a gentle introduction to the world of big data. Seth Stephens-Davidowitz, a Harvard-trained economist and former Google data scientist, gives a layman's introduction to thinking about data. He dives deep into how online behaviour can be used to draw conclusions about larger populations, citing examples from Google search data. This is not a technical book, but it is an entertaining data read and one that I can recommend to folks who are not data literate.


How Not to Be Wrong: The Power of Mathematical Thinking

This book is advertised as one of Bill Gates’s “10 favourite books” and I enjoyed reading it as well (this is more a statement about the book's quality than about my own taste in books). The author explains a lot of mathematical concepts with easy-to-understand use cases and anecdotes. This is the kind of book you could recommend to someone without a heavy mathematical background, and they would still understand it.


Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy

This book is really famous among data scientists and made Cathy O’Neil a household name. It presents numerous examples of how blind trust in models leads to worse outcomes for marginalized groups. I have often encountered blind faith in models in my career, and this book reminds you to question the outcomes and the assumptions that went into a model's development. Examples from Cathy O’Neil's book are often quoted when people discuss the ethical obligations of data scientists. This book is an absolute must-read for anyone who makes models that affect people.


Superforecasting: The Art and Science of Prediction

I loved this book! Philip E. Tetlock and Dan Gardner present their research into what makes people better predictors of the future. They turn people into prediction models by teaching them tools for thinking about the future, which makes them better estimators of events. They lead the Good Judgment Project, where anyone can sign up and make long-term predictions about the future. This jibes well with my past research into ensemble systems, since a combination of hundreds of predictions tends to be more accurate than a single strong predictor. My personal favourite takeaway from this book is to always start with the base rate for an event. The base rate applies to any event, whether you are trying to predict if a married couple will stay together (average population divorce rate), if CO2 levels will rise next month (historical data), or how likely you are to get mugged in a certain part of town (crime statistics). Once the base rate is defined, additional factors can move the prediction one way or the other.
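To make the base-rate and ensemble ideas a bit more concrete, here is a minimal Python sketch. It is not taken from the book: the function names, the choice to apply adjustments in log-odds space, and every number in it are my own illustrative assumptions. It starts from a base rate, nudges it with a couple of hypothetical factors, and then compares the error of an averaged crowd forecast with the average error of the individual forecasters.

```python
import math
import random
from statistics import mean


def adjust_from_base_rate(base_rate, log_odds_shifts):
    """Start at the base rate, then move it by a list of log-odds shifts.

    Working in log-odds is an assumption made for this sketch; it simply
    keeps the adjusted probability strictly between 0 and 1.
    """
    log_odds = math.log(base_rate / (1 - base_rate))
    log_odds += sum(log_odds_shifts)
    return 1 / (1 + math.exp(-log_odds))


def squared_error(forecast, target):
    """Squared distance between a forecast and the target probability."""
    return (forecast - target) ** 2


# Hypothetical numbers, invented purely for illustration.
base_rate = 0.40
shifts = [-0.5, 0.2]  # one factor lowers the odds, another raises them a little
estimate = adjust_from_base_rate(base_rate, shifts)

# Simulate a crowd of noisy forecasters around an invented "true" probability.
rng = random.Random(0)
true_probability = 0.30
forecasts = [
    min(0.99, max(0.01, rng.gauss(true_probability, 0.15)))
    for _ in range(200)
]

crowd_forecast = mean(forecasts)
crowd_error = squared_error(crowd_forecast, true_probability)
typical_individual_error = mean(
    squared_error(f, true_probability) for f in forecasts
)

print(f"base rate {base_rate:.2f} -> adjusted estimate {estimate:.2f}")
print(f"crowd error {crowd_error:.4f}")
print(f"average individual error {typical_individual_error:.4f}")
```

The squared error of the averaged forecast can never exceed the average squared error of the individual forecasts. That is a property of averaging itself, not proof that a crowd beats every single expert, so treat the simulation as an intuition pump rather than evidence.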


All of these are excellent books that deal with data science topics without going (too deep) into the mathematical concepts, which makes them great bedtime reading. That does not diminish their value, however, since sometimes we have to zoom out and think about the larger picture before we dig in and apply statistical models.
