🎄 One of the joys of the holiday season is to cozy up with a good book and read it pretty much non-stop! A year ago, that book was "The Book of Why". This year, it was "Causal Inference: The Mixtape". Do you notice a theme?
🤖 I've made it no secret that I love causal inference. The reason is that machine learning, as it is practiced today, is mostly brute-force and correlation-based. Even under ideal circumstances, when it works, it raises concerns about resource consumption, interpretability, and generalization. I believe the Bayesian/Causal approaches hold the answers to these problems, which is why I think every data scientist should put Causal Inference in their wheelhouse, at the very least to gain another perspective!
📖 "Causal Inference: The Mixtape" by Scott Cunningham is an excellent overview of this massive research topic and a delightful read. I highly recommend it.
👏 The Good: It covers a lot of ground and explains it clearly with real-world examples. It starts with a comprehensive review of probability and regression and then devotes each subsequent chapter to an essential topic in the causal inference literature, from Directed Acyclic Graphs to Instrumental Variables. Each chapter opens with amusing lyrics from rap and hip-hop songs. And in addition to tons of mathematical formulas, tables, and plots, it comes with ample code in R and Stata (economists' proprietary language of choice).
🤔 The Bad: No Python code. I think the author underestimates how much need there is for Causal Inference in industry, where Python is king among data practitioners. However, this shouldn't discourage Pythonista data scientists, since enjoying the book doesn't depend on the code. Also, since econometricians and social science researchers largely dominate the Causal Inference field, it's no wonder that the examples are drawn from these areas. If you want to apply the methodologies discussed in the book in industry, I suggest complementing it with other reading material that is closer to your use cases.
🙈 The Ugly: Not a big deal for experienced folks, but the provided code is neither documented nor explained, making it hard to follow for any newcomer to R or Stata.