I'm working on a new chapter on interpreting NLP Transformers for the 2nd edition of my book. This topic was the most popular one in a poll I conducted on LinkedIn. But, to be honest, I wasn't sure about doing it at first.
🔥 Transformers get a whole lot of attention (no pun intended). However, while they solve the bottleneck problem that recurrent neural networks (RNNs) have, they have also unleashed an arms race to train ever-larger models on ever-bigger corpora, much of it drawn from the Internet. I'm not convinced this is the way to go because language is riddled with bias. So all the more reason to learn to interpret these models! Only through interpretation can we understand a model's strengths and weaknesses and develop strategies to address the weaknesses.
👍🏼 With this thought, I convinced myself to include this chapter, and I'm very excited about how it's turning out! (And so is my furry four-legged assistant.)