The One-model-fits-all fallacy

A common mistake in Machine Learning is having a "go-to" model because it usually fits the data well, which is a big assumption because it might have an entirely different distribution. It's like expecting someone of more or less the right size, say medium, to fit on all the clothes in your closet!

On the other hand, a data-centric view calls for finding a suitable model for the variations of data you can reasonably expect in a collection. It's like finding the person that can wear all the clothing in your closet. I guess that would be you 😅

Just as there's no one size fits all in clothing, with machine learning, no model fits all.

Above picture by: Serg Masís, Featured image by: Ralph Aichinger @ Wikimedia

I'm just having fun with the clothing analogy, but it always doesn't work since clothing-fitting is primarily about the person. On the other hand, the data-fitting process is more about the data than the model. Therefore, the focus should always be on the data first and foremost — data quality and consistency. Then, find the right model. After that, iterate the data, not the model.