The story behind a baseline
Figuring out whether your machine learning model is providing any value is not that simple. Yeah, of course, the loss is going down and accuracy is through the roof, but that's not enough.
Is this thing actually any good?
Yes, I'm one of those who has bragged before about a model that did worse than a pair of nested if-else conditions. Focus too much on the trees 🌳, and you'll certainly miss the forest.
Let’s get that fixed.
How good am I at doing this thing?
I’ve been building traditional software my entire life, and there’s something nice about it: it either works, or it doesn’t.
When you are building a machine learning model, things change quite a bit. Models make predictions, and understanding their quality requires a little bit more setup.
So I have an entire process before I start writing code. I like to start by measuring how good I am at solving the task. Manually. Like an animal 🐘 😎.
This is called a human baseline, and ideally, my model will have super-human abilities and kick my butt at some point. In the meantime, this baseline represents my North Star ✨.
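To make this concrete, here's a minimal sketch of what I mean. The labels and predictions below are made up purely for illustration: I label a handful of examples by hand, score myself, and compare that against a trivial majority-class baseline. Any model I ship should at least beat the trivial baseline and, ideally, chase down the human one.

```python
# A minimal sketch of establishing a human baseline.
# The labels and predictions below are made up for illustration.

true_labels = ["cat", "dog", "dog", "cat", "dog", "cat", "cat", "dog"]

# My own guesses after looking at each example manually.
human_predictions = ["cat", "dog", "cat", "cat", "dog", "cat", "dog", "dog"]

# A trivial baseline: always predict the most common class.
majority_class = max(set(true_labels), key=true_labels.count)
majority_predictions = [majority_class] * len(true_labels)


def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == l for p, l in zip(predictions, labels))
    return correct / len(labels)


print(f"Human baseline accuracy:   {accuracy(human_predictions, true_labels):.2f}")
print(f"Majority-class accuracy:   {accuracy(majority_predictions, true_labels):.2f}")

# A model is only interesting if it clears the majority-class baseline
# and gets meaningfully close to (or beats) the human baseline.
```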
An interesting outcome of trying to come up with a human baseline is finding out when I can’t do the task…