A model that smokes everyone else

Santiago Valdarrama
3 min read · Mar 25, 2021


I didn’t pay too much attention when I heard about CLIP, a new neural network that learns visual concepts from natural language supervision. At some point, however, my social media feed was all about it. I mostly have to blame this guy. I had to look into it 👀.

It immediately felt like Christmas in March (CLIP was released in January, but I was late to the party). Here you had this new technique that kicked everyone’s butts with a zero-shot approach!

Let’s try to unpack this a little bit.

What in the world is…


