by janussunaj 18 days ago
It's also pretty sad that now "ML engineer" means prompting...

On the one hand I understand this fairly deeply.

I started doing "ML" ~20 years ago, building classifiers that people would laugh at today and that, even at the time, barely impressed anyone when they were 95% correct.

I moved into NLP and built NERs that routinely missed 2-10% of named entities per document. Best-of-breed approaches and models rarely fared better.

I learned the cornerstones of ML in school: linear regression, ANNs, traditional RL, image classifiers, A* bots, etc., most of which got baked into transformers later on.

Then transformers went from interesting novelty to useful. I couldn't build a useful one locally, but the toy versions were still fun to play with.

Then the novelty LLM went from useful to generally applicable. Then they became a silver bullet.

I still can't build one locally. I can distill, build, or fine-tune one if you give me some rented GPUs. But to call this ML is very much a stretch.

I still use the traditional ML a lot, but mostly for evals and analysis.

I get being naturally bummed by this but I can't justify feeling anything but vaguely nostalgic about it. Someone with a $20 subscription can mog anything I can build with the skills I picked up.

If someone hands you a silver bullet you'd be a fool to decline it and spend your time hand casting a crude piece of brass. If the difference between 95% and 99% means you know how to aim or oil the gun, that's the world we live in.

Building a good RAG pipeline or prompt optimization or LLM consensus is dumb stuff that produces a better result than anything I could do from my 2010 ML/AI textbooks. I don't lack the knowledge or capacity to compete, I lack the compute.
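To make the "dumb stuff" concrete: LLM consensus can be as simple as sampling the same prompt several times and taking a majority vote. A minimal sketch, assuming you already have the sampled answers as strings; the `consensus` name and the strip/lowercase normalization are illustrative choices, not any particular library's API.

```python
from collections import Counter


def consensus(answers):
    """Majority vote over sampled answers (hypothetical helper).

    Returns (winning_answer, fraction_of_samples_that_agree).
    """
    if not answers:
        raise ValueError("need at least one answer")
    # Assumption: answers are short strings where case/whitespace don't matter.
    normalized = [a.strip().lower() for a in answers]
    # On a tie, Counter.most_common returns the answer seen first.
    winner, votes = Counter(normalized).most_common(1)[0]
    return winner, votes / len(normalized)
```

The agreement fraction doubles as a cheap confidence signal: low agreement is a reason to escalate to a human or a bigger model.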

That's the job now for 99% of companies.

An acquaintance who works at FAANG says he still builds non-LLM ML systems because of the cost of running LLMs at that scale.
Exactly. Besides cost, domain-specific models (which can still be very large) encode our biases (i.e., our knowledge about the domain) in their architecture. Because of that, we have ways to calibrate their accuracy trade-offs over an in-domain sample.
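As a rough illustration of what calibrating over an in-domain sample can look like for a scored classifier: pick the lowest decision threshold that still meets a precision target on labeled held-out data. This is only a sketch; `pick_threshold` and its arguments are hypothetical names, and real systems would also track recall and sample size.

```python
def pick_threshold(scores, labels, min_precision):
    """Lowest score threshold whose precision on the sample is >= min_precision.

    scores: model scores, higher means more likely positive.
    labels: 1 for a true positive example, 0 otherwise.
    Returns None if no threshold meets the target.
    """
    pairs = sorted(zip(scores, labels), reverse=True)
    best = None
    true_positives = 0
    for i, (score, label) in enumerate(pairs, start=1):
        true_positives += label
        # Only evaluate once all examples tied at this score are included.
        if i < len(pairs) and pairs[i][0] == score:
            continue
        if true_positives / i >= min_precision:
            best = score
    return best
```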

For LLMs, there is a disconnect between the perceived domain (anything a human can think and verbalize) and the actual domain (word sequence prediction). We only know how to sample from the latter, not the former.

This "silver bullet" idea sounds a lot like "free lunch". There has been a lowering of the bar for ML practices to make way for this onslaught of prototype-level productivity. Teams that used to do their best to run uncorrelated evals are now having the prompt engineers manually inspect ~100 model outputs before launch.
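For a sense of how little ~100 inspected outputs tell you, here is a quick sketch of a Wilson score interval for a binomial proportion. It assumes the outputs are an independent random sample graded pass/fail, which a hand-picked batch usually is not; the function name is my own.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a pass rate; z=1.96 is roughly 95% confidence."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


low, high = wilson_interval(95, 100)
print(f"{low:.2f} to {high:.2f}")
```

With 95 passes out of 100, the interval runs from roughly 0.89 to 0.98, so that sample cannot distinguish a 90% system from a 97% one.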

People were once shocked at how accurately n-gram models and user interaction data could "read their mind" and complete their searches. Now we're obviously all impressed by the emergent abilities of LLMs (integrated with lots of business logic by the LLM providers). Hopefully in time we'll get desensitized a bit and have the right mental model when using these tools.

P.S. I love your username :)

> I get being naturally bummed by this but I can't justify feeling anything but vaguely nostalgic about it. Someone with a $20 subscription can mog anything I can build with the skills I picked up.

Welcome to the data science job market of 2015-2023, where everybody with a $20 online course could become a proficient data scientist in only 4 weeks!

Exactly. Not 4 years ago I was rejected from a job for not having enough NLP experience. Can you imagine that today? Someone being hired to do NLP in the market of LLMs?