by Aurornis
213 days ago
> Thing is that to know it is good enough you still have to collect and annotate more data than most people and organizations want to do.

This has been the bottleneck in every ML (not just text/LLM) project I’ve been part of. Not finding the right AI engineers. Not getting the MLOps textbook perfect using the latest trends. It’s collecting enough high-quality data and getting it properly annotated and verified, then doing proper evals with humans in the loop to get it right.

People who only know these projects through headlines and podcasts really don’t like to accept this idea. Everyone wants synthetic data with LLMs doing the annotations and evals because they’ve been sold the idea that the AI will do everything for you if you just use it right. Layer on top of that the idea that LLMs can also write the code for you, and it’s a mess when you have to deal with people who only gain their AI knowledge through headlines, LinkedIn posts, and podcasts.

This isn't my first CV project, but it's the most successful one. And that is chiefly because my client opened their wallet and let an army of annotators create all the training data I asked for, and more.