a lot of things are harder than they look.
i’m very confident someone can prove you wrong, without being an expert in the field.
To get even more of them I could consider gamification. This game is a good example: https://gandalf.lakera.ai/
Once I get a descent dataset, I could use it to finetune a LLM to do classification. Or play with embeddings and cosine similarity and similar.
I could also use LLMs to extend the training dataset, and have some human feedback.
It’s maybe not the best strategy and I’m sure someone else can do it better but I don’t think it’s wrong.
while interesting, your napkin math isn’t convincing.
a lot of things are harder than they look.