Hacker News
by selfmodruntime 295 days ago
Never before did we have a combination of well and poison where the pollution of the first was both as instantaneous and as easily achieved.

I've yet to see a convincing article for artificial training data.

2 comments

It does seem like it helps with math, but in a way that demonstrates the futility of the enterprise: "after training the LLM on 10,000,000 examples of K-8 arithmetic it is now superhuman up to 12 digits, after which it falls off a cliff. Also it demonstrably doesn't understand what 'four' means conceptually and it still fails on many trivial counting problems."
Yeah, like another commenter said, if you can get synthetic data with some sort of easily verifiable grounding (math, games, code), models can do very well. This is one of the underpinnings of the reinforcement learning that has driven some of the advances in the past year or so (AFAIK).
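
To make "easily verifiable grounding" concrete, here is a rough sketch for the arithmetic case: generate problems whose answers can be computed exactly, then score a model's output by checking it against that answer. The names `make_example` and `reward` are made up for illustration, and this is not a description of any particular lab's pipeline, just the general shape of the idea.

```python
import random

def make_example(rng, digits):
    """Generate one synthetic addition problem with a known correct answer."""
    low, high = 10 ** (digits - 1), 10 ** digits
    a = rng.randrange(low, high)
    b = rng.randrange(low, high)
    # The answer is computed, not labeled by a human, so it is always correct.
    return {"prompt": f"{a} + {b} = ", "answer": str(a + b)}

def reward(example, model_output):
    """Hypothetical verifiable reward: 1.0 on an exact match, else 0.0."""
    return 1.0 if model_output.strip() == example["answer"] else 0.0

# Build a tiny dataset of 3-digit problems with a fixed seed for repeatability.
rng = random.Random(0)
dataset = [make_example(rng, digits=3) for _ in range(5)]
```

The point is that the checker is cheap and cannot be fooled by plausible-looking text, which is what math, games, and code have in common. It says nothing about whether the model generalizes past the digit lengths it was trained on.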