by CamperBob2 39 days ago
> You probably don't need knowledge about Pokemon or the Diamond Sutra in your enterprise coding LLM.

That's one of the biggest remaining head-scratchers in this whole business. You do need all that unrelated stuff to make a good coding model.

Nobody knows why you can't build a coding model by training on nothing but code, CS texts, specifications, and case studies, but so far it appears that you can't.

1 comment

This one is kind of obvious: people prompt coding LLMs in natural language. That's a separate matter from stuffing the pretraining set with trivia.

An LLM that knows English very well doesn't actually need to be very large, and certainly not hundreds of billions of parameters.