Hacker News
by hmottestad 542 days ago
Gotta wonder if Google has used code from its internal systems to train Gemini? Probably not, but at what point will companies start forking over source code for LLM training in exchange for money?

It seems much cheaper, safer legally and more easily scalable to simply synthesize programs. Most code out there is shit anyway, and the code you can get by the GB especially so.
How do they synthesize programs?

I would assume that internal code at Google is of higher quality than random code you find on GitHub. Commit messages, issue descriptions and code reviews are probably more useful too.