Hacker News
by throwthrowuknow 770 days ago
That is a problem, but thankfully it is a known one, and there is a lot of attention right now on training with highly curated, high-quality data. Buggy code is still valuable training data if you use it as part of a question and, when training the model on a task like bug fixing, evaluate its response against a corrected version of the code.
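
To make that concrete, here is a rough sketch of what I mean, assuming you have pairs of buggy and corrected code. The names (`BugFixExample`, `score_response`) are made up for illustration, and the text-similarity score is only a stand-in; a real pipeline would more likely run tests against the model's output.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class BugFixExample:
    """Hypothetical training example: buggy code plus its corrected version."""
    buggy_code: str
    fixed_code: str

    def prompt(self) -> str:
        # The buggy code is the question shown to the model.
        return "Fix the bug in this code:\n\n" + self.buggy_code


def normalize(code: str) -> str:
    # Drop trailing whitespace and blank lines so formatting noise
    # does not affect the comparison.
    lines = [line.rstrip() for line in code.strip().splitlines()]
    return "\n".join(line for line in lines if line)


def score_response(example: BugFixExample, response: str) -> float:
    # Assumption: similarity to the corrected code is used as the score.
    # Returns a value in [0, 1]; 1.0 means an exact match after normalizing.
    matcher = SequenceMatcher(None, normalize(response), normalize(example.fixed_code))
    return matcher.ratio()


example = BugFixExample(
    buggy_code="def add(a, b):\n    return a - b\n",
    fixed_code="def add(a, b):\n    return a + b\n",
)
```

The buggy code never appears as a target here, only as input, so the model is rewarded for producing the corrected version rather than for imitating the bug.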