Hacker News
by vanuatu 17 days ago
all the labs "clean" their pretraining data, and you can keep your pretraining data minimally AI-generated while still spamming synthetic post-training data