| HN Mirror



	by gaogao 237 days ago
	Yeah, I don't think surreal or constructed content is good in the early data mix, but as part of mid- or post-training it seems generally reasonable. But also, this is one of those cases where anthropomorphizing the model probably doesn't work: a major negative effect of Cocomelon is kids only wanting to watch Cocomelon, whereas a large model in training doesn't have much choice in its training data distribution.

1 comment

f_devd 237 days ago

I would agree that a careful and very small amount of the above brainrot in post-training could improve certain metrics, if the main dataset didn't contain any. But given how much data current LLMs consume, and how much is being produced and put back into the cycle, I doubt it will be missed.
