HN Mirror



	by lexicality 61 days ago
	The entire point of LLMs is that they produce statistically average results, so of course you're going to have problems getting them to produce non-average code.

2 comments

Bayano2 61 days ago

This was true circa GPT-2, less true after RLHF, and not true at all after RLVR. It's trying to model the distribution of outputs most likely to solve the problem, not the average distribution.


anuramat 61 days ago

they (are supposed to) produce average on average, and the output distribution is (supposed to be) conditioned on the context


disgruntledphd2 61 days ago

Yeah but ultimately it's all just function approximation, which produces some kind of conditional average. There's no getting away from that, which is why it surprises me that we expect them to be good at science.

They'll probably get really good at model approximation, as there's a clear reward signal, but where that feedback loop is impossible or very difficult, we shouldn't expect them to do well.
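
To make the "conditional average" point concrete, here is a toy sketch, not a claim about how any real LLM is trained. It assumes a made-up dataset of (context, answer) pairs, and the helper names `fit_conditional_mean` and `fit_conditional_distribution` are hypothetical. A squared-error fit collapses each context to its mean, which can be a value that never occurs in the data, while a maximum-likelihood categorical fit keeps the whole per-context distribution and samples from it.

```python
import random
from collections import Counter, defaultdict


def group_by_context(pairs):
    """Collect the observed answers for each context."""
    groups = defaultdict(list)
    for context, y in pairs:
        groups[context].append(y)
    return groups


def fit_conditional_mean(pairs):
    """Per-context constant that minimizes squared error: the sample mean."""
    return {c: sum(ys) / len(ys) for c, ys in group_by_context(pairs).items()}


def fit_conditional_distribution(pairs):
    """Per-context maximum-likelihood categorical fit: normalized counts."""
    return {
        c: {value: n / len(ys) for value, n in Counter(ys).items()}
        for c, ys in group_by_context(pairs).items()
    }


def sample(dist, rng):
    """Draw one answer from a {value: probability} mapping."""
    values = list(dist)
    weights = [dist[v] for v in values]
    return rng.choices(values, weights=weights, k=1)[0]


rng = random.Random(0)

# Hypothetical toy data: in context "a" the answers are -1 or +1 (never 0);
# in context "b" the answer is always 5.
pairs = [("a", rng.choice([-1, 1])) for _ in range(10_000)]
pairs += [("b", 5) for _ in range(100)]

means = fit_conditional_mean(pairs)
dists = fit_conditional_distribution(pairs)
draws = [sample(dists["a"], rng) for _ in range(5)]

print("conditional means:", means)
print("conditional distributions:", dists)
print("samples for context 'a':", draws)
```

In context "a" the mean lands near 0, an answer nobody ever gave, whereas every sample from the fitted distribution is either -1 or +1. Both fits are still conditioned on the context, which is the part everyone in the thread seems to agree on.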


anuramat 57 days ago

true, but it's the same with humans, we suck at problems with sparse/delayed feedback, which includes science (math would be the exception I guess)

sure, humans are obviously better at dealing with it, but the one thing nobody is claiming is "scientists replaced by 202X"
