


	by datameta 1020 days ago
	My gut metric says it's a ~20% increase in perceived interpretation and output complexity, whatever that means exactly. But there are plenty of eval result aggregators out there.

2 comments

mewpmewp2 1020 days ago

To me, GPT-4 seems actually intelligent and capable of reasoning, while GPT-3.5 does not. Many of my use cases involve giving large bodies of text to GPT and asking it to reason about them. 3.5 has no clue, but 4 seems to handle it intelligently.

Overall, GPT-3.5 feels like a clueless summarizer, while GPT-4 feels like an intelligent interpreter and reasoner that I can trust.

Depending on which way you look at it, it could be 10x or 1000x the intelligence.


datameta 1020 days ago

I think trust is a key thing you've highlighted. I find myself doubting GPT-3.5, whereas I don't doubt GPT-4 at all.


muglug 1020 days ago

Yeah, there are measurable results on things like AP bio. And those are definitely not 10x.
