| HN Mirror



	by mewpmewp2 697 days ago
	Yeah, I would say it actually makes the contrary point. That pre-hype version of GPT is poor, and if you have to use it to prove a point, it probably means there is a huge jump between GPT3 and GPT4. So to me it proves the contrary. Anybody going for that or believing it doesn't actually understand the performance of GPT4 or better if they think this is post-hype LLM output.


pona-a 695 days ago

Well, what if it just got better at covering up human-presentable cases?

See this comment [0] on this very post, showing how it makes quite problematic mistakes on larger numbers still.

It's still an improvement, but only in the way of imitation. It shows that, while clever within their constraints, these models still don't have the capabilities to truly perform computation or "thought". Chain of thought can help, but there are some things you cannot split into atomic tasks; if the world model itself isn't that stellar, no amount of elucidation will compensate for the inaccurate representations within. (E.g. "How would person X react to Y?" If your theory of mind is poor, no amount of further subtasks will help you give a better prediction.)

[0] https://news.ycombinator.com/item?id=41092987


mewpmewp2 695 days ago

For larger numbers it just needs to execute code. Most people also can't calculate such numbers in their head.

It shouldn't have to be able to do things it knows how to use code for. E.g. dumb things like how many Rs are in "strawberry". It doesn't even see characters, so even if it were somehow possible, it couldn't count for sure.

It is like asking someone who has only ever seen hieroglyphs how many Rs are in a character-by-character version of "strawberry".
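
For illustration, a minimal sketch of what "just execute code" could look like for both cases, assuming Python as the tool language. The token split is made up to show the idea and is not the output of any real tokenizer; the two operands are arbitrary.

```python
# Code sees the string character by character, so counting letters is trivial.
word = "strawberry"
r_count = word.count("r")

# Hypothetical token split, for illustration only. A real tokenizer may split
# the word differently; the point is that the model receives opaque chunks
# like these rather than individual letters.
tokens = ["str", "aw", "berry"]
assert "".join(tokens) == word

# Python integers are arbitrary precision, so large products are exact
# instead of approximated digit by digit.
a = 123456789123456789
b = 987654321987654321
product = a * b

print(r_count)
print(product)
```

The count and the product here come from the interpreter, not from pattern matching over tokens, which is the distinction being argued above.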


pona-a 695 days ago

Still, let's not anthropomorphize computational processes. It is a function approximator, which we'd expect to pick up on simple patterns like intersections or base-10 arithmetic. When we see its predictions diverge from the truth, that shouldn't be dismissed with a "just so" story; it is a sign we're pushing the architecture to its limits.
