by andrepd 1352 days ago
> Because the inferences and reasoning GPT3 is already capable of is incredible and beats most expert systems that I know of.

This is patently FALSE. You can, however, re-run a given prompt 10+ times, tweaking and nudging it in the direction you know you want, until it produces a seemingly miraculously deep result (by pure chance).

Rinse and repeat a dozen times and you have enough material for a Twitter thread or Medium post fawning over GPT-3.


I don't necessarily doubt you, but can you give me an example of an expert system that is more capable?
GPT-3 can’t perform arithmetic over all 32-bit numbers. A trivial Python script can.
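For what it's worth, here is roughly what such a "trivial Python script" might look like. This is only a sketch: the comment doesn't say what kind of 32-bit numbers are meant, so I'm assuming unsigned values with wraparound (arithmetic modulo 2**32), and the names MASK32, add32, sub32 and mul32 are made up for illustration.

```python
# Sketch of exact 32-bit arithmetic, ASSUMING unsigned wraparound
# semantics (modulo 2**32). All names here are hypothetical.
MASK32 = 0xFFFFFFFF  # 2**32 - 1


def add32(x: int, y: int) -> int:
    """Add two integers and keep only the low 32 bits."""
    return (x + y) & MASK32


def sub32(x: int, y: int) -> int:
    """Subtract y from x and keep only the low 32 bits."""
    return (x - y) & MASK32


def mul32(x: int, y: int) -> int:
    """Multiply two integers and keep only the low 32 bits."""
    return (x * y) & MASK32


if __name__ == "__main__":
    print(add32(0xFFFFFFFF, 1))   # largest 32-bit value plus one wraps to 0
    print(mul32(65536, 65536))    # 2**16 * 2**16 = 2**32, which wraps to 0
```

Python integers are arbitrary precision, so the mask is the only thing making this "32-bit"; the results are exact for every input pair, which is the point being made.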
It behaves more like your nephew than a computer in that case. Interesting that this is often the example given for why computers are bad at certain tasks, and humans are good at others.

It is quite incredible that nothing changed about the architecture in GPT-2 vs. GPT-3 (just way more connections), yet it acquired fundamentally new behavior, that of performing arithmetic calculations, despite not having large amounts of training data on the subject. I think this is the type of phenomenon that shows we are quite poor at estimating what these systems will be capable of when scaled up. So acting as if we're sure it won't lead to improvements in AI is as idiotic as claiming that it will. There are far too many people on Hacker News who follow this fad of being dismissive of AI, because they make the common mistake of equating cynicism with intelligence.

It’s smoke and mirrors trying to fool you into thinking it’s generating intelligent text. In some applications, e.g., a chatbot, that’s appropriate. But it’s really no comparison to an expert system for most applications, where you know exactly the right and wrong solutions. Not adding numbers correctly with the huge budget GPT-3 has for training and inference is a poignant case of that fact. A linear layer taking in x and y will learn x+y just by setting both weights to 1.0, so it’s not even a hard problem for neural nets in general; the difficulty is specific to the tokenization and architecture used for GPT models.
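To make the linear-layer point concrete, here is a rough sketch in plain Python. It assumes a bias-free layer w1*x + w2*y trained by per-sample gradient descent on squared error, with inputs drawn uniformly from [-1, 1]. The function name train_adder, the learning rate, the step count and the seed are all arbitrary choices for illustration and have nothing to do with how GPT models are trained.

```python
import random


def train_adder(steps: int = 2000, lr: float = 0.1, seed: int = 0):
    """Fit w1*x + w2*y to the target x + y with per-sample gradient descent.

    Hypothetical toy setup: no bias term, squared-error loss,
    inputs sampled uniformly from [-1, 1].
    """
    rng = random.Random(seed)
    w1, w2 = 0.0, 0.0  # start far from the solution (1.0, 1.0)
    for _ in range(steps):
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        err = (w1 * x + w2 * y) - (x + y)  # prediction minus target
        # Gradient of err**2 with respect to w1 is 2*err*x, likewise for w2.
        w1 -= lr * 2.0 * err * x
        w2 -= lr * 2.0 * err * y
    return w1, w2


if __name__ == "__main__":
    w1, w2 = train_adder()
    print(w1, w2)  # both weights should end up very close to 1.0
```

The weights only approach 1.0 numerically rather than being set to it exactly, and the inputs here are raw floats, not tokenized digit strings, which is the distinction the comment is drawing.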