by visarga 1521 days ago
It costs less than 10M to train. Why hand so much to hardware or cloud? Soon enough there will be open source GPT-3s; at least two are in training as we speak (BigScience and EleutherAI).

> these large language models also have limited business value today

The Instruct version of GPT-3 has become very easy to steer with just a task description. It can do so many tasks so well it's crazy. Try some interactions with the beta API.

I believe GPT-3 is already above average human level at cognitive tasks that fit in a 4000 token window. In 2-3 years I think all developers will have to adapt to the new status quo.


> I believe GPT-3 is already above average human level at cognitive tasks that fit in a 4000 token window.

How can you possibly make a claim like this without like 80 links justifying it? The claim is fuzzy and absurd, my least favorite combo.

Gut feeling based on playing with it. Here's an example:

> Colorless green ideas sleep furiously, and other grammatical nonsense by Noam Chomsky

He was a man without a country,
A linguist without a language,
A mind without a thought,
A dream without a dreamer.
He was lost in a world of words,
A world where ideas slept furiously,
And grammar was a never-ending nightmare.

But he persevered,
For he knew that language was the key to understanding the world.
And so he continued to study,
To learn all that he could,
In the hopes that one day,
He would find his way home.

> Gut feeling based on playing with it

You should check out the post we're commenting on, it has graphs for this exact metric.

Spoiler: Google's model with 3x the parameters does pass the average human in a couple of categories, but not in all of them. I don't think GPT-3 does in any.

It's doubly puzzling to me because you have access and are asserting it feels like an average human to you. It's awesome and it does magical stuff; I use it daily both for code and prose. It also majorly screws up sometimes. It's only at an average human level if we play word games with things like "well, the average human wouldn't know the Dart implementation of the 1D Gaussian function. Therefore it's better than the average human."

> Gut feeling based on playing with it.

OK, on my first reading your phrasing made it sound like some article or material had convinced you of this opinion. Now I understand.

This is kind of my point about 80 links though - you're using a definition of "cognitive tasks" that more closely resembles knowledge, and then you're letting your personal feelings about profundity guide your conclusions on said cognition.

I don't deny that the machine can output pretty words and has a breadth of knowledge to put us each to shame on some simple queries, but "cognition in a 4000 token window" is an incredibly large place and I don't even understand how you would be able to claim a machine has above-human-average cognition based solely on your own interactions... That's a pretty crazy leap.

PS: I saw the downvotes. I was downvoted for questioning the validity of a claim that was actually just pure conjecture. Be better with your votes.