by derefr 2165 days ago
> Sometimes I forget that, while this model was created by scientists, and released with a scientific paper, it is essentially a for-profit business product, and such cheap tricks deserve harsh criticism.

Sure, but this is akin to seeing bad science journalism and tarring the science itself with the same brush. GPT-3 still factually has certain properties, independently of anyone making grandiose assertions about those properties.

What those properties are, we can only partly say. For example, we know it's capable of generating certain texts eventually, among an unbounded corpus of other texts it may have generated that were then discarded by humans. But the fact that it can generate those texts at all (faster than brute force, I mean) is interesting on its own, and worthy of scrutiny independent of whatever airier claims are being made.

It is certainly impressive, and I don't want to dismiss GPT-3. Just critiquing the (smart) release: make a select few feel special by giving them API access, and watch your product dominate the tech and news cycles for weeks. You'll have VC money in the bank before showing actual worth or business value.

Maybe a bit simplistic, but I view GPT as a Markov chain text generator, operating on word vectors instead of word tokens, and having a larger look-back. It's like a child copying a joke, because she heard adults laughing about it, but she does not understand the punchline. You wouldn't say that child understands or even displays humor, despite substituting "horse" with "donkey" when retelling the joke.
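
For anyone unfamiliar with the comparison, here is a minimal sketch of a word-level Markov chain text generator. It only illustrates the baseline being invoked (word tokens, short look-back); it is not how GPT works, and the function names `build_chain` and `generate` are made up for this example.

```python
import random
from collections import defaultdict

def build_chain(text, order=2):
    """Map each run of `order` consecutive words to the words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain[state].append(words[i + order])
    return dict(chain)

def generate(chain, seed, length=20, rng=None):
    """Extend `seed` (a tuple of at least one word) by sampling followers."""
    rng = rng or random.Random()
    out = list(seed)
    order = len(seed)  # the look-back window is the seed length
    for _ in range(length):
        followers = chain.get(tuple(out[-order:]))
        if not followers:
            break  # this state was never followed by anything in the corpus
        out.append(rng.choice(followers))
    return " ".join(out)

corpus = "the horse walks into a bar and the horse orders a drink"
chain = build_chain(corpus, order=2)
print(generate(chain, ("the", "horse"), length=8))
```

The generator can only replay word sequences it has already seen, which is the point of the analogy: swapping "horse" for "donkey" would require the substitution to exist somewhere in the training text.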

If you want to play with GPT-3, you can do so right now.

Go to https://play.aidungeon.com, make an account, and select the "Dragon" model. That's GPT-3.

I've spent ten hours playing with it over the last two days. It isn't perfect, and it falls short of the hype around it, but it's an amazing leap nonetheless. It really seems to have an understanding of causality, biology, all sorts of fictional themes...

You frequently have to back it up and try again. Unless you make good use of the site's long-term memory function, it'll forget anything that happened more than a page ago, and a lot of the time its idea of what should happen next doesn't match the plot I had in mind. I'm getting better at steering it, though.

However, as a writer myself, I can say that this is just as true of human writers. For every final draft you see there are ten discarded ones, and a hundred that never made it to paper.

Viewed that way, GPT-3 is actually much better at the core part of writing than I am! It's more creative, it uses English better, it's better at matching the narration to the characters than I am...

It's just that this isn't enough. It's missing a full model of the world, and it doesn't know how to look at what it's written and decide if it matches its intent, or whether it'll break consistency or get in the way later.

It doesn't have an intent. It doesn't know about consistency.

But that's also true for that part of me.

GPT-3 isn't a human-level writer. What I've determined, however, is that it's a huge part of one, and it's more than good enough to fulfill the role of that part already. Now we just need the other nine tenths.

> it doesn't know how to look at what it's written and decide if it matches its intent, or whether it'll break consistency or get in the way later.

And we can build other models specifically for this. We don't need to add this stuff to GPT-3; GPT-3 can literally act as a part, a component. GPT-3 can serve the role in a larger model that "imagination" does in a human brain—being fed inputs; having corresponding outputs scavenged through by the rest of the model; and then being "fed back" with input that relates to the scavenged outputs.

One thing I'd be very curious to see tried is a system consisting of GPT-3 as "writer" and some other (summarization?) model as "editor", attempting to dramatize, or adapt into prose fiction, a machine-readable sequence of events (e.g. a machinima recording of a stage play enacted within an MMO game).

We already have models that turn machine-readable sequences of events directly into prose; see e.g. baseball news reporting. Such models can work just as well in reverse, summarizing in-domain prose back into machine-readable facts.

So take such a prose-to-factual-assertions "reading comprehension" model and feed it GPT-3's output. Measure the distance between the set of events that model comprehends from GPT-3's output and the source data (which is also a set of factual assertions). You can then iterate GPT-3, maybe even one additional line of prose at a time, to find a story that is a consistent adaptation of the source. In this sense, GPT-3 is acting as a programmer and the "reading comprehension" model as a compiler, with the compiler reaching out and erasing any line that doesn't compile.

Of course, you're limited in this by the "reading level" of the reading-comprehension model. But this is also true of humans; you can't get out a literary classic if the writer's editor and alpha-readers were five-year-olds.
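
To make the writer/compiler loop concrete, here is a rough sketch under heavy assumptions. `propose` stands in for sampling candidate lines from GPT-3 and `extract` for the "reading comprehension" model; both are hypothetical callables, and the toy versions below are lookup tables, not real models. The "distance" is reduced to a crude set check: a line is kept only if every fact it asserts is in the source and at least one of them is new.

```python
from typing import Callable, Iterable, List, Set

def adapt(source_facts: Set[str],
          propose: Callable[[List[str]], Iterable[str]],
          extract: Callable[[str], Set[str]],
          max_lines: int = 50) -> List[str]:
    """Grow a story one line at a time, keeping only lines that 'compile'."""
    story: List[str] = []
    covered: Set[str] = set()
    for _ in range(max_lines):
        if covered == source_facts:
            break  # every source event has been told
        accepted = False
        for line in propose(story):
            facts = extract(line)
            # Reject lines that assert anything outside the source,
            # and lines that add no new source event.
            if facts <= source_facts and facts - covered:
                story.append(line)
                covered |= facts
                accepted = True
                break
        if not accepted:
            break  # the writer offered nothing usable; give up
    return story

# Toy stand-ins (assumptions for illustration only).
TOY_FACTS = {
    "The dragon sang an aria.": {"dragon sings"},  # invented event
    "Romeo entered the tavern.": {"romeo enters tavern"},
    "Juliet waved from the balcony.": {"juliet waves"},
}

def toy_propose(story: List[str]) -> List[str]:
    return [line for line in TOY_FACTS if line not in story]

def toy_extract(line: str) -> Set[str]:
    return TOY_FACTS.get(line, set())

story = adapt({"romeo enters tavern", "juliet waves"}, toy_propose, toy_extract)
print("\n".join(story))
```

Note that this version treats the source as an unordered set of events, as described above; enforcing the order of events would need a stricter check in the acceptance test.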

The domain is play.aidungeon.io, and the GPT-3-based version is only available to sponsors right now.

After seeing that the domain name didn't work, I thought for a moment that your post was GPT-3 output (imaginary URLs are a good GPT-2 tell), but some research shows that there actually is a GPT-3 version:

https://medium.com/@aidungeon/ai-dungeon-dragon-model-upgrad...

It's only $10 to get access.