Hacker News
by unoti 1292 days ago
> It doesn't create anything new. It creates things that look new.

This is not technically true. It can and does create things that are new. There are lots of new poems and jokes right here in this thread. I asked it, for example, to give me its top 10 reasons why Bigfoot knocks on camper trailers, and one of its answers was "because it likes to play with its food." I did a lot of searching to try to find this joke out there on the internet, and could not. I've also had it create Weird Al style songs for a variety of things, and it does great.

If these aren't new creations, I'm not sure what your threshold is for creating something new. In a sense I can see how you can say that it only "looks" new, but surely the essays generated by students worldwide mostly only "look" new, too...

2 comments

ChatGPT created a poem to cheer up my sick girlfriend. I wrote a bit about how she feels, what she has (just the flu), and what I did to cheer her up. ChatGPT produced a decent poem that fit my description exactly but was a bit dramatic; she's not dying, just tired of being sick. I asked ChatGPT for a less dramatic version that rhymes more, and ChatGPT just did it. Amazing. I also googled parts of it but didn't find them! This certainly counts as novel, or else I would also be totally unable to create novel poems about my sick girlfriend (because I have read poems about girlfriends before?!).

A good idea when dismissing these machine learning models is to check whether a human would pass your standards. I miss that check whenever the dismissive "they only interpolate or memorise" arguments come up. I am also quite bounded by my knowledge and by what I have seen. Describe something I have never seen and ask me to draw it, and I would fail in a quite hilarious way.

Hilariously, ChatGPT is also quite bad at arithmetic, just like me. I thought this was what machines were supposed to be good at!

People solve this by getting the GPT to describe a series of computations and then running those steps externally (e.g. asking GPT what Python code to run).
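
A minimal sketch of that pattern, assuming the model is asked to reply with a plain arithmetic expression instead of a final number. `hypothetical_model_reply` is a stand-in I made up (it is not a real API and just returns a canned string); the evaluator walks the parsed expression and only executes basic arithmetic, so the model's text is never passed to `eval()`.

```python
import ast
import operator

# Operators the evaluator is willing to execute; anything else is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.FloorDiv: operator.floordiv,
    ast.Mod: operator.mod,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def evaluate_arithmetic(expression: str):
    """Evaluate a plain arithmetic expression without calling eval()."""

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        # Accept int and float literals only (bool and str are rejected).
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](walk(node.operand))
        raise ValueError("unsupported expression")

    return walk(ast.parse(expression.strip(), mode="eval"))


def hypothetical_model_reply(question: str) -> str:
    """Stand-in for a language model call (an assumption, not a real API)."""
    return "1234 * 5678 + 42"


if __name__ == "__main__":
    question = "What is 1234 times 5678, plus 42?"
    expression = hypothetical_model_reply(question)
    # The model only proposes the computation; Python does the arithmetic.
    print(expression, "=", evaluate_arithmetic(expression))
```

The split of labour is the point: the model translates the question into steps, and an ordinary interpreter carries them out exactly.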

That's not so different from how humans do this. When we need to add or multiply, we switch from freeform thought to executing the maths programs that were uploaded into our brains at school.

If I recall correctly, in his paper on whether machines could think, Turing gives an imaginary dialogue with a computer trying to pass as a human (what we later came to call the Turing test) where the judge poses an arithmetic problem, and the computer replies after a pause of 30 seconds, with the wrong answer.

That joke is a great example of why the creativity is surprising.

A human might have a thought process that starts with the idea that people are food for Bigfoot, and then connects that to the phrase "playing with your food".

But GPT generates responses word by word. And it operates at a word (token) level, rather than thinking about the concepts abstractly. So it starts with "Because it likes to play" which is a predictable continuation that could end in many different ways. But it then delivers the punchline of "with its food".

Was it just a lucky coincidence that it found an ending to the sentence that paid off so well? Or is the model so sophisticated that it can suggest the word "play" because it can already predict the punchline related to "food"?

I think what you are saying is just not true for GPT-style LLMs. The output is not just single-word generation, one word at a time. The model is indeed taking into account the entire structure, the preceding structures, and to a certain extent the abstractions inherent in that structure. Just because it tokenizes input doesn't mean it is seeing things word by word or outputting word by word. Transformers are not just fancy LSTMs. The whole point of transformers is that they take the input in parallel, whereas RNNs are sequential.

It seems I'd gotten the wrong impression of how it works. Do you have any recommendations for primers on GPT and similar systems? Most content seems to be either surface level or technical and opaque.

No. You got the right impression. It is indeed doing "next token prediction" in an autoregressive way, over and over again.

The best source would be the GPT-3 paper itself: https://paperswithcode.com/method/gpt-3
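
For what it's worth, the two views in this subthread fit together: output is produced one token at a time, but every choice is made with the whole preceding text in view. Here is a toy sketch of that loop. It is not a transformer; the "model" is just a counter over a tiny corpus I made up, and `next_token_counts` and `generate` are hypothetical names. It also always picks the most frequent continuation, where real systems usually sample.

```python
from collections import Counter

# Tiny made-up corpus; a real model learns from vastly more text.
CORPUS = [
    "because it likes to play with its food",
    "because it likes to play with its friends",
    "because it likes to play with its food",
    "because it wants to say hello",
]


def next_token_counts(prefix):
    """Count the tokens that follow `prefix` in the corpus.

    The whole prefix is tried first; if it never occurs, progressively
    shorter suffixes are used (the empty context matches everywhere).
    """
    counts = Counter()
    for start in range(len(prefix) + 1):
        context = prefix[start:]
        counts = Counter()
        for sentence in CORPUS:
            tokens = sentence.split() + ["<end>"]
            for i in range(len(tokens) - len(context)):
                if tokens[i:i + len(context)] == context:
                    counts[tokens[i + len(context)]] += 1
        if counts:
            break
    return counts


def generate(prompt, max_new_tokens=10):
    """Autoregressive loop: pick one token, append it, repeat."""
    tokens = prompt.split()
    for _ in range(max_new_tokens):
        counts = next_token_counts(tokens)  # conditioned on everything so far
        best = max(sorted(counts), key=counts.get)  # greedy, alphabetical tie-break
        if best == "<end>":
            break
        tokens.append(best)
    return " ".join(tokens)


if __name__ == "__main__":
    print(generate("because it"))
```

Each step emits a single token, yet what gets emitted depends on the full prefix, which is why "word by word" and "takes the entire structure into account" are not in conflict.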