Hacker News
by tempestn 1204 days ago
The "it's been done before" one is pretty relevant. It means the model didn't actually generate the game, but likely pulled it more or less straight out of its training data. It's still very cool that you can ask it for something and it can basically mine the entire (2021) internet for it, but it's not the same as being able to create something really new.

I've noticed the same thing testing it on various coding questions. It's extremely good at problems that have solutions online. And given stackoverflow, that's a lot of problems. If you manage to hit it with something that it hasn't seen before though, even if it's conceptually very straightforward, it tends to just generate a mix of boilerplate and nonsense.

5 comments

Exactly. When the first news came out about its ability to "understand" code, find bugs and improve upon it, I tested it with some snippets of mine. It just gave boilerplate best practices you find on hundreds of blogs, but was not able to make a meaningful contribution. It claimed to have introduced a feature while only having found another way to write the same snippet. On other things it straight up invented variables & functions that didn't exist.

As long as the task is in its training set, it can give you a decent answer, but it can't code; it just mimics doing so...

ChatGPT would be so amazing as a pair programmer if it didn't invent functions.

It's perfect for "what is a Python function for doing X?" But it's honestly 50/50 whether that function even exists.
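
For what it's worth, a cheap sanity check before trusting a suggested function is to see whether the name resolves at all. This is only a minimal sketch: `function_exists` is a hypothetical helper name of my own, it only handles names of the form `module.attr` (not builtins or methods on instances), and it says nothing about whether the function does what the model claimed.

```python
import importlib


def function_exists(dotted_name: str) -> bool:
    """Return True if dotted_name (e.g. "os.path.join") resolves to a callable.

    Hypothetical helper for illustration only; it imports the module,
    so only use it on names from packages you trust.
    """
    parts = dotted_name.split(".")
    # Try the longest importable module prefix first, then walk the
    # remaining parts as attributes of that module.
    for i in range(len(parts), 0, -1):
        try:
            obj = importlib.import_module(".".join(parts[:i]))
        except (ImportError, ValueError):
            continue
        for attr in parts[i:]:
            if not hasattr(obj, attr):
                return False
            obj = getattr(obj, attr)
        return callable(obj)
    # No prefix of the name could be imported as a module.
    return False


print(function_exists("json.loads"))      # a real stdlib function
print(function_exists("json.load_file"))  # plausible-sounding, but not in json
```

It won't catch a real function being called with made-up arguments, but it does filter out the names that simply don't exist.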

There are tons of examples from the training set that are awful. Most people will just eat them.
>The "it's been done before" one is pretty relevant.

But is it? 99.999% of software development has been done before. Even if you do something that is legitimately new (like creating a chatbot that can generate code on demand), your solution will still contain more than 99% code that is just a repeat of things that have already been done.

That's not my experience at all. Copilot consistently creates implementations that are very specific to my app and manages to understand the context and problem surface spanning many files. It's not just taking a standard problem and pulling an answer from Stack Overflow.
Given Copilot's specialization for this task, I can imagine it being better at extrapolating from your own code. I haven't used it myself yet, so I can't speak directly to its effectiveness. I would imagine it's good at automating much of the drudge work of coding, but similar to ChatGPT when it comes to finding novel solutions to problems. Which again, isn't to say it's not potentially a very useful tool!
Even if the rules were inspired by some text that's on the internet rather than a genuine invention (we'll never actually know, we're all just speculating): it hasn't "pulled it out of its training data".

To be asked in plain, simple (ish) English to invent a game, produce code for it and then style it etc and the few other bits the author asked for _is_ impressive.

Why are we asking for so much? Remember the chatbots of the mid-2000s? Eliza etc? They were impressive for the time but GPT represents a _huge_ improvement in this stuff. Of course it's not perfect, but it's an exhilarating jump in capabilities.

I definitely don't disagree that the progress has been incredible, and that GPT shows massive potential.
One could even argue it's a fancy UI that steals content from stackoverflow.