Hacker News
by mynameisvlad 1205 days ago
> The discussion was about the probability of ChatGPT having invented it; the probability that a description of such a game is in ChatGPT's dataset is extremely high.

If this were the case, it would have been trivial for you to find a game whose written rules match the one generated.

You have done nothing but say that is the case. You haven’t actually proven that’s the case.

ChatGPT can’t magically infer the rules of the game from screenshots, and you have only shown that similar games exist and have existed for centuries. But that is not the same as showing that this specific game has existed and that ChatGPT just pulled it out of its dataset.

That is the extraordinary claim that you don’t have evidence for but are acting like it’s right there obviously out in the open for everyone to see.


> If this were the case, it would have been trivial for you to find a game whose written rules match the one generated.

Search engines don't work like that. You are basically asking me for the equivalent of proving that a photo isn't depicting a ghost. No, I can't prove that; I can, however, come up with examples showing how the photo could have been created even if it wasn't a ghost.

If you want to prove that ghosts are real you need plenty of photos from lots of angles and situations, or videos, and from many sources to show that it isn't all made up by a single person. The equivalent here would be if they had made ChatGPT generate, say, 100 different working games; that would be much more believable. But a single case of a game that already exists and has countless texts describing similar games? It just looks like either a handpicked lucky result or plagiarism.

This isn't a court trial and I am not going to sue ChatGPT for plagiarism here; it is just a discussion of whether it is reasonable to believe ChatGPT can generate novel puzzle games.

Edit: But do note that since ChatGPT can find such ideas that are hard to find with a search engine, that makes ChatGPT very useful in a way search engines aren't. So I am not saying it doesn't add value. Just that people seem to say ChatGPT does a lot of things that it doesn't seem to be able to do.

Edit again:

> That is the extraordinary claim that you don’t have evidence for but are acting like it’s right there obviously out in the open for everyone to see.

Yes, you think it is obvious that ChatGPT is capable of very creative and productive thinking. But most people don't think that; to them, that is an extraordinary claim. I'm not here to convince you, I'm here to explain to you why you aren't convincing anyone with what you say. People like you were convinced by articles like this before the discussion even began.

> Search engines don't work like that. You are basically asking me for the equivalent of proving that a photo isn't depicting a ghost. No, I can't prove that; I can, however, come up with examples showing how the photo could have been created even if it wasn't a ghost.

The claim was that it pulled the game out of its dataset. If this were the case, I would argue it would absolutely be trivial to find them. It’s not some concept that can’t be described in words or would be hard to quantify. The rules have been provided, and, assuming they were plagiarized from somewhere else, would be listed verbatim or close to it.

If a student plagiarized their work, whether in written form or in code, it’s been trivially easy to find the exact work that was copied. It generally takes me a few seconds of searching to find it.

This is the same. If these rules existed in a dataset, then it should be equally easy to pull them up and prove the plagiarism. If all you can find is similar puzzles, you can’t just throw your hands up and say “yep, gottem”. That’s just not how this works.

> The claim was that it pulled the game out of its dataset. If this were the case, I would argue it would absolutely be trivial to find them. It’s not some concept that can’t be described in words or would be hard to quantify. The rules have been provided, and, assuming they were plagiarized from somewhere else, would be listed verbatim or close to it.

ChatGPT uses word vectors; it won't use the same words but variants of them. You can't search for that. Cases where the word vectors map to a single word with no variations, for every word, are very rare, so ChatGPT is very good at plagiarising things without reproducing them exactly; it only rarely fails at that.

> If a student plagiarized their work, whether in written form or in code, it’s been trivially easy to find the exact work that was copied. It generally takes me a few seconds of searching to find it.

No, it isn't; they just change the words and rewrite it until it no longer looks the same. ChatGPT is trained to rewrite texts like that to avoid triggering trivial plagiarism detectors. They train it to produce the same text but with different words; producing exactly the same text is punished.

> No, it isn't; they just change the words and rewrite it until it no longer looks the same. ChatGPT is trained to rewrite texts like that to avoid triggering trivial plagiarism detectors. They train it to produce the same text but with different words; producing exactly the same text is punished.

Do you think students plagiarizing don’t do the exact same thing? Clearly someone has never actually dealt with plagiarized work. This is plagiarizing 101. The structure remains the same even if they use synonyms. Considering it’s trivially easy to find in code, where plagiarism is orders of magnitude harder to pull off, I would still argue it should be easy as pie to find this supposed set of rules.

Your point is not very credible without proof of this game existing and ChatGPT pulling it from this source. Without showing that this supposed proto-game existed with rules that ChatGPT could pull from, all you’ve done is wave your hands around and yell “similar games exist so this can’t possibly be uniquely generated”, and that’s not a very compelling argument.

> Do you think students plagiarizing don’t do the exact same thing? Clearly someone has never actually dealt with plagiarized work. This is plagiarizing 101. The structure remains the same even if they use synonyms.

You rewrite the structure of the text, you don't just use synonyms. ChatGPT is capable of rewriting text to a different structure while keeping the meaning, I hope you are aware of that.

Anyway, even if you just change the words to synonyms it won't be easy to find in a search engine. Search engines aren't very good at finding matches to synonyms. Google tries, but in doing so they fail to find more specific texts like scientific publications or documentation, so no, search engines aren't good at finding plagiarism.

Edit: And you make it sound like most plagiarism is found. No, that isn't the case; most plagiarism is not found out because it is a very hard problem to solve. Only the most blatant cases are caught. For humans that is reasonable; for AI we can be stricter since there isn't a human's career at stake.

> Anyway, even if you just change the words to synonyms it won't be easy to find in a search engine.

Got it, so you’ve never actually dealt with plagiarized work. You should have just led with that.

I have literally said, from actual experience, that this is the case. But I guess discarding that, pretending it was never said, and acting as if the opposite is true is, I’m sure, an easier position to hold.