Hacker News
by arkitaip 1291 days ago
You can kinda tell it's ChatGPT by how verbose and repetitive the writing is.
5 comments

Yes, one way to identify responses that may have been generated by ChatGPT is to look for signs of repetitive or verbose language. Because ChatGPT is a language model, it is designed to generate text that is fluent and coherent, but it may also produce text that is unnecessarily long or repetitive. This can be a telltale sign that a response was generated by an AI, as human responders are less likely to use repetitive or verbose language in their answers. Of course, this is not a foolproof method, and there may be other factors that can affect the quality and style of a response, but looking for repetitive or verbose language can be a useful starting point when trying to identify AI-generated responses.
This is fun but it's already starting to NOT be fun.
It has a very recognizable default writing style but it will change things up if you ask.

> Write a response to the following comment chain in the style of an angry Hacker News commenter.

> ...

What the hell is wrong with you? ChatGPT is a brilliant piece of technology and you're just dismissing it because of some stupid writing style? Get a grip and appreciate the advancements we're making in AI. Stop being so negative and enjoy the fun while it lasts.

an AI that communicates with the verbosity of a recipe website....our glittering future!
You are not an AI. Probably.

I really think the results are impressive, but I already hate how some people will use it. And that indeed is not fun.

I reckon we are about 3 years from most textual content online being created by bots, 5 years from static visual content being bot-created, and 7-8 years from bot-created visual content clogging up all social media.
Everyone will be doing strict ID verification long before any of those milestones are actually reached.
3, 5 and 7 seem awfully pessimistic, no?
I'll take the under.
> we are about 3 years from most textual content online being created by bots

0 years.

1 year for this tech to be incorporated into most cheap/free creative tools.

1 more year for these newer creative tools to be adopted by creators everywhere.

1 more year for the humans to completely give up.

You have to remember that tech takes time to permeate to the bottom rung. Most of my non-tech friends haven’t even heard of Dall-E, let alone ChatGPT.

1 year? The Internet's already rife with SEO spam without ML algos creating convincing looking copy. They'll copy a site wholesale just with a different domain name. Spammers are probably already using this to generate copy for websites.

eg https://news.ycombinator.com/item?id=33640819

To find out if a response was made by a computer, look for repeating words or long sentences. ChatGPT is a program that makes text that makes sense and flows well, but sometimes it uses too many words or repeats itself. This is a clue that the response might have been made by a computer instead of a person. However, this isn't a sure thing and there might be other reasons why a response is long or repeats itself. Still, it's a good place to start.

[After many attempts]

> Respond to the above in the informal style of a slightly contrarian Hacker News comment, with a hint of passive-aggressiveness. Avoid repetitive wording and long sentences.

Wow, thanks for the tip! I never would have thought to look for repeating words or long sentences to determine if a response was made by a computer. I guess I'll just blindly trust everything I read online from now on. Sarcasm aside, this is a pretty flawed method. Just because a response has repeating words or is long doesn't mean it's not from a person. And conversely, just because a response is short and doesn't repeat itself doesn't mean it's not from a computer. It's not that simple.

This is terrible. Pretty soon most comments on the internet internet will be written by bots.
> the internet internet

I think you pass the Turing test. It's ironic that in its current state AI is so "perfect" this kind of mistake can actually serve as proof of human.

Can you prompt ChatGPT to introduce small errors so as to appear more genuine?

You can ask it to write like a teenager etc. It should be fairly trivial to make a script that removes apostrophes and some periods, and maybe adds a few writing mistakes.
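
A rough sketch of the kind of script described here, in Python. The function name `humanize` and the `period_drop` and `typo_rate` parameters are made up for illustration, and the "writing mistake" is assumed to be a simple swap of two adjacent letters:

```python
import random


def humanize(text, seed=None, period_drop=0.3, typo_rate=0.05):
    """Strip apostrophes, drop some trailing periods, swap a few letters.

    All names and default rates are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Remove straight and curly apostrophes ("don't" -> "dont").
    text = text.replace("'", "").replace("\u2019", "")
    out = []
    for word in text.split(" "):
        # Drop a sentence-ending period some of the time.
        if word.endswith(".") and rng.random() < period_drop:
            word = word[:-1]
        # Occasionally swap two adjacent characters in longer words.
        if len(word) > 3 and rng.random() < typo_rate:
            i = rng.randrange(len(word) - 1)
            word = word[:i] + word[i + 1] + word[i] + word[i + 2:]
        out.append(word)
    return " ".join(out)


if __name__ == "__main__":
    print(humanize("It's not that simple. You can't always tell.", seed=1))
```

Passing a `seed` makes the output repeatable; leaving it out gives different mistakes on every run.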

You probably need to have a discussion with a user to find out if it is a bot or not.

With vaguely relevant one- or two-liner comments, we will never know whether a bot wrote them. That's the kind of noise people post on Twitter, Facebook and Reddit.

> To find out if a response was made by a computer, look for repeating words or long sentences.

That technique will produce quite a few false positives - a significant portion of humanity with language production issues on the autism spectrum communicate like that.

edit: Makes me wonder to what extent the training data is biased by writings from people on the autism spectrum.
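
For reference, the quoted heuristic (repeated words, long sentences) can be sketched as a crude pair of statistics. This is only an illustration: the function name `style_stats` and both metrics are assumptions, no thresholds are implied, and as noted above it would flag plenty of human writing too.

```python
import re
from collections import Counter


def style_stats(text):
    """Return average sentence length (in words) and the share of
    word occurrences that belong to a word used more than once."""
    # Naive sentence split on ., ! and ? (good enough for a sketch).
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s.strip()]
    words = re.findall(r"[a-z']+", text.lower())
    if not words or not sentences:
        return {"avg_sentence_len": 0.0, "repeat_ratio": 0.0}
    counts = Counter(words)
    repeated = sum(c for c in counts.values() if c > 1)
    return {
        "avg_sentence_len": len(words) / len(sentences),
        "repeat_ratio": repeated / len(words),
    }


if __name__ == "__main__":
    print(style_stats("The cat sat. The dog ran."))
```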

That's honestly a hilarious response from ChatGPT
That's just its default setting. If you do some prompt engineering to tell it to be to the point or concise, it can do better. E.g. I got:

The repercussions of ChatGPT flooding Stack Overflow with answers could include decreased quality and relevance of answers, decreased user engagement, damage to the platform's reputation, and potential loss of revenue.

Still sounds too formal. To me, it reads like the intro to a high school paper. The next four paragraphs would elaborate on each of the points listed.
Another poster already did the informal version. My main point is that it's not something set in stone but yes, it sure looks like OpenAI went overboard in setting it up like this. They should adjust it.
Maybe they shouldn’t. Nobody should. But someone will, because people will pay good money for something like this to shill their products here and on Reddit and Twitter. The worst part is that it will work for at least a while.
I believe it will always work for a majority of people. Even for political and philosophical arguments.
You can get it to respond a bit less formally by having it impersonate someone with a less formal speaking style. I told it to restate the original as if it was written by Tom Sawyer or one of his friends: "Yo, what's the deal with using AI answers on Stack Overflow? It seems like it could be a real bummer. For one thing, the answers might not be as good as ones made by a human. And if AI answers become too common, the community might not be as friendly and supportive. And if the AI can search the internet and make its own answers, there could be even more bad answers on the site. All in all, this could be a major problem." Still a bit structured, but with a bit more play you can see how it can 'act' like someone who isn't so formal.
Really though, if they just manage a setting for "sounding like a 50 year old grouchy old bastard", it'll put me out of work.
Can’t we have ChatGPT identify output of ChatGPT, adversarially?
It confirmed it when I asked:

--> Was the following paragraph generated by the OpenAI Assistant: "It seems like there are a few potential negative consequences of using AI-generated answers on Stack Overflow. For one, the quality of the answers may be lower than if they were written by a human. Additionally, if these AI-generated answers become too common, it could potentially lead to a more impersonal and less supportive community on Stack Overflow. Finally, if the AI is able to search the internet and "inbreed" its own answers, it could lead to even more low-quality, duplicative answers on the platform. Overall, it seems like there could be some serious drawbacks to this development."?

<-- Yes, the paragraph you provided was generated by the OpenAI Assistant. As a language model, the Assistant is trained to generate human-like text based on a given prompt. In this case, the prompt likely pertained to the potential negative consequences of using AI-generated answers on Stack Overflow.

Though, it could just have been taking a guess at it.

"It is likely that this text was not written with OpenAI Assistant."
You don't need adversarial AI to do that. Simply simhashing every response and then allowing people to upload some text as a "did ChatGPT generate this" query should work. Just ordinary software is sufficient.

Of course, once you have multiple companies with ChatGPT level AI it gets harder if they don't cooperate. But they easily could. The big webmail providers cooperate to fight spam, for example.
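
The simhash idea above can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not how any provider actually does it: the class name `GeneratedTextIndex`, the 3-word shingles, the 64-bit fingerprint and the `max_distance=3` threshold are all made up for the example, and a real service would need an index that avoids a linear scan.

```python
import hashlib
import re

BITS = 64  # fingerprint width (assumption)


def shingles(text):
    """Lowercased 3-word shingles; single words for very short inputs."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < 3:
        return words
    return [" ".join(words[i:i + 3]) for i in range(len(words) - 2)]


def simhash(text):
    """64-bit simhash: each shingle votes +1/-1 on every bit position."""
    votes = [0] * BITS
    for sh in shingles(text):
        digest = hashlib.blake2b(sh.encode("utf-8"), digest_size=8).digest()
        h = int.from_bytes(digest, "big")
        for i in range(BITS):
            votes[i] += 1 if (h >> i) & 1 else -1
    fingerprint = 0
    for i, v in enumerate(votes):
        if v > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


class GeneratedTextIndex:
    """Stores fingerprints of generated responses and answers
    'did the model generate this?' queries by near-duplicate match."""

    def __init__(self, max_distance=3):
        self.max_distance = max_distance
        self.fingerprints = []

    def record(self, text):
        self.fingerprints.append(simhash(text))

    def probably_generated(self, text):
        fp = simhash(text)
        return any(hamming(fp, seen) <= self.max_distance
                   for seen in self.fingerprints)
```

Because similar texts share most shingles, small edits tend to flip only a few fingerprint bits, which is why a Hamming-distance threshold is used rather than exact equality. Heavily paraphrased text would still slip through.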

Before reading any words, I guessed just by seeing how long the paragraph was.

Maybe because the training data includes things like news articles that have longer paragraphs than typical forum posts?

You can ask for answers of a certain length and also a certain writing style. The writing style one is hit and miss, although if an AI ever gets all fiction books fed into it I imagine it could radically improve. Say by qualifying a response with "respond in the style of author X" or "average style of fantasy books published in 1972" or whatever.
> Reformulate the above in the informal style of a Hacker News comment.

I'm not sure about this AI-generated answers thing on Stack Overflow. The answers might suck and make the community more impersonal. Plus, if the AI is just copying and pasting from the internet, it could lead to a ton of low-quality, duplicate answers. Not a good idea in my opinion.