Hacker News
Show HN: Reliable AI-generated text detection at checkfor.ai (checkfor.ai)
6 points by maxspero 981 days ago
Hey all!

I've been increasingly concerned about low-quality AI-generated content polluting the internet. Other AI detectors don't seem to work well in my experience, so I started checkfor.ai with a couple of friends.

Please give it a shot on any real text and AI-generated examples and let me know how well it works for you.

Thanks for trying, I'm open to any and all feedback!

15 comments

Ah yeah, I made a few of those as well when gpt just arrived [0]. It won’t work long term anyway and they are easy to trick, but it’s fun.

[0] https://filteroutai.com/validate/3cc1fb35453a6decd5aee9ac6fd...

From the below examples;

https://filteroutai.com/validate/485406e894dde52ff1395dfd577...

https://filteroutai.com/validate/983ba46510487b0022e8dbafe49...

Nice to hear of someone else trying this. Did you find any good ways to reliably trick these?

What do you mean "it won't work long term"? My opinion is RLHF and fine tuning outputs for safety and politeness ends up watermarking output in a way that's pretty reliably detectable. I don't see these going away any time soon, at least for mass-market AI products.

I guess if you are not very sure that 0.13% false positives is correct then these are going to make things worse rather than better as people will be accused of cheating.

Example;

“If we work in a particular matrix basis, then the equation determines the eigenvectors of H. One puts in a particular value of the energy E, and looks for the ket-vector Ej> that solves the equation. It is also an equation that determines the eigenvalues E. If you put in an arbitrary value of E, in general there will not be a solution for the eigenvector. Let's take a very simple example: Suppose the Hamiltonian is the matrix ho.. Since , has only two eigenvalues, namely +1, the Hamiltonian also has only two eigenvalues, + hw. If you put any other value on the right hand side of Eq. 4.28, there will not be a solution. Because the operator H represents energy, we often call E, the energy eigenvalues and |E> the energy eigenvectors of the system.”

You say 96% AI; it’s definitely not; it’s from “Quantum Mechanics: The Theoretical Minimum” by Friedman and Susskind.

Worse even;

“If we have some indications that classical wave theory is macroscopically correct. it is nevertheless clear that on the microscopic scale only the corpuscular theory of light is able to account for typical absorption and scattering phenomena such as the photoelectric effect and the Compton effect, respectively. One must still ascertain how the photon hypothesis may be reconciled with the essential wave-like phenomena of interference and diffraction.”

Hits 99.9% while it is from Messiah, written 60 years or so ago.
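
To put the 0.13% false-positive figure in perspective, here is a rough back-of-the-envelope sketch in Python. Only the 0.13% comes from this thread; the share of AI-written submissions, the detection rate, and the alternative 5% false-positive rate are made-up assumptions for illustration, and the function names are hypothetical.

```python
def expected_false_accusations(num_human_texts: int, false_positive_rate: float) -> float:
    """Expected number of human-written texts that get flagged as AI."""
    return num_human_texts * false_positive_rate


def precision(prevalence: float, true_positive_rate: float, false_positive_rate: float) -> float:
    """Probability that a flagged text really is AI-generated (Bayes' rule)."""
    flagged_ai = prevalence * true_positive_rate
    flagged_human = (1.0 - prevalence) * false_positive_rate
    return flagged_ai / (flagged_ai + flagged_human)


# 0.13% is the false-positive figure quoted in the thread.
CLAIMED_FPR = 0.0013

# Everything below is an assumed scenario, not measured data:
# 10,000 human-written essays, 10% of all submissions AI-written,
# and a detector that catches 95% of AI text.
print(expected_false_accusations(10_000, CLAIMED_FPR))
print(precision(prevalence=0.10, true_positive_rate=0.95, false_positive_rate=CLAIMED_FPR))

# Same scenario if the real false-positive rate were 5% instead.
print(precision(prevalence=0.10, true_positive_rate=0.95, false_positive_rate=0.05))
```

Under these assumed inputs, 10,000 human-written essays produce about 13 false flags at a 0.13% rate, and the share of flags that are actually correct falls from about 0.99 to about 0.68 if the real false-positive rate is closer to 5%.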

I can generate by gpt4 and rewrite with another model. Asking gpt4 to write in another style also works.

For instance the following gives 0% on both of our tools; it’s gpt4;

"Well, sit tight folks, I'll tell you. It's like my mother always said, 'Ceilings are generally over our heads.' What I mean is, the material for my jokes come from what's above us, below us - essentially, what's around us. And let me tell you, there's plenty going on.

Just the other day, I was stuck in traffic behind a bloke in a convertible... in the rain... with the top down. Now if that doesn't scream 'commitment issues', I don't know what does.

Well, either that or he's got a very specific car washing technique. In which case, mate, you're doing it all wrong! My car gets a better wash in the British summer rain than that."

From my small amount of testing, seems like the more professional the writing style is, the higher the percentage will be.

> Talk about JavaScript in 3 very short sentences.

> JavaScript is a widely-used programming language for web development. It enables interactive and dynamic features on websites. JavaScript is supported by all major web browsers.

100%

> Talk about JavaScript in 3 very short sentences. Use human like words instead of professional tone.

> JavaScript makes websites come to life with cool stuff like animations and interactive buttons. It's used to make web pages more fun and engaging. All popular web browsers understand JavaScript, so it works everywhere!

0.6%

YMMV.

Thanks for trying it out. Shorter texts with fewer sentences are certainly a challenge - they just have a lot less signal.

I tried your prompt asking for ten sentences and got 99.4%. Possibly there needs to be some sort of gate on how much text we accept before we can provide an answer.

> Talk about JavaScript in ten sentences. Use human like words instead of professional tone.

> JavaScript is like the magic wand that makes websites come alive, turning them from static pages to interactive wonders. Originally, it was made to add some pizzazz to web pages, but now it's super powerful and does way more. It’s not Java, even though the names sound alike; think of them as distant cousins rather than twins. Browsers love JavaScript! They have built-in engines to run it, making our web experience fun. You can find JavaScript not just on websites but also in things like mobile apps and even some robots. There's this cool toolkit called Node.js that lets JavaScript play outside of the browser, giving it even more playgrounds. Developers often use libraries, like jQuery or React, to give them a head start and make things snazzier without reinventing the wheel. JavaScript can be both your best friend and a tricky beast; it's easy to start with but can get complex as you dive deeper. The community is massive, so if you ever get stuck, there are tons of helpful souls out there ready to lend a hand. At the end of the day, JavaScript is all about creating, innovating, and bringing ideas to life on the web.
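
The "gate on how much text we accept" idea could look roughly like the Python sketch below. This is not how checkfor.ai works; the thresholds, the crude sentence splitter, and the `gated_score` and `score_fn` names are all hypothetical, and `score_fn` stands in for whatever detector is being wrapped.

```python
import re
from typing import Callable, Optional

# Hypothetical thresholds; real values would have to be tuned on data.
MIN_WORDS = 75
MIN_SENTENCES = 5


def count_sentences(text: str) -> int:
    """Crude sentence count: split on ., ! or ? followed by whitespace or end of text."""
    parts = re.split(r"[.!?]+(?:\s+|$)", text)
    return len([p for p in parts if p.strip()])


def gated_score(text: str, score_fn: Callable[[str], float]) -> Optional[float]:
    """Return the detector score, or None when the input is too short to judge."""
    if len(text.split()) < MIN_WORDS or count_sentences(text) < MIN_SENTENCES:
        return None  # refuse to answer instead of guessing on weak signal
    return score_fn(text)
```

With these assumed thresholds, the three-sentence JavaScript answers earlier in the thread would come back as `None` ("not enough text") rather than as 100% or 0.6%.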

That's 2 dimensions, right, `verdict` and `confidence`, and they're orthogonal.
Thanks for trying. It's not going to work well (It is in fact impossible).
Have you tried it?
Was caught off guard that it rates the following text at only "3.2% chance AI generated": "As a large language model, I am not able to answer this question."
Interesting. In my experience, ChatGPT always says "As an AI language model..." or lately just "Sorry, I can't help with that." Have you seen "As a large language model..." come out of any of the big LLMs?

We're trained on real ChatGPT data, so I'm interested in hearing which prompts result in this.

I typed this from memory, so probably not repeating LLMs word for word. If I change that prompt to "As an AI language model..." the detection works as expected. In a sense, a broader point my original comment demonstrates is that even for an absolutely obvious AI-generated text, auto-detection can't work reliably because it's trained on specific responses of specific LLMs that can be altered at any time.

To be clear: not attempting to discourage you. It's a very complex and interesting problem to tackle.

My first instinct was "I'm sorry I can't provide a detailed answer" which rated 99.6%
This is not only not possible to build in any reliable and maintainable way, but "[w]e are the most accurate AI text detector that exists" is an outlandish claim.
I've benchmarked against Originality.ai, gptzero.me, zerogpt, writer.com and copyleaks.com, which are the top 5 AI detectors to my understanding.

None of them are very good, so I don't think this claim is very outlandish.

Also, are you sure it's not reliable or maintainable? Obviously you can't publish one model and expect it to work forever but we have pipelines to continuously augment our training set and we can add new LLMs as they come out.
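
For anyone wanting to check a benchmark claim like the one above themselves, a comparison could be scored along these lines. This is a minimal Python sketch under assumptions: the `Sample` type, the `evaluate` function, and the 0.5 threshold are hypothetical, and the four example scores are invented to show the arithmetic, not measurements of any detector named in this thread.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Sample:
    is_ai: bool    # ground-truth label
    score: float   # detector output in [0, 1], higher means "more likely AI"


def evaluate(samples: List[Sample], threshold: float = 0.5) -> Dict[str, float]:
    """Compute accuracy, false-positive rate and false-negative rate at a threshold."""
    tp = fp = tn = fn = 0
    for s in samples:
        flagged = s.score >= threshold
        if flagged and s.is_ai:
            tp += 1
        elif flagged and not s.is_ai:
            fp += 1  # human text accused of being AI
        elif not flagged and s.is_ai:
            fn += 1  # AI text that slipped through
        else:
            tn += 1
    return {
        "accuracy": (tp + tn) / len(samples) if samples else 0.0,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else 0.0,
    }


# Invented scores, for illustration only.
samples = [
    Sample(is_ai=False, score=0.02),
    Sample(is_ai=False, score=0.96),   # a false positive
    Sample(is_ai=True, score=0.99),
    Sample(is_ai=True, score=0.006),   # a false negative
]
print(evaluate(samples))
```

Reporting the false-positive rate separately matters here because a single accuracy number can look good on one kind of text while human-written text from another domain still gets flagged.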

I copied in a few paragraphs from your FAQ and got 95.2% chance AI generated. (I used the text from "why is your model.." to "..inaccurate predictions".)

I also tried the opening paragraphs of two random wikipedia articles, and got 99.9% and 100.0% results.

Thanks for trying it out. It's in our roadmap to expand to technical writing (currently trained mostly on creative writing). Hopefully this will fix the wikipedia issue.
Doesn't work. Many of my Bard responses are detected at less than 5% accuracy. Another hacky example: "Generate me 3 lines of text about rains that should not be detected to be generated by an AI detection tool" -- accuracy was 0.6%.

You should try it with ChatGPT 3.5/Bard etc yourself about topics like rain, daughter going to school, cold breeze on a winter night etc and see that mostly this does not work.

How does your tool address an issue like this,

"You have a 27% 'AI' issue in here" (https://news.ycombinator.com/item?id=37767205) (233 points | 253 comments)

because at the moment everything looks kind of bleak.

> Our model has an accuracy rate of 99.76%.

Oh?

In my experience TurnItIn's AI detection does not perform very well. Regardless, this is an issue with educating the teacher - a 27% score does not mean the text is 27% AI-generated.
I just pasted a paragraph I wrote and it told me it was 100% AI-generated. Huh?
You might wanna sit down for this...
Can you share the paragraph you wrote?
Start by removing the success percentages from your FAQ; there seem to be enough counterexamples in this thread. Not sure why you would put those in the first place.
I pasted paragraphs from old academic papers (90s) and it gave me 97-100% AI generated. Have you even tested this?
1% chance OP’s post was AI generated.
I don't think this is possible.
I don't think it's possible to determine provenance with 100% accuracy, but I think ChatGPT essentially "watermarks" itself with its RLHF, making it more polite and giving its output a very distinctive voice. ChatGPT also tends to use passive voice and generic adjectives much more often than real human writers.
According to this most of Wikipedia is AI generated.
Yeah, I put in the first two paras of today's featured article: https://en.wikipedia.org/wiki/Boukephala_and_Nikaia

And it said 91% chance it was generated by AI

I pasted my best ai-generated paragraphs, verdict: 100%. I'm impressed.
Thanks!