by equestria 586 days ago
I'm not sure this is a useful test. You can most certainly get an LLM to infinitely "correct" or "improve" its own output. But take the "The work uses graphs..." paragraph and plop it into an AI text detector like Quillbot. It's a long and non-generic snippet of text, and it will score 100% AI. This is not something that happens with human writing. Sometimes, you get false positives on short and generic text, sometimes you get ambiguous results... but in this case, the press release is AI.
1 comment

I have no doubt the author of the press release used an LLM to help them, but I'm not convinced that it was fully generated by AI. Since you got me thinking about this more, I decided to run the text through my tool with a new prompt that asks the LLM to decide. Both Claude and Llama believe there is a 55% or greater chance that it is AI-generated, while GPT-4o and GPT-4o-mini put it below 55%.

https://app.gitsense.com/?doc=381752be7fd0&prompt=Is+AI+Gene...

Edit:

I created another prompt that tries to analyze things more carefully, and all of the models agree that it is most likely AI (60% or higher). The highest was gpt-4o-mini at 83%.

https://app.gitsense.com/?doc=381752be7fd0537&prompt=Is+AI+G...