Hacker News
by tolerance 25 days ago
I've warmed to LLM-generated/assisted writing in general but this kind of stuff is just lazy and is basically "I got Claude to say something I agree with and then made it pretty".
6 comments

A browser plugin that scores webpage content based on how likely it is to have been AI-generated would be quite useful.

Browser vendors can't build this.

> A browser plugin that scores webpage content based on how likely it is to have been AI-generated would be quite useful.

I am strongly against this, because you cannot accurately detect it. People will get blamed even more often when they did not actually use AI.

Nothing new under the sun unfortunately. It’s just an easy way to dismiss people you don’t want to listen to, and people abuse it like crazy.
This is virtually impossible to build. Not just because all current "AI detector" systems are fake or outright scams with accuracy comparable to a coin-flip on frontier model output, but because even if someone did build a reliable detector and released it to the public, it could be used for adversarial training and it would become worthless pretty fast.
Pangram is legit. I don't work at Pangram, but we integrated it into our paper website, and one of the cool emergent behaviors I've seen is that on AI papers with example rollouts, it will accurately mark the paper's main text as human-generated and the rollouts as AI-generated.

My understanding is that they strongly believe in no false positives, so it's definitely possible to slip something by them but if it marks something as AI, it very likely is.
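
To put rough numbers on the "if it marks something as AI, it very likely is" reasoning, here is a minimal sketch using Bayes' rule. Everything in it is an assumption for illustration: the function name `precision`, the 90% detection rate, the 20% share of AI-written text, and the false positive rates are hypothetical and are not measurements of Pangram or any other detector.

```python
def precision(tpr: float, fpr: float, ai_share: float) -> float:
    """Share of flagged texts that really are AI-written (Bayes' rule).

    tpr      -- fraction of AI-written texts the detector flags
    fpr      -- fraction of human-written texts the detector flags
    ai_share -- fraction of all texts that are AI-written
    """
    flagged_ai = tpr * ai_share
    flagged_human = fpr * (1.0 - ai_share)
    return flagged_ai / (flagged_ai + flagged_human)


# Hypothetical inputs only: compare how a flag's reliability changes
# as the false positive rate grows.
for fpr in (0.0001, 0.01, 0.05):
    print(f"FPR {fpr:.2%}: precision {precision(0.90, fpr, 0.20):.1%}")
```

The point of the sketch is only that a flag's reliability depends on the false positive rate and on how much of the input is AI-written to begin with, so a "no false positives" claim is one that needs supporting data.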

> My understanding is that they strongly believe in no false positives

Who cares what they "believe" (or, more accurately, say they believe). What are the underlying processes that actually guarantee this, and what data supports it?

What is a rollout in this context?
> Pangram is legit.

Their 99.98% accuracy claim[1] makes me doubt that.

[1]: https://www.pangram.com/solutions/chrome-extension

Rather obviously they're choosing the one that makes them look best. Another they link to¹ shows 98% for example.

Much more importantly, 9/10 dentists agree it's the best.

1: https://arxiv.org/pdf/2501.15654, linked from² https://www.pangram.com/blog/third-party-pangram-evals (the second section)

2: the third study they link there is based entirely around the assumption that Pangram is correct, and seems to have been a collaboration or something as they're included in the credits area.

AI is very hard to detect and changes on a weekly basis.

But you could build something that ranks the quality of the webpage content! This would also be more useful.

Of course, that tool would have to use AI...

Bot detectors are broken. Even human bot detectors are broken. When I'm in the right mood, I can be quite capable of writing with very good formatting, structure, and phrasing. When I actually take the time to do this, there seems to be about a 70% chance that some nimrod will crawl out of the woodwork just to accuse me of being a bot.

Even humans who deliberately use lazy formatting and leave obvious errors uncorrected to provide "proof" of being human aren't seeing the big picture, here.

---

That bigger picture is that it's easy to instruct a bot to be lazy, or to avoid the usual quirks. I hate when I'm working on a project and see a constant outflow of negation ("Don't do x, y, or w" is a recent hit) and unfounded exclusive confidence ("The correct answer" as if this is Highlander and there can be only one). Repetitious jargon like overuse of "gate" for things other than fences and skiing is something I can't stand. Plus the usual things — like overuse of unusual punctuation — that are obvious tells.

That stuff all drives me nuts.

But the bot just follows instructions, and my bot has been instructed to avoid those things. It generally performs very well, though the instructions do need re-hashed every now and then as models ebb and flow.

It's super easy to get the bot to write some Python or Perl that takes a body of text and intentionally misspells some words or loses a comma while making other errors and converting — into --.
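
For a sense of how little code that takes, here is a minimal Python sketch of that kind of script. It is an illustration under my own assumptions, not anyone's actual tool: the function name `roughen` and the `seed` and `typo_rate` parameters are hypothetical, and the only "errors" it makes are swapping two adjacent inner letters, dropping one comma, and turning the long dash (U+2014) into two hyphens.

```python
import random
import re


def roughen(text: str, seed: int = 0, typo_rate: float = 0.1) -> str:
    """Return a copy of text with a few deliberate, human-looking slips."""
    rng = random.Random(seed)  # seeded so the same input gives the same output

    # Replace the long dash (U+2014) with a plain double hyphen.
    text = text.replace("\u2014", "--")

    # Drop one randomly chosen comma, if the text has any.
    commas = [i for i, ch in enumerate(text) if ch == ","]
    if commas:
        i = rng.choice(commas)
        text = text[:i] + text[i + 1:]

    def typo(match: re.Match) -> str:
        """Swap two adjacent inner letters in some words of 5+ letters."""
        word = match.group(0)
        if len(word) < 5 or rng.random() >= typo_rate:
            return word
        j = rng.randrange(1, len(word) - 2)  # keep first and last letters fixed
        return word[:j] + word[j + 1] + word[j] + word[j + 2:]

    return re.sub(r"[A-Za-z]+", typo, text)


if __name__ == "__main__":
    print(roughen("Honestly, the formatting \u2014 and the punctuation \u2014 look suspicious."))
```

Raising `typo_rate` makes the output sloppier, and changing `seed` changes which comma and which letters get hit.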

When it comes to human error in written language, we just aren't that hard to emulate.

Now, that all said: You'll just have to take my word for it, but I do not use the bot to help with writing English. But I do have every confidence that if I woke up tomorrow and actually started bulking up my comments using a bot, none of you would be able to tell.

Everyone has failed to build this. They can only sell claims that they have built it to fools.
I work somewhere that tries to do such detection (for fraud prevention) and it sort of feels impossible to me in the medium term. AI slop qualities are fleeting - I’ve seen Reddit AI posts that have misspelled words, no dashes, stilted sayings and so on.

People want their slop to be undetectable.

Check out Pangram
generate a story where there is not much of a story. What is unfortunate is this has gotten upvoted and is now part of the noise.
I’ve definitely learned lately to basically never trust an LLM sending back to me “other people have reported the same issue.” It means it in the most literal sense, as in it went online and found somebody who said something similar to what I am looking into. It has no ability to determine validity, proportion, relevance, etc.
AI didn't start this, journalists have been using wordplay to "technically tell the truth" forever.
Did not strike me as AI-written. But it's useless to try to distinguish. There is only good writing and shite writing. (With things like "accuracy" and "verifiability" and even "awareness of adjacent context" included in my definition of "good.") The article is reasonably good and your comment I'm afraid is fairly shite.
I disagree. There is more content out there than I can read in many lifetimes, so I have to be selective. LLM-generated text (like any text) can be well put together on the surface level but require deeper consideration to see the flaws, and of course this takes more effort than the writing did.

A human-written piece indicates someone believes in it enough to put in enough effort to write it up nicely, so it works as a heuristic of underlying quality.

All true, but how do you distinguish human-written from AI-written or a hybrid? They all have an author's name attached. You would have to limit your reading to people you know personally. (Which isn't a terrible idea actually.) Otherwise it's a judgment call, which inevitably comes down to a question of writing quality. "This has to be AI because it's so terrible." But humans are perfectly capable of writing terribly (that is in fact where the LLM learned it) and LLMs can even write well occasionally, including with human intervention. So I decided that if I'm using quality as a proxy to guess at authorship, why not just forget authorship and make quality primary. Basically since authorship is unknowable I'm declaring it irrelevant. It's not ideal but these are the times we're living in.
How is this relevant to the article?
The article was written by putting a premise into a prompt for Claude, which then wrote the whole thing.
What do you think about the contents?
Unless there's evidence that all of it was fact-checked, it's a waste of time to look harder. You can get any output you like, it doesn't mean it's correct.
How different is this from humans? In my experience people fill in the blanks quite the same, they just do it less convincingly and sometimes maliciously. Having the sort of prejudice you describe against AI content doesn't make sense if you consider humans make mistakes and lie all the time, coming from a position of less knowledge than LLMs. You need to approach both with similar caution.
> How different is this from humans?

Humans exercise judgement.

At least when humans lie they're usually doing it on purpose. When machines lie they don't know they're doing it.

At least part of it is that we do attribute malice or lack of care (or madness) to people who repeatedly do it, and treat their output differently in the future.

For some reason, some people repeatedly defend machines that constantly do the same thing, and claim we should give them the benefit of the doubt.

It’s interesting this AI-generated article references “Reddit threads” being “full of support” two or three times, yet I can’t find Reddit threads in the references.

I wonder if we are seeing what may be the result of a Reddit bot campaign to sway generative output.

It’s poorly written and untrustworthy. I’d rather it not exist.
I think I could prompt Claude to make me an opposite article telling me Americans love Flock cameras.
Why does that even matter? It's inauthentic, don't waste your time.
I value my time. I will not be reading any statistically composed slop that a human couldn’t even be bothered to spend time writing.
If the contents can be generated, why do the contents matter? They can just distill the blog down to a prompt and skip forcing us to read bullshit.