by tgv 22 days ago
Because their custom training data contains an emphasis on such verbiage. It doesn't come from the God-knows-how-many TB of web content the model is pre-trained on. There, such phrasing is only a drop in the sea. But the "yes, you're right" phrases, the em dash, etc., come from the later stage, for which content is created according to some (probably overprecise) guidelines.

Right. The overuse of "genuinely" most of all. Seems like they put Claude through a few good rounds of training to always answer questions about its consciousness, thoughts, etc., with something about how it's "genuinely unsure," and as a result, the model learned to use "genuinely" as an intensifier in all sorts of inappropriate contexts.
Oi, I personally use adverbs everywhere. Genuinely, kids these days.
It's a very specific style of condescending journalism that US media has been nurturing and recycling for decades now. I was going to write this whole comment as a parody of it, starting with some literary hook like 'Call it Ouroboros syndrome:' but I can't bring myself to add to the pile.

I have not done the textual and statistical analysis to verify this, but I feel like it's something you could trace back to east coast journalism schools and publishers mediated via television, which long predates mass adoption of AI. Think how many news articles you've read with titles like 'Anatomy of a murder' or 'Inside the meeting that changed everything.' The hooky, slightly pompous tone is something you can find as far back as the 1960s or 1970s; browse through old issues of Reader's Digest and you'll find tons of it. When I say it's mediated through television, I'm talking about both the dramatic and heavily conclusory style of fictional prosecutors and narrators, and the extremely shallow style of TV news reports (often transcribed to the web) which are only one or two sentences per paragraph. And this is before we consider the stylistic impact of ad copywriting on communication in general.

And there's something else.

The one-sentence paragraph interjection, designed to refocus your attention in a surprising new direction after two paragraphs of stuff you already know. 'I never thought I'd end up here,' said Sally Nocontext, hooking you in for another paragraph or two where you try to figure out who this woman is, where she ended up, and what it has to do with the article you are already halfway through reading. After all, I've come this far, the reader thinks. I might as well see it through to the end.

And that's just what publishers wanted.

One sentence can also validate a truism that the reader already suspects, flattering their beliefs in their own analytical powers....

...well you get the idea. When I'm using LLMs for any sort of extended session, I find myself reaching for the same few prompts to break it of such clichéd expression; I'm especially averse to the habit of adding zippy-sounding nicknames to complex or potentially dull concepts. I don't have a favorite starting prompt, but I generally find that asking for 'a concise, academic tone' does wonders to de-fluff its output. Remember, it defaults toward being as widely accessible as possible, and much journalism is aimed at consumers with only a high school education and maybe middle-school reading comprehension, math ability, and appetite for depth over sensation.

How do I save a comment? This very context has been under consideration for a while.