First, isn't the reason LLMs learned to like em dashes that they are common in the training corpus? They existed before LLMs; the models learned them, they didn't invent them.
Second, my work browser puts nice blue squiggles under everything I write into a textbox. I dutifully click through them and accept the rephrasing suggestions. I get a lot of em dashes. My blog posts, whitepapers and so on are full of them and other “AI tells” - but I think they read better because of it.
I use emdashes all the time. They're correct punctuation as opposed to a minus sign. They're easy to type too: opt-shift-minus. If they were such a huge giveaway without ever being used by humans, models would be trained by now not to use them as much.
I've never seen writing created before the advent of LLMs that used emdashes in the same way and with the same frequency that LLMs regularly do. There's probably some out there but it would be a real outlier. LLMs overuse them to an absurd degree, putting them where most writers would put commas, occasionally semi-colons, or nothing at all.
I count 51 em-dashes on the page, which is extreme. They're also used in places where they don't really belong. It's very obviously LLM-generated, at least in part.
That said, it puzzles me why people don't prompt LLMs to change up the writing style a bit and remove some of the tells.