| HN Mirror

by chmod775 2 hours ago

There's a post every other month where some dude who put nonsense information online celebrates because it actually ended up in some frontier model's weights.

If it's easy enough that some randos can do it for fun, what do you think happens when there's commercial interest behind it?

Obviously companies are going to try nudging AI towards recommending whatever they're selling. It's a logical extension of SEO - and that's a 100 billion USD industry.

Additionally, if I believed myself to be in some sort of spending - err - AI race, I'd try to poison the data sets of my competitors by putting crap out there for others to ingest.

3 comments

aspenmartin 1 hour ago

It's not really a problem. We're out of natural tokens anyway. The future is synthetic verifiable traces (already the way we train coding agents).

maxnevermind 32 minutes ago

> synthetic verifiable traces

What does that mean? Is it like when somebody uses a coding agent to develop a feature, and the input prompts and the resulting PR are later used for training, on the presumption that the final PR was a correct implementation of the prompt?

jurgenaut23 1 hour ago

Do you have examples of such celebrations?

Shitty-kitty 45 minutes ago

They already are. It has become a real problem on Reddit, especially with the latest pseudo-science crap like peptides.
