Hacker News
by yahoozoo 293 days ago
It says in the first paragraph it’s for crawlers and bots. How many humans are inspecting the headers of every page they casually browse? An immediate problem that could potentially be addressed by this is the “AI training on AI content” loop.
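A rough sketch of what the crawler-side check could look like, under stated assumptions: the header name `X-AI-Generated` and its accepted values are placeholders invented here for illustration, not the name or syntax from the proposal, and `is_labeled_ai` and `filter_training_pages` are hypothetical helpers.

```python
from typing import Iterable, List, Mapping, Tuple

# Hypothetical header name, for illustration only; the real proposal may use another.
AI_HEADER = "X-AI-Generated"

Page = Tuple[str, Mapping[str, str], str]  # (url, response headers, body)


def is_labeled_ai(headers: Mapping[str, str]) -> bool:
    """Return True if the response headers self-declare the page as AI-generated."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value.strip().lower() for name, value in headers.items()}
    # The accepted values are an assumption as well.
    return normalized.get(AI_HEADER.lower()) in {"true", "1", "yes"}


def filter_training_pages(pages: Iterable[Page]) -> List[Tuple[str, str]]:
    """Keep only pages whose headers do not carry the AI-generated label."""
    return [(url, body) for url, headers, body in pages if not is_labeled_ai(headers)]


pages = [
    ("https://example.com/a", {"Content-Type": "text/html"}, "human text"),
    ("https://example.com/b", {"x-ai-generated": "true"}, "model text"),
]
kept = filter_training_pages(pages)
```

A filter like this only catches pages that label themselves, so it does nothing about publishers who leave the header off.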
3 comments

How many of the makers of these trash SEO sites are going to voluntarily identify their content as AI generated?
Moreover, I find it ironic that website owners would graciously hand AI companies the power to identify what is "good" data and what is not. Why would I do the work for them and label my data as AI-generated so that they can ignore it? "Yes please, take all my work, this is quality content, train on it, it's free!" That's what it sounds like.
The content producer (i.e., the content spam farm) would still have to label its content as such.

The current approach is that the content served is the same for humans and agents (i.e., a site serves consistent content regardless of the client), so who a specific header is "meant for" is a moot point here.

I believe this is why Google did SynthID https://deepmind.google/science/synthid/