Hacker News
by ectopod 1294 days ago
Thirteen years ago I met a traveller who paid their way with travel writing, which was basically blog spam. They soon ran out of authentic material so they started writing about places they'd never been using some light googling for inspiration. For a long time now people have been making advertising money by creating bullshit on a large scale. How are you going to prove that any content is organic?
1 comments

You ultimately can't, and there are certainly degrees of "organicness" even among organic content. A lot of content is essentially infomercials, or arguments shilling a particular perspective the author has a financial interest in. And of course there's the case of the Wikipedia editor who completely made up something like 75% of the Scots Wikipedia articles, which have since been used as training inputs for language translation models etc. That is very organic content, but it's also actually poison to train on!

The good news is the internet is relatively good at routing around the shit, for now. And I guess that's something you could apply de facto to your content inputs: what's the PageRank for this content? Actual PageRank, meaning link-based authority, not the advertising/engagement bullshit that the search model has turned into. If the AI-generated stuff is correct enough that it has a high PageRank, maybe it's correct enough to be used as an input.
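To make the PageRank idea concrete, here's a rough sketch of what "filter inputs by link-based rank" could look like. This is a toy power-iteration PageRank over a hand-written link graph, not how any search engine or training pipeline actually does it. The page names, the `filter_by_rank` helper, and the threshold are all made up for illustration; the 0.85 damping factor is just the conventional textbook value.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy PageRank by power iteration.

    links: dict mapping each page to the list of pages it links to.
    Returns a dict mapping page -> rank, with ranks summing to 1.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets the "random jump" share up front.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling page (no outlinks): spread its rank evenly.
                share = damping * rank[page] / n
                for t in pages:
                    new_rank[t] += share
        rank = new_rank
    return rank


def filter_by_rank(links, min_rank):
    """Hypothetical helper: keep only pages whose rank is >= min_rank."""
    rank = pagerank(links)
    return [p for p in sorted(rank) if rank[p] >= min_rank]


# Made-up link graph: two pages that cite each other, plus two pages
# that link out but that nobody links to.
links = {
    "reference": ["howto"],
    "howto": ["reference"],
    "blog": ["reference", "howto"],
    "spam-farm": ["reference"],
}

ranks = pagerank(links)
# Keep pages that rank above a uniform share (1 / number of pages).
kept = filter_by_rank(links, min_rank=1.0 / len(ranks))
```

The point of the toy graph is that pages nobody links to only ever get the baseline random-jump share, so a relative cutoff drops them regardless of how much text they contain. Whether that's a good proxy for correctness is a separate question.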

But the thing is, there's already been an uptick in ML- or AI-generated content surfacing in searches and other places, and it's not always correct... and honestly the relevance of Google's search results has been noticeably decaying for 10+ years now. Things I know are out there and are relevant are not being surfaced anymore. Is AI generation contributing to that problem? Maybe. Probably not helping, at least.