


	by sebosp 975 days ago
	Is there an approximate ratio at which the amount of AI-generated digital garbage/hallucinations online becomes so large that the internet can no longer be used to train AI itself? Are AI companies racing against the clock because, say, in 5 years the internet will be flooded with false information to such an extent that it becomes invalid as a training ground? In a way that would require a snapshot of the pre-AI internet. It feels like the clickbait problem times infinity.

2 comments

__loam 975 days ago

It's already too late if you just want to scrape random horseshit off the internet. There will be real money in large expert-generated data sets. AI is also a potential epistemological nightmare: it can cement bad knowledge and bury newer, more up-to-date knowledge in a sea of bullshit.


ethbr1 975 days ago

Aka "T-minus how many days until OpenAI wants to buy archive.org"
