| HN Mirror


	by eirikbakke 328 days ago
	Humans require a _lot_ less training data to become, for instance, fluent in English. If a given AI algorithm needs to be trained on the entire Internet to accomplish the same, then it seems safe to assume that the data has not really been "mined out". Generating more training data from the same original data should not be fundamentally problematic in that sense.

2 comments

felipeerias 328 days ago

It only seems that way because much of the data that humans use is not in a format that computers would understand. A toddler learning to talk is engaging their full body.

ausbah 328 days ago

Humans also have billions of years of evolution and trillions of organisms behind them to develop a receptacle biased towards learning language.

eirikbakke 328 days ago

Billions of years of evolution, but still limited to the data that is replicated in the human genome/DNA, which is about 3 gigabytes (+epigenome).
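
For a rough sense of where a figure like "about 3 gigabytes" comes from, here is a back-of-the-envelope sketch. It assumes roughly 3.1 billion base pairs in one (haploid) copy of the genome, which is an approximate figure not stated in the comment, and it ignores the epigenome entirely. The function name and constants are illustrative only.

```python
# Assumption: ~3.1 billion base pairs in one haploid copy of the human genome.
BASE_PAIRS = 3_100_000_000


def genome_size_gb(base_pairs: int, bits_per_base: int) -> float:
    """Storage size in gigabytes (10**9 bytes) for a given per-base encoding."""
    return base_pairs * bits_per_base / 8 / 1e9


# One ASCII character (8 bits) per base, as in a plain text file of A/C/G/T.
as_text = genome_size_gb(BASE_PAIRS, 8)

# Four possible bases fit in 2 bits, so a packed encoding is 4x smaller.
packed = genome_size_gb(BASE_PAIRS, 2)

print(f"one byte per base: {as_text:.2f} GB")
print(f"two bits per base: {packed:.3f} GB")
```

Under these assumptions the "3 gigabytes" figure matches the one-byte-per-base representation; a packed encoding comes out closer to three quarters of a gigabyte. Either way the order of magnitude in the comment holds.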