| HN Mirror



	by layer8 811 days ago
	This will be a fun reminiscence once we find out how humans are able to learn with just a tiny fraction of that data volume.

6 comments

mattgreenrocks 811 days ago

Despite all the hoopla around AGI, the sheer amount of data required really makes human learning all the more impressive.

Gödel probably consumed a minuscule fraction of what these systems have seen. And look what he came up with!

dkasper 811 days ago

Not sure this is a good conjecture. The main reasons are 1) AIs are expected to have an incredible range that the average human does not, and 2) humans actually do take in enormous amounts of data, but it happens over the course of many years and most of it is audio/visual/tactile/experience.

We already see that if you want to focus on a narrow skillset you can use a much smaller model and training set. But right now it is a race because everyone wants to be the one true generalized intelligence model.

sigmoid10 811 days ago

The data volume is actually not that different once you account for all senses and how many years it takes for a human to become useful. The interesting thing would be how the human brain filters out the unimportant information as it develops.

llm_trw 811 days ago

That's a distinction without a difference. The majority of data is from a distribution that's already been sampled multiple times.

E.g. how often does a baby go out and experience something novel? The majority of its time is spent getting the same stimulus over and over again, as anyone listening to children's television can attest.

Humans learn in fundamentally different ways to our current systems and information poverty is not a problem for us.

sigmoid10 808 days ago

And what do you think epochs in machine learning are? Or why more modern training efforts (i.e. for LLMs) are focussing hard on deduplicating scraped data?

llm_trw 808 days ago

Why don't you tell me instead of asking questions that you surely know the answer to?

sigmoid10 807 days ago

It was rhetorical. But in case you actually don't know: what you described (i.e. multi-sampling) has been common practice in ML for ages. Only now are the latest models getting so big that people are actually trying hard to move away from this idea, because it would take a human lifetime in wall-clock time to train a cutting-edge LLM on similar data streams.
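
To make the two ideas concrete, here is a minimal sketch, not anyone's actual pipeline. The helper names `dedup_exact` and `epoch_stream` and the toy `scraped` list are made up for illustration. Deduplication drops repeated documents once, up front; epochs then deliberately revisit whatever is left. Real LLM pipelines tend to use near-duplicate detection (e.g. MinHash) rather than the exact hashing shown here.

```python
import hashlib


def dedup_exact(docs):
    """Drop exact duplicates (after case/whitespace normalization), keeping first occurrences in order."""
    seen = set()
    unique = []
    for doc in docs:
        # Normalize so trivially different copies hash to the same key.
        normalized = " ".join(doc.lower().split())
        key = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(doc)
    return unique


def epoch_stream(docs, epochs):
    """Yield (epoch, doc) pairs: every document is revisited once per epoch."""
    for epoch in range(epochs):
        for doc in docs:
            yield epoch, doc


# Hypothetical scraped corpus with repeated content.
scraped = ["The cat sat.", "the  cat sat.", "A dog barked.", "The cat sat."]
unique = dedup_exact(scraped)
samples = list(epoch_stream(unique, epochs=3))
```

With this toy input, `unique` keeps two documents and `samples` contains each of them three times, once per epoch.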

gdsimoes 811 days ago

Because we actually think? I'm not just trying to guess the next word and I understand causal relationships.

quietbritishjim 811 days ago

The real difference with human learning is feedback: when young humans learn, at least some of the time they are interacting with intelligent agents that are able to give them focused feedback on their recent inputs and initial reactions to them.

nicklecompte 811 days ago

I think this ignores the most essential feedback very young humans get: planet Earth itself obeys laws of physics, mathematics, logic, etc. And by age 2 human children already have far faster and deeper reasoning abilities than any contemporary AI, even if their lack of linguistic knowledge means they wouldn't perform very well on LLM benchmarks.

In general AI researchers have done a very bad job exploring how a system might be "near-human" according to some fancy linguistic benchmark, yet dramatically dumber than a pigeon in terms of general reasoning abilities.

weregiraffe 811 days ago

Tiny fraction... if you ignore the learning data processed by a billion years of evolution.

layer8 811 days ago

It’s a good question what portion of our DNA contributes to the information processing and knowledge in our brain.

However, the first complex nervous systems came about in the Cambrian explosion, only about half a billion years ago. And we also don’t train LLMs by random mutation and selection; it’s a much more teleological process.

But to extend the analogy, we should be able to train a model continuously, and not have to start training from scratch for each new model. Although, maybe, that would require random mutations, and thus much more time?
