by tsimionescu 2359 days ago
There may be some kind of labeling encoded in genes. One thing that it is safe to assume is genetically encoded somehow is that sounds made by your parents/humans around you are worth repeating while other sounds are not.

However, past that, the actual sounds themselves, and any association to meaning, are pretty far from tagged data sets. The specifics of language (e.g. that a dog is called 'dog') are definitely learned, and children typically learn them from only a handful of stimuli, often a single one.

For contrast, imagine training a model on raw sound data tagged only with "speech" vs "not speech" (and probably only a few thousand data points at that); I would be amazed if it could recognize a single word. And babies don't just learn words, they learn their association to things they see and hear, and grammar, and abstract thought.

Do note that it is very likely that human brains can learn all that because they have some good heuristics built in. We definitely know some stuff is "hardware" - object recognition, basic mechanics, recognizing human faces and expressions, and others. We are pretty sure higher level stuff is also built in - universal grammar, basic logic, some ability to simulate behavior seen/heard in other humans. This specialized hardware was also most likely learned, but over much, much greater periods of time, through evolution over hundreds of millions of years (since even extremely old animal lineages are capable of picking out objects in the environment, approximating their speed etc.).

3 comments

There seems to be a spectacular underestimation of the amount of training data humans experience.

Not only does socialised human intelligence require at least a decade of formal education, but it also spends a lot of time in a complex 3D environment which is literally hands-on.

It's true some of the meta-structures predispose certain kinds of learning - starting with 3D object constancy, mapping, simple environmental prediction, and basic language abstraction.

But that level gets you to advanced animal sentience. The rest needs a lot of training.

For example - we can recognise objects in photographs, but I strongly suspect we learn 3D object recognition first - most likely with a combination of shape/texture/physics memory and modelling - and then add 2D object recognition later, almost as a form of abstraction.

Human intelligence is tactile, physical, and 3D first, and abstracted later. So it seems strange to me to be trying to make AI start with abstractions and work backwards.

Well, babies start picking out objects within weeks or months after birth. And many birds and mammals are much faster than that. That's not a huge amount of data to learn something so abstract from scratch, especially given the limited bandwidth of our data acquisition.

Furthermore, for other kinds of human knowledge, the learning process is very rarely based on data. After the acquisition of language, we generally seem to learn much more by analogy and deduction than by purely analyzing data. The difference is evident, since we can often pick up facts from a single datapoint, even as small children in kindergarten.

Also, getting back to your point on how we start AI - if you try to take a neural network and throw 3D sensor data at it, and immediately start using its outputs to modify the environment those sensors are sensing, I suspect you will not get any meaningful amount of learning. You probably need a very complex model and set of initial weights to have any chance of learning something like 3D objects and their basic physics (weight, speed and how those affect their predicted position). I would at least bet that you wouldn't get anywhere near, say, kitten accuracy in one month of training.

Related to 3D objects vs 2D, I completely agree.

>> Not only does socialised human intelligence require at least a decade of formal education, but it also spends a lot of time in a complex 3D environment which is literally hands-on.

Note that for most of our history, the majority of humans did not get anything like "formal" education as we mean it today (i.e. going to school). Although adults in hunter-gatherer societies do teach children many things (e.g. which mushrooms are edible etc.), this must be done after a child has learned language - and those kids don't go to school to learn their language, they pick it up as they grow up.

> One thing that it is safe to assume is genetically encoded somehow is that sounds made by your parents/humans around you are worth repeating while other sounds are not.

I don't see how that's safe to assume at all. What one could assume is the level of familiarity and comfort (sight, smell, touch) might be somewhat genetic and gives such inputs precedence. OR it might just be that those sources of information are engaging and animated.

> Do note that it is very likely that human brains can learn all that because they have some good heuristics built in.

Nor do I see this assumption having any weight. Many of the heuristics we take for granted were hard fought; it's just so long ago that we've forgotten the fight. Let's not forget how "little" our species gets over the first few YEARS of child development. If your child can move their body, just about walk and talk a little at TWO WHOLE YEARS in, they're an achiever.

The encoding I was talking about may well be something more abstract than 'imitate humans'. Still, babies don't generally try to imitate the sound of rattles or household sounds nearly as much as speech, so I still conclude that it is a safe assumption that there is something about sounds made by humans that is inherently interesting to them for some reason (instead of being a learned behavior).

Related to the second, the rate at which we learn, and the very specific order we learn things in, points very strongly in the direction that there is some built-in model that we train inside of. For example, essentially all babies first learn intonation before learning words. Also, most words are learned with an extremely small set of examples - at some ages, hearing a word a single time is often enough for the child to learn it (sometimes called 'fast mapping', and closely related to the 'poverty of the stimulus' argument). This has been mainstream understanding ever since behaviorism fell out of favor due to similar arguments by Chomsky.

> try to imitate the sound of rattles or household sounds nearly as much as speech

Well surely that's a case of the range of the vocal cords? Parrots are another intelligent creature that has better range and they imitate all sorts of sounds.

> Related to the second, the rate at which we learn, and the very specific order we learn things in, points very strongly in the direction that there is some built-in model that we train inside of.

Or that an action like walking requires one to put one foot ahead of the other; all other strategies in attempting to walk end in failure, which is why we don't see them.

I'd like to point out that all humans perceive intonation and it's perceivable outside of language; that's why it's easy to pick up. You don't need language to realise that someone is cross, or happy or sad. However, considering that autistic children often cannot, maybe there are some genetic markers at play there at least.

>> Well surely that's a case of the range of the vocal cords? Parrots are another intelligent creature that has better range and they imitate all sorts of sounds.

Parrots (and birds like mynas etc.) imitate human sounds and all sorts of sounds, but they don't discriminate between, e.g., the sound made by a train whistle and the sound made by a human carer. I mean that a parrot will not learn to speak a human language by imitating its sounds, any more than it'll learn to speak train by imitating a train whistle.

Human babies don't just imitate their parents' sounds, they figure out what those sounds do and how they come together to form language and express meaning. That is a small miracle that we don't understand at all well and Chomsky is 100% right to speak of scientific wonderment, in its context. It is really mind-blowing that kids can eventually learn to speak without, for the vast majority of children, anyone around them having any idea how to teach a kid to speak in any systematic way. Not to mention the trouble that adults have in learning another language even given formal training in it (which perhaps is further evidence that we really don't know how to teach language, because we don't understand how it works, so again, how can we teach small children to speak a language, but not adults?).

Chomsky's universal grammar is really the simplest answer: children don't learn how to speak a human language, they already know how, and they only have to learn the vocabulary and syntax of the language of their parents. This only presupposes that humans have human biology, and that our biology is responsible for our language ability. We can't learn to fly because we don't have wings and parrots can't learn to speak because they don't have human brains.

[Edit: that it's the simplest answer doesn't mean it's the right answer, only that it's got a damn good chance to be it.]

The range of the vocal cords is a reason why children can't successfully imitate these sounds; it doesn't directly explain why they wouldn't try.

Maybe they do try and we just shrug it off as gurgling. Kids do make funny noises when they're vocalising.

> One thing that it is safe to assume is genetically encoded somehow is that sounds made by your parents/humans around you are worth repeating while other sounds are not.

Well, these sounds come with a face attached and we know babies are hardwired to pay attention to faces.

That may well be the mechanism behind this. I was talking in very general terms, not a specific 'imitate humans' structure in the brain.