Hacker News
by ben_w 521 days ago
> This misses that evolution has been pre-training the human cognitive architecture - brain, limbic system, sympathetic and parasympathetic nervous systems, coevolved viral and bacterial ecosystems - for millions of years. We're not a tabula rasa training at birth to perfectly fit whatever set of training data we're presented. Far from it. Human learning is more akin to RAG

Yes, but.

The human genome isn't that big (3.1 gigabases), and most of that is shared with other species that aren't anything like as intelligent — it's full of stuff that keeps us physically alive, lets us digest milk as adults, darkens our skin when exposed to too much UV so we don't get cancer, gives us (usually) four limbs with (usually) five digits that have keratin plates on their tips, etc.
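To put a rough number on "isn't that big", here is a back-of-envelope sketch. It assumes 2 bits per base (four possible bases, no compression) and ignores epigenetic marks and anything else outside the raw sequence; the function name is made up for illustration.

```python
# Rough upper bound on the raw information content of the genome figure above.
GIGABASES = 3.1      # figure quoted in the comment
BITS_PER_BASE = 2    # assumption: 4 possible bases (A, C, G, T), uncompressed


def genome_size_bytes(gigabases: float = GIGABASES,
                      bits_per_base: int = BITS_PER_BASE) -> float:
    """Return the uncompressed size of the sequence in bytes."""
    bases = gigabases * 1e9
    return bases * bits_per_base / 8


if __name__ == "__main__":
    size = genome_size_bytes()
    print(f"about {size / 1e6:.0f} MB uncompressed")
```

Under those assumptions the whole sequence fits in well under a gigabyte, and only some fraction of that could be specific to the brain.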

That pre-training likely gives us innate knowledge of smiles and laughter, of the value judgment that pain is bad and that friendship is good, and (I suspect from my armchair) enough* of a concept of gender that when we hit puberty we're not all bisexual by default.

Also, there's nothing stopping someone from donating their genome to be used as a pre-training system, if we could decode the genome well enough to map out pre-training like that.

* which may be some proxy for it, e.g. "arousal = ((smell exogenous sex hormone) and (exogenous hormone xor endogenous hormone))", which then gets used to train the rest of our brains for specific interests — evolution is full of hack jobs like that


You've missed the fact that sequencing our genome doesn't gather all the information required. To duplicate a human in computational space (say, to create some accelerated AI simulation), you'd need to sequence a complete telomere-to-telomere genome (something achieved for the first time only last year!) and complete centromere sequencing (not yet achieved). You'd also need to 'sequence' or somehow encode the epigenome: DNA methylation, histone modifications, and other epigenetic markers. Then you'd need to do the same for both mitochondrial DNA and the human microbiome, i.e. every functional bacterium and virus we host (quite the task given how little we understand this ecosystem and its interactions with our own behaviour). Then you'd need to combine genome sequencing with transcriptomics (RNA sequencing), proteomics (proteins), and metabolomics to get a holistic view of human biology.

To make this data 'actionable' for a synthetic intelligence you'd need to functionally replicate the contributions of the intrauterine environment to development, and lastly simulate the social and physical environment. This can't be 'decoded' in the way you implicitly suggest, since its decompression is computationally irreducible. These are dynamic processes that need to be undergone in order to create the fully developed individual.

[1] https://www.bbc.com/future/article/20230210-the-man-whose-ge...

And most of that is stuff you can then throw away, because it's not pre-training your brain; as for the stuff that is, while we don't know the full mechanism, we do know it works through the laws of physics.

It's like knowing the weights without knowing the full graph of the model they're used in, just the endpoints.

There's a lot of valid stuff in what you say, and I am aware I'm glossing over a lot of challenges to get a copy of a human. To what extent is e.g. the microbiome even contributing to our intelligence, vs. being several hundred different parasites that share a lot of DNA with each other and which happen to accidentally also sometimes give us useful extras? It's hard work telling which is which. But my claim is that the nature and scope of such work itself still allows us to say, as per one of the parent comments:

> I think it's important to remember that we know neural networks can be trained to a very useful state from scratch for 24 GJ: This is 25 W for 30 years (or 7000 kWh, or a good half ton of diesel fuel), which is what a human brain consumes until adulthood.
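The arithmetic in that quote is easy to check with a quick sketch. The 25 W and 30 years come from the quote; the diesel energy density of roughly 45 MJ/kg is my own assumption, and the function names are made up for illustration.

```python
# Back-of-envelope check of the "25 W for 30 years" energy budget.
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year
DIESEL_MJ_PER_KG = 45.0                # assumption: approximate energy density of diesel


def brain_energy_joules(power_w: float = 25.0, years: float = 30.0) -> float:
    """Energy used by a constant power draw over a number of years."""
    return power_w * years * SECONDS_PER_YEAR


def joules_to_kwh(joules: float) -> float:
    return joules / 3.6e6


def joules_to_diesel_kg(joules: float,
                        mj_per_kg: float = DIESEL_MJ_PER_KG) -> float:
    return joules / (mj_per_kg * 1e6)


if __name__ == "__main__":
    e = brain_energy_joules()
    print(f"{e / 1e9:.1f} GJ")
    print(f"{joules_to_kwh(e):.0f} kWh")
    print(f"{joules_to_diesel_kg(e):.0f} kg of diesel")
```

With these inputs the total lands just under 24 GJ, or a bit over half a tonne of diesel. The kWh figure comes out nearer 6,600 than 7,000, so the quote rounds up somewhat there.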

If this were a 100m sprint, I would agree with what you're essentially saying, that we don't even know which country the starting blocks are in; but I am still saying that, despite that, we know the destination can be reached from the starting blocks in 10 seconds.