by lahwran 3811 days ago
No, there are more dimensions than just data efficiency; what I would say is that humans are just about as data efficient as you could possibly hope for _at that wattage_. It's easy to do better than evolution at something it wasn't trying to do, I agree - and this is the point where, as I thought about it, I realized I do actually think there's some concern. But I don't think we'll be able to do it on one gpu, because:

- human evolution has spent quite a while in an adversarial environment, where the smarter you are, the more you win.

- a recent finding of neural network research is that local minima are kind of not a problem in very high-dimensional spaces, as long as you have a problem that is smooth and has optima. If it has any optima, then in very high dimensions, there's probably always something you can change that will keep you moving towards the optimum. While evolution may have gotten stuck in a general class of architectures (neural ones), it seems very much like there are many dimensions along which it can change the brain, and that changing the genes for it slightly will change its performance slightly (fsvo slight).

- evolution, in species that have learning systems in the first place, optimizes for intelligence per watt. Energy is very costly in the wild, and so finding algorithms that work well with low power is very important. It so happens that algorithms that minimize power usage are theoretically tied to algorithms that compress well, but the key thing is that evolution has had a crapload of optimization time for tuning the brains of mammals in general, and then humans got into this runaway optimization process, which seems to have made us smarter primarily by making our brains use more power for the relevant parts.
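To make the local-minima point a bit more concrete, here is a toy sketch, not anything taken from the research itself. It assumes (and this is purely an illustrative modelling assumption) that the Hessian at a random critical point looks like a random symmetric Gaussian matrix, and it counts how often such a matrix is positive definite, i.e. how often the critical point would be a true local minimum with no downhill direction left. The function names are made up for the example.

```python
import random


def is_positive_definite(a):
    """Return True if the symmetric matrix `a` (list of lists) is positive definite.

    Attempts a Cholesky factorization, which succeeds exactly when
    every pivot is strictly positive.
    """
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return False  # non-positive pivot: some direction is not uphill
                low[i][i] = s ** 0.5
            else:
                low[i][j] = s / low[j][j]
    return True


def random_symmetric(n, rng):
    """Random symmetric n x n matrix built from i.i.d. standard Gaussians (toy Hessian)."""
    g = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n)]
    return [[(g[i][j] + g[j][i]) / 2.0 for j in range(n)] for i in range(n)]


def fraction_minima(n, trials, rng):
    """Fraction of sampled toy Hessians that are positive definite."""
    hits = sum(is_positive_definite(random_symmetric(n, rng)) for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    rng = random.Random(0)
    for n in range(1, 7):
        print(n, fraction_minima(n, 5000, rng))
```

Under this assumption the fraction drops off quickly as the dimension grows, which is the intuition behind "there's probably always something you can change": in high dimensions almost every critical point has at least one direction that still goes downhill. Real loss surfaces are not random Gaussian matrices, so treat this as intuition only.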

I definitely think you could do better, I just don't think you're going to do it with a paradigm that looks vaguely like the brain, because if it looks vaguely like the brain, evolution probably passed it up on the way to the general architecture that mammals use, and the specific one humans use. Possible exception for gradient descent and weight sharing, because those would be difficult to implement in the brain, but that doesn't give you results hundreds of times better, and it's not even clear the brain doesn't do that - Hinton has made the argument that it could.

The key thing here: if we make an AGI with neural networks (which at this point is almost a for-sure thing), then going beyond human level on one GPU will be a very difficult research task, and it will take a lot of learning to figure out how to do it. Which means we'll get a chance to control it using less formal mechanisms than MIRI demands out of their work.

(I don't think MIRI's stuff will be done in time to be useful to anyone.)

2 comments

Thanks for the reply!

I'm also curious about your answer to Eliezer about why you assume 1 GPU.

But a few other questions:

1. You say: "human evolution has spent quite a while in an adversarial environment - the smarter you are, the more you win". Again, maybe I'm missing something in modern evolutionary thinking, but why that assumption? I always thought the consensus was that we were as minimally smart as required.

2. You seem to be under the assumption that today's most popular algos (namely neural networks) are definitely for sure the thing that's going to become an AGI, but why that assumption? The broader idea that some algorithm/method will bring us an AGI is more probable than specifically neural networks doing it.

You also write: "I definitely think you could do better, I just don't think you're going to do it with a paradigm that looks vaguely like the brain". Again, why the assumption that whatever will be built will have to even resemble the human brain?

Why on Earth would you assume that MIRI assumes one GPU rather than 10,000 GPUs?