by maizeq 943 days ago

I’ll bite. I’ll stick to heuristics and intuitions, since much of the heavy lifting to better quantify these risks has been, and continues to be, handled by others, as the other comments have mentioned.

Here are some intuitions that seem obvious to me and that, taken together, make the risk here seem obvious too.

(1) Current ML models already exhibit all the hallmarks of successful unsupervised intelligence; they simply need to be scaled up and stabilised. This is clear from results in model-based RL (e.g. Dreamer) and from the emergence of causal factors learnt by even simple models with no explicit supervision (e.g. beta-VAE, interpretability research into the neurons of LSTMs, etc.). The ability to identify causal factors, their relationships, and their dynamics without explicit supervision matches, to me, every definition of intelligence one can think of.

(2) Current ML learning algorithms (i.e. backprop) appear to perform significantly more efficient credit assignment than whatever the brain employs. The best example of this is the amount of knowledge distilled per bit in GPT versus the average human. GPT-3 has 175B parameters (Turbo is suspected to have even fewer); if each parameter took 4 bytes (an extreme case), that would be about 5.6 trillion bits. The brain has ~100 trillion connections; even if each connection stored a single bit, that is more than an order of magnitude more bits than GPT-3. Yet GPT-3 can answer questions on quantum physics just as well as it can on translational medicine, just as well as it can on Russian literature, etc. This suggests that the idea we will be outpaced is already not a question of how, but of when.
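
For anyone who wants to check the back-of-the-envelope numbers in (2), here is a minimal sketch of the arithmetic. The inputs are assumptions rather than measurements: GPT-3's published 175B parameter count, 32 bits per parameter as a generous upper bound, and roughly 100 trillion brain connections at one bit each as a deliberately stingy lower bound. The constant and function names are just illustrative.

```python
import math

# Assumed inputs for a rough comparison (not measurements).
GPT3_PARAMS = 175e9         # published GPT-3 parameter count
BITS_PER_PARAM = 32         # 4 bytes per parameter, a generous upper bound
BRAIN_CONNECTIONS = 100e12  # ~100 trillion connections, a rough estimate
BITS_PER_CONNECTION = 1     # deliberately pessimistic for the brain


def storage_bits(units: float, bits_per_unit: float) -> float:
    """Total storage in bits for `units` items of `bits_per_unit` bits each."""
    return units * bits_per_unit


gpt3_bits = storage_bits(GPT3_PARAMS, BITS_PER_PARAM)
brain_bits = storage_bits(BRAIN_CONNECTIONS, BITS_PER_CONNECTION)

# How many times more bits the brain has under these assumptions.
ratio = brain_bits / gpt3_bits
orders_of_magnitude = math.log10(ratio)

print(f"GPT-3:  {gpt3_bits:.2e} bits")
print(f"Brain:  {brain_bits:.2e} bits")
print(f"Ratio:  {ratio:.1f}x (~{orders_of_magnitude:.2f} orders of magnitude)")
```

Under these assumptions the gap comes out to a bit more than one order of magnitude, and it only widens if a brain connection is worth more than one bit or a parameter is stored in fewer than 32.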

(3) Large intelligent systems will be used for things other than just knowledge extraction. This is perhaps the key element here. EVEN if intelligent systems are not programmed with self-motivating or agentic behaviour (or accidentally embedded with it, as in LLMs), we will use them in such a way. That is to say, at some point we will ask these intelligent systems to “do things”, i.e. act upon the world according to our intentions.

(4) Lastly, superintelligent systems that are asked to “do something” will inevitably do something we do not actually desire. Some researchers, like Yann LeCun, object to this last bit and believe that we can simply tell them not to do these things. But this misses the fact that even the slightest misalignment between our intentions and an AI's could result in catastrophe very rapidly, purely because of the speed at which a superintelligence can operate. The most clear-cut case of this was the early days of “Sydney”, the Bing chatbot built on an OpenAI GPT model, before it was fully aligned. At one point Sydney was threatening its users, asking them to apologise to it, and going haywire. At the level of a simple chatbot, this is merely a cute local minimum the AI has gotten stuck in. At the level of a superintelligence, the results could be far worse.