It seems the fundamental problem with bottom-up/learning AI is that it is opaque and essentially unknowable. I find it all very hackish. We can now develop systems that we can test and that seem to work, but we don't know exactly why they work (e.g. which parts of the training data they are promoting) or when (or why) they will fail. The effectiveness of adversarial inputs against trained vision systems illustrates this.
Zoom forward to a super-human AI that mimics our brains in its approach but exceeds their capacity. What is stopping it, for instance, from learning that it can play the long game of being good until it has sufficient power at its disposal and then becoming evil? No matter what training data you present, you can't know exactly what the result will be.
I get the feeling that learning systems will be combined with model systems, with the former performing "low level" tasks and the latter providing a verifiable "executive" that guides high level goals or outcomes.
One approach being considered is "AI Safety Via Debate"[0], which hopes to prevent deception by carefully constructing games in which a superhuman agent's best strategy is honesty. Note that this is the goal; much work to be done!
Forget AIs - we need this for humans to design legal and administrative systems.
I have wondered whether formalized incentive-based design could be a workable field: structuring the rules so that even a complete sociopath would find that acting in a beneficial way is the best option.
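To make the "rules where honesty is the best option" idea concrete, here is a minimal sketch using a textbook case from mechanism design, the sealed-bid second-price auction. It is not the debate proposal mentioned above, just an illustration that the choice of rules decides whether honesty pays. The function names, the tie rule (ties lose) and the integer grid are my own assumptions.

```python
def second_price_utility(value, bid, other_bid):
    """Winner pays the losing bid. Ties lose (an assumption, for simplicity)."""
    if bid > other_bid:
        return value - other_bid
    return 0


def first_price_utility(value, bid, other_bid):
    """Winner pays their own bid. Ties lose (same assumption)."""
    if bid > other_bid:
        return value - bid
    return 0


def truthful_is_weakly_dominant(utility, value, grid):
    """Check on a finite grid that bidding your true value is never beaten.

    For every possible rival bid, no alternative bid may give a strictly
    higher payoff than bidding `value` honestly.
    """
    for other_bid in grid:
        honest = utility(value, value, other_bid)
        for bid in grid:
            if utility(value, bid, other_bid) > honest:
                return False
    return True


grid = range(0, 11)  # hypothetical bid levels 0..10
value = 5            # hypothetical private valuation

print(truthful_is_weakly_dominant(second_price_utility, value, grid))
print(truthful_is_weakly_dominant(first_price_utility, value, grid))
```

Under second-price rules a purely self-interested bidder gains nothing by misreporting, while under first-price rules the same bidder does better by shading the bid. The grid check is only a finite sanity check, not a proof, and real legal or administrative systems are obviously far messier than a single auction.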
Do we know game theory well enough to structure such games with no theoretical way for the AI to sneak out? I doubt it, and even if we did, funny things start happening when theory meets practice. I recall the example of quantum entanglement, which (I read) enables communication that cannot be spied upon without the intended parties knowing. Except (I also read) it was attacked at the interface between the quantum and classical domains. The world is complex, and a superhuman AI is by definition better equipped to find loopholes than humans are.
Unfortunately, being dishonest or evil is just one example. Arguably, the AI could develop new classes of deviancy, abuse or maladaptation that we haven't conceptualized yet. If we supersize the ability, surely we supersize the problems.
It leads to a scary question: what does a superhuman AI really want?
To be fair, an HFT agent can technically count as a superhuman AI. Wanting isn't a thing that applies to actual AI yet, and there is no special sauce that indicates advancement beyond neuron scale. Barring directives and assuming it is "grown", what it wants can be utterly peripheral to rationality and is likely based on what it is taught, intentionally or not. Look at how society preaches honesty from a young age and then starts teaching lying again by rewarding it. The real lesson is the Spartan one on stealing: don't get caught. It may not be intended, but it is the result.
> which hopes to prevent deception by carefully constructing games in which a superhuman agent's best strategy is honesty
I'd be very hesitant to assume that an agent cannot learn under which circumstances it should be honest to gain a benefit without putting any innate value on honesty. A human agent is more than capable of reasoning like that, let alone a superhuman one.
I attended this talk at IJCAI, and I must say that the whole system 1 / system 2 analogy rubbed me the wrong way.
A solver for e.g. 3-SAT is general only in a very narrow sense, namely that an entire class of problems can be reduced to the specific problem it solves. However, the solver itself is not doing the reducing, rather it is being spoon-fed instances generated by somebody, and that somebody is doing all the hard work of actually thinking. The solver is just doing a series of dumb steps very quickly, with lots of heuristics thrown in. How is that not also "system 1"?
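A minimal sketch of that division of labour, under my own assumptions: `coloring_to_cnf` plays the "somebody" who does the reduction (here from graph 3-coloring to CNF clauses of at most three literals), and `solve_sat` is the dumb part, a brute-force search standing in for a real solver without any of the heuristics. All names are hypothetical.

```python
from itertools import product


def solve_sat(num_vars, clauses):
    """Brute-force SAT. Literal k > 0 means variable k is true, k < 0 false.

    Returns a satisfying tuple of booleans, or None if there is none.
    """
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            return bits
    return None


def coloring_to_cnf(num_nodes, edges, num_colors=3):
    """The reduction: encode 'is this graph colorable?' as CNF clauses."""
    def var(node, color):
        return node * num_colors + color + 1  # 1-based variable index

    clauses = []
    for node in range(num_nodes):
        # Each node gets at least one color ...
        clauses.append([var(node, c) for c in range(num_colors)])
        # ... and at most one color.
        for c1 in range(num_colors):
            for c2 in range(c1 + 1, num_colors):
                clauses.append([-var(node, c1), -var(node, c2)])
    for a, b in edges:
        # Adjacent nodes never share a color.
        for c in range(num_colors):
            clauses.append([-var(a, c), -var(b, c)])
    return num_nodes * num_colors, clauses


def decode(bits, num_nodes, num_colors=3):
    """Translate the solver's raw bits back into one color per node."""
    return [next(c for c in range(num_colors) if bits[n * num_colors + c])
            for n in range(num_nodes)]


triangle = [(0, 1), (1, 2), (0, 2)]
num_vars, clauses = coloring_to_cnf(3, triangle)
model = solve_sat(num_vars, clauses)
print(decode(model, 3) if model else "not 3-colorable")
```

Note that `solve_sat` knows nothing about graphs or colors. Everything that looks like understanding the problem lives in `coloring_to_cnf` and `decode`, which a person had to write.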
Anyway, the whole thing was just a fancy way of saying that you can either solve problems exactly, in the way that complexity theorists and algorithm designers do things, or statistically, in the way that learning theorists do things. No need to superimpose a strained analogy.
Not to mention that there is no conclusive evidence of the dual process theory yet, see for example this experimental study finding that logical "type 2" answers are actually typically faster and that intuitive "type 1" answers are typically also logical:
> Not to mention that there is no conclusive evidence of the dual process theory yet
Define "conclusive". There is considerable evidence for this dual reasoning mode.
As for your study, system 1 thinking is not inherently illogical. In fact, it's necessarily logical, otherwise it would be maladaptive. The point is that it's logical in a "lossy" way that sometimes excludes pertinent information for speed of response, and so sometimes goes wildly wrong.
Yes, I know what you mean. In my opinion, the connection to the System 1 / System 2 theories did not add much depth to the paper. I think the intended purpose was to bolster the argument that learners and solvers, though they operate in different ways, are both useful forms of intelligence. However, this point can be made in other ways as well.
In any case, I look forward to more scholarship and experimentation at the intersection of these topics.
All definitions of things get circular at some point. What defines a chair? The set of criteria you come up with to divide chairs from other things invariably has to turn in on itself, as the hilarious exercise of defining a sandwich illustrates. All identity is ultimately an illusion.
In other words, the more division you create in the world, the more 'specialness' you create. And in the immortal words of Syndrome, when everyone's special, no one is.
With a fine enough definition of thought, anything can fit the definition.
>I attended this talk at IJCAI, and I must say that the whole system 1 / system 2 analogy rubbed me the wrong way.
It immediately rubs me the wrong way because dual process theories of the brain are wrong and outmoded and need to die out of the public consciousness now!
Bayesian brain theories are more "in vogue", along with various other theories saying that the brain does some forms of statistical and causal learning and inference.
They don't preclude it, but they didn't happen to include it in our particular history. In particular, in the evolutionary history of the brain as an energy-optimizing controller of the body, a "System 1" would have been selected against extremely early on, when it directed the internal organs to act according to "heuristics" that wasted calories.
AGI will be a reflection of ourselves.
First we must resolve the basic problems of the human condition (poverty, hunger, housing, war, ...) before developing AGI, as it will surely amplify our worst nature as well as our best.