Hacker News new | ask | show | jobs
by robbensinger 3809 days ago
Speaking as a MIRI employee, I can say that MIRI isn't trying to build AI that's mathematically provable to be safe. This misconception comes from the same place the "AI safety engineering, etc." post is speaking to -- the assumption that if (e.g.) we do work in provability theory to develop simple general models of reflective reasoning, then the finished product must fall within provability theory.

Smarter-than-human AI systems will presumably reason probabilistically, and all real-world safety guarantees are probabilistic. But theorem-proving can be useful in some contexts for making us quantitatively more confident in systems' behavior (see https://intelligence.org/2013/10/03/proofs/), and toy models of theorem-proving agents can also be useful just for helping shore up our understanding of the problem space and of the formal tools that are likely to be relevant down the line -- the analog of "calculus" in the rocket example.