Hacker News new | ask | show | jobs
by tbt 3564 days ago
> This allows you to state and show meta theorems "for all (small) types", by quantifying over a universe.

As you say, this isn't quite self-trust. Another natural move is to relax the criterion of self-trust away from "it proves itself consistent" (impossible by incompleteness) towards "it assigns high probability that it has good beliefs". (Formalizations of this are in the paper.)

> At the moment I'm just confused what the problem is

In short, it would be nice to have a model of "good reasoning under deductive limitation", where "good reasoning" means something like "has accurate beliefs about all questions of interest" (for example, facts about the outputs of long-running computations), and where "deductive limitation" rules out the reasoning process "just wait for your theorem prover to decide the question".

Examples of long-running computations that are hard to compute exactly, but that we can sometimes still have reasonable beliefs about: optimal moves in chess, go, etc.; the accuracy of some ML system after a given training regimen; a weather-forecasting program that runs a gigantic series of simulations; and so on.

1 comments

So by now I've actually read the (abridged) paper and skimmed the full paper. And the paper has nothing to do with consistency problems, since they only consider boolean propositional logic... Assigning consistent probabilities in the limit is indeed a much more interesting problem anyway.

Self-trust seems like a problem on paper, but the reality is that normal mathematics uses very few universes. Concretely, the proof of the Feit-Thompson theorem in Coq, with all the related theories, uses 4 universes. If you have a system certified in type theory, then you can still reason about it. Possibly one universe higher, but that is not a problem (as far as anybody knows).

>Self-trust seems like a problem on paper, but the reality is that normal mathematics uses very few universes.

Self-trust (broadly construed) is interesting to me because it seems relevant to designing goal-based agents that are "stable", in the sense that they trust that future versions of themselves will have accurate beliefs (and therefore don't have an incentive to mess around with their systems for forming beliefs). If we try to formalize this intuition with "beliefs" as theorems proven by a formal system, we run into reflection problems; having your theorem prover assert that it will keep outputting only true statements feels awfully close to asserting its own soundness. So even if your agent can perform all the usual mathematical reasoning it needs, it still can't do all the useful reasoning about itself (it would need another large cardinal... and then another...).

The self-trust property in the paper says that it's possible to "learn from experience" that your future self is probably going to have pretty good beliefs. Specifically, a logical inductor P_n learns (roughly speaking) that "if P_f(n) thinks Phi is likely, then Phi is likely", where f(n) can be a fast-growing computable function. That is, on day n, P_n believes a sort of "probabilistic soundness" condition for its future self P_f(n). This is weaker than full soundness in at least two ways, but it is fully "reflective" in the sense that P believes this of itself.