| HN Mirror

I believe we have two rather different settings in mind. My statement assumes the enterprise use-case, where having a verifier is required. (In this context, I'm also assuming the approach of constraining against the observed data.) In such a selective classification setting, the end-user need not be exposed to lower quality outputs, but rather null predictions if the model cascade has been exhausted (i.e., progressively moving to larger models until the probability is acceptable).

Hopefully in 2024 we can get at least one of the benchmarks to move to assessing non-parametric/distribution-free uncertainty for selective classification, reflecting more recent CS/Stats advances that should be used in practice. Working on it.