Hacker News new | ask | show | jobs
by latentnumber 660 days ago
Why not automate verification itself then? While not possible now, and I would probably never advocate for using LLMs in critical settings, it might be possible to build field-specific verification systems for LLMs with robustness guarantees as well.
2 comments

If the verification systems for LLMs are built out of LLMs, you haven't addressed the problem at all, just hand-waved a homunculus that itself requires verification.

If the verification systems for LLMs are not built out of LLMs and they're somehow more robust than LLMs at human-language problem solving and analysis, then you should be using the technology the verification system uses instead of LLMs in the first place!

> If the verification systems for LLMs are not built out of LLMs and they're somehow more robust than LLMs at human-language problem solving and analysis, then you should be using the technology the verification system uses instead of LLMs in the first place!

The issue is not in the verification system, but in putting quantifiable bounds on your answer set. If I ask an LLM to multiply large numbers together I can also very easily verify the generated answer by topping it with a deterministic function.

I.e. rather than hoping that an LLM can accurately multiply two 10 digit numbers, I have a much easier (and verified) solution by instead asking it to perform this calculation using python and reading me the output

Spitballing, if you had a digital model of a commercial airplane, you could have an llm write all of the component code for the flight system, then iteratively test the digital model under all possible real world circumstances.

I think automating verification generally might require general intelligence, not an expert though.