Hacker News
by didericis 1284 days ago
The article focuses on human overrides, but I think the more obvious and gaping security issue is the lack of any reliable way to verify that the output is correct, whether or not it has been intentionally gamed.

I predict nearly all of the upcoming LLM products will end up being fancy autocomplete: suggestions that a user then has to feed into a more constrained system, with some sort of manual confirmation or tweaking.
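To make the "constrained system plus manual confirmation" idea concrete, here is a minimal sketch. Everything in it is a made-up illustration, not any real product's design: the `ALLOWED_ACTIONS` set, the `Command` type, and the "action id" text format are all assumptions. The point is only that the model's free text never runs directly; it has to parse into a narrow command shape, and a human still has to say yes.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical whitelist: the only actions the constrained system accepts.
ALLOWED_ACTIONS = {"create_ticket", "close_ticket", "add_comment"}


@dataclass(frozen=True)
class Command:
    action: str
    target: int


def parse_suggestion(text: str) -> Optional[Command]:
    """Accept only '<allowed action> <integer id>'; reject anything else."""
    parts = text.strip().split()
    if len(parts) != 2:
        return None
    action, target = parts
    if action not in ALLOWED_ACTIONS:
        return None
    if not (target.isascii() and target.isdigit()):
        return None
    return Command(action, int(target))


def run_with_confirmation(
    suggestion: str, confirm: Callable[[Command], bool]
) -> Optional[Command]:
    """Return the command only if it parses AND the user confirms it."""
    command = parse_suggestion(suggestion)
    if command is None:
        return None  # the model's output did not fit the constrained format
    return command if confirm(command) else None
```

In a real tool `confirm` would be a dialog or prompt shown to the user; here it is just a callback so the flow is easy to follow. Note this only checks that the suggestion is well-formed, not that it is the right thing to do, which is why the confirmation step is still there.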

1 comment

Verifying language models is going to make the difference between useful and useless. I predict that within 12 months we'll have a fact-checking neural network, possibly with an additional text index of verified facts.
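As a toy sketch of the "text index of verified facts" part only: the code below checks each sentence of some generated text against a small in-memory set. The `VERIFIED_FACTS` contents and the function names are hypothetical, and exact matching after normalization is a crude stand-in for whatever a fact-checking network would actually do.

```python
import re

# Hypothetical index of verified statements, stored in normalized form.
VERIFIED_FACTS = {
    "water boils at 100 degrees celsius at sea level",
    "the earth orbits the sun",
}


def normalize(claim: str) -> str:
    """Lowercase and drop punctuation so trivial variants still match."""
    return " ".join(re.findall(r"[a-z0-9]+", claim.lower()))


def check_claims(text: str) -> dict[str, str]:
    """Label each sentence 'verified' if it is in the index, else 'unverified'."""
    sentences = [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]
    return {
        s: "verified" if normalize(s) in VERIFIED_FACTS else "unverified"
        for s in sentences
    }
```

The label for a miss is "unverified" rather than "false" on purpose: an index like this can only confirm what it contains, it cannot refute anything else.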

And all these hacks are going away in the next point release; they just need to collate them all and add them to the training set. There are still going to be adversarial attacks, though. Those are hard to guard against, but they won't be created manually; we'll need algorithms to find them.

Here's a fact-checking LLM for the LLMs: https://arxiv.org/abs/2210.08726