by anonzzzies 1162 days ago
How does this work? Does anyone know?

And for large swaths of things, how can it possibly work? For almost all code and APIs, for instance, it’s not possible to say whether or not it is hallucinating. I see similar issues in many fields beyond pure facts, and privacy issues as well.

2 comments

Looking at their “documentation”: https://docs.giskard.ai/start/

It would appear that this is not automated monitoring but more like a second stage of reinforcement learning from human feedback, or perhaps a classifier. It seems that you create input/output examples and the LLM responses are examined by a secondary system (which I’m guessing is probably NOT an LLM, otherwise it would be vulnerable to attacks), which perhaps forces the LLM to regenerate its response if it doesn’t meet the classification threshold.

At least, that sounds more believable to me than someone claiming they’ve fixed the inherent flaws in LLMs.
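
For illustration, the "examine, then regenerate below a threshold" loop guessed at above could look roughly like this. It is a minimal sketch of the commenter's guess, not Giskard's documented behavior: `generate`, `score`, `threshold`, and `max_attempts` are all hypothetical names, with `generate` standing in for the LLM call and `score` for whatever secondary classifier is used.

```python
from typing import Callable, Tuple


def generate_with_check(
    prompt: str,
    generate: Callable[[str], str],      # hypothetical: wraps the LLM call
    score: Callable[[str, str], float],  # hypothetical: secondary classifier, higher is better
    threshold: float = 0.5,              # assumed acceptance threshold
    max_attempts: int = 3,               # assumed retry budget
) -> Tuple[str, float, bool]:
    """Return (response, score, passed).

    Regenerates until a response reaches the threshold or the retry budget
    runs out; on failure, returns the best-scoring response seen.
    """
    best_response, best_score = "", float("-inf")
    for _ in range(max_attempts):
        response = generate(prompt)
        current = score(prompt, response)
        if current > best_score:
            best_response, best_score = response, current
        if current >= threshold:
            return response, current, True
    return best_response, best_score, False
```

Note that the loop is only as good as `score`: if the classifier can be fooled, regenerating just samples until something fools it.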

We are a team of engineers & researchers working on AI Alignment & Safety. We're investigating multiple methods, including metamorphic testing, human feedback, benchmarks with external data sources, and LLM explainability methods.

Currently, fact checking works on straight facts. It does a Google Search and uses LLMs to shorten it. Once it has the short version, it compares the shortened results with the answer provided by ChatGPT itself. Premium tiers would get better fact-checking sources than just Google. We're investigating various data sources and comparison methods.
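
The search, shorten, compare flow described above can be sketched as follows. This is a rough illustration under stated assumptions, not the actual implementation: `search` and `summarize` are hypothetical callables standing in for the web search and the LLM shortening step, and the comparison here is a crude token-overlap score chosen only to keep the example self-contained, since the thread does not say how the comparison is done.

```python
import re
from typing import Callable, Dict, List, Set


def tokens(text: str) -> Set[str]:
    """Lowercase alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap(claim: str, evidence: str) -> float:
    """Fraction of the claim's tokens that also appear in the evidence."""
    claim_tokens = tokens(claim)
    if not claim_tokens:
        return 0.0
    return len(claim_tokens & tokens(evidence)) / len(claim_tokens)


def fact_check(
    answer: str,
    search: Callable[[str], List[str]],     # hypothetical: returns search snippets
    summarize: Callable[[List[str]], str],  # hypothetical: LLM shortening step
    min_overlap: float = 0.6,               # assumed cutoff, not from the thread
) -> Dict[str, object]:
    """Search for the answer, shorten the results, and compare them to the answer."""
    snippets = search(answer)
    summary = summarize(snippets)
    score = overlap(answer, summary)
    return {"summary": summary, "score": score, "supported": score >= min_overlap}


if __name__ == "__main__":
    # Stubs in place of a real search API and a real LLM.
    def fake_search(query: str) -> List[str]:
        return ["Paris is the capital of France."]

    def fake_summarize(snippets: List[str]) -> str:
        return " ".join(snippets)

    print(fact_check("The capital of France is Paris", fake_search, fake_summarize))
```

A token-overlap comparison like this one would not notice a single swapped word ("Lyon" for "Paris" still shares most tokens with the evidence), so a real comparison step would need something stronger.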

Note that fact checking / hallucinations is just one of the types of safety issues we'd like to tackle. Many of these are still open questions in the research community, so we're looking to build and develop the right methods for the right problems. We also think it's super important to have independent third-party evaluations to make sure these models are safe.

This is a new tool we're building in the open, and we're interested in your feedback to prioritize!

> Currently, fact checking works on straight facts.

Wow, you guys have a database of all the facts?

> It does a Google Search and uses LLMs to shorten it.

Oh...

...actually, this is an empirical fact checker. I wouldn't call it "fact-based", as it's epistemologically an absurd statement, but "empirical fact checking" sounds good and presents an idea that is very close to how humans verify information in the first place - by checking multiple sources and searching for correlation.

For what it's worth, I think your approach makes sense. Good luck.

> Currently, fact checking works on straight facts. It does a Google Search and uses LLMs to shorten it.

So your fact-checking LLM is also vulnerable to injection and unethical prompting whenever it ingests website text. And a Google search is far, far away from fact checking, particularly for the subtle errors that GPT-4 is prone to making.