by sdesol 502 days ago
I wrote a chat app built around mistrust of LLM responses. You can see an example here:

https://beta.gitsense.com/?chat=ed907b02-4f03-477f-a5e4-ce9a...

If you click on the Evaluation links, you can see how multiple LLMs can be used to validate an LLM response. The evaluation of the accurate response is interesting, since Llama 3.3 was the most critical.

https://beta.gitsense.com/?chat=fdfb053d-f0e2-4346-bdfc-7305...

At this point, you would ask Llama to explain why the response did not score 100%. You can then cross-reference that explanation with other LLMs or use it as a starting point for your own research.
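
For anyone who wants to try the same idea outside the app, here is a rough Python sketch of the workflow described above: several evaluators score one response, and the most critical one is singled out for a follow-up question. This is not how the app is implemented. The evaluator functions are hypothetical stubs standing in for real model calls, the model names are made up, and the 0 to 100 scale and placeholder scores are assumptions for illustration only.

```python
from typing import Callable, Dict, Tuple

# Hypothetical evaluator signature: takes (question, answer) and
# returns a score from 0 to 100. In practice each one would wrap a
# call to a different LLM; here they are stubs.
Evaluator = Callable[[str, str], float]


def cross_evaluate(
    question: str, answer: str, evaluators: Dict[str, Evaluator]
) -> Dict[str, float]:
    """Ask every evaluator to score the same answer."""
    return {name: evaluate(question, answer) for name, evaluate in evaluators.items()}


def most_critical(scores: Dict[str, float]) -> Tuple[str, float]:
    """Return the evaluator that gave the lowest score."""
    name = min(scores, key=lambda key: scores[key])
    return name, scores[name]


# Stub evaluators with placeholder scores (not real model output).
def lenient_stub(question: str, answer: str) -> float:
    return 100.0


def strict_stub(question: str, answer: str) -> float:
    return 85.0


scores = cross_evaluate(
    "example question",
    "example answer",
    {"model_a": lenient_stub, "model_b": strict_stub},
)
critic, score = most_critical(scores)

# Anything below a perfect score is a cue to ask for an explanation.
if score < 100:
    print(f"Ask {critic} to explain why it scored the response {score} instead of 100.")
```

The explanation you get back from the most critical evaluator is what you would then cross-reference with the other models.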