Show HN: Reference-free evaluation of LLM-powered chatbots
(github.com)
2 points
by Joschkabraun
934 days ago
Hey HN! This is an interactive demo with a *somewhat* helpful AI assistant. The goal is to demonstrate a good way to evaluate interactions between humans and AI assistants without references. Reference-free means that you do not need to provide a correct answer for each query. The metric used here is the goal success ratio, which measures how many queries a user needs to send to reach their goal. A guide on reference-free evaluation of any LLM app (chat, RAG, summarization, etc.) will follow in the near future. Try it out and please share any feedback!
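To make the metric concrete, here is a minimal sketch of one way a goal success ratio could be computed. It assumes the ratio is defined as goals reached divided by queries sent, which is my reading of the description above and not necessarily the exact formula used in the demo. The `Conversation` class and its fields are hypothetical names for illustration; in practice, whether a goal was reached would come from some judge (for example an LLM classifier), which is left out here.

```python
from dataclasses import dataclass


@dataclass
class Conversation:
    # Hypothetical record of one user session (not the demo's actual schema).
    queries_sent: int   # number of messages the user sent
    goals_reached: int  # number of user goals judged as reached in the session


def goal_success_ratio(conversations):
    """Assumed definition: total goals reached / total queries sent.

    A value of 1.0 means every query reached a goal on the first try;
    lower values mean users needed more queries per goal.
    """
    total_queries = sum(c.queries_sent for c in conversations)
    total_goals = sum(c.goals_reached for c in conversations)
    if total_queries == 0:
        return 0.0  # no queries yet, so nothing to measure
    return total_goals / total_queries


# One user needed 4 queries to reach their goal, another needed only 1.
sessions = [Conversation(queries_sent=4, goals_reached=1),
            Conversation(queries_sent=1, goals_reached=1)]
ratio = goal_success_ratio(sessions)  # 2 goals / 5 queries
```

No reference answers appear anywhere in this calculation: the only inputs are how many queries were sent and whether the user's goal was judged as reached.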