by asif_ 757 days ago

Hey, thanks for the question. Are you talking about standard evaluation tools like promptfoo? These evaluation frameworks are often just tools that help you grade the responses of your LLM application. They do not, however, help you build an LLM application in which it is easy to test different configurations and evaluate them. That is where we differ: we help you build an application designed for easily testing different configurations, so you can evaluate them much faster.

The process we see when companies try to adopt an evaluation framework is this: whenever they want to try a new configuration, they change large parts of their codebase, write the code to run an evaluation, review that result in isolation, and then try to compare it with other changes they have made, sometimes long in the past. This usually makes new changes very slow and the whole process very disorganized.

With us, you build your LLM application so that it's easy to swap components. From there, when you want to see how your application performs with a certain configuration, we have a UI where you can pass in the configuration settings and run an evaluation. We also save all your previous evaluations so you can easily compare them with each other. As a result, it's very easy and fast to test different configurations of your application and evaluate them.
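
As a rough illustration of the idea, here is a minimal Python sketch of an application with swappable components, a configuration object, and a saved history of evaluations. It is not the product's actual API: every name in it (`AppConfig`, `MODELS`, `build_app`, `run_and_record`, `best_run`) is hypothetical, and the "models" are toy functions standing in for real LLM calls.

```python
from dataclasses import dataclass, asdict
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins for real LLM backends: each maps a prompt to an answer.
def echo_model(prompt: str) -> str:
    return prompt

def upper_model(prompt: str) -> str:
    return prompt.upper()

# Registry of swappable components, looked up by name from the config.
MODELS: Dict[str, Callable[[str], str]] = {
    "echo": echo_model,
    "upper": upper_model,
}

@dataclass(frozen=True)
class AppConfig:
    """One configuration of the application (all fields are illustrative)."""
    model: str
    prompt_template: str

def build_app(config: AppConfig) -> Callable[[str], str]:
    """Assemble the application from a config, without touching its code."""
    model = MODELS[config.model]

    def app(question: str) -> str:
        return model(config.prompt_template.format(question=question))

    return app

def evaluate(config: AppConfig, dataset: List[Tuple[str, str]]) -> float:
    """Exact-match accuracy of the configured app on (question, expected) pairs."""
    app = build_app(config)
    correct = sum(app(question) == expected for question, expected in dataset)
    return correct / len(dataset)

# Every run is recorded so that configurations can be compared later.
HISTORY: List[dict] = []

def run_and_record(config: AppConfig, dataset: List[Tuple[str, str]]) -> float:
    score = evaluate(config, dataset)
    HISTORY.append({"config": asdict(config), "score": score})
    return score

def best_run() -> dict:
    """Return the recorded run with the highest score."""
    return max(HISTORY, key=lambda run: run["score"])
```

Trying a new configuration then means calling `run_and_record(AppConfig(model="upper", prompt_template="{question}"), dataset)` with a different `AppConfig`, rather than editing the application code, and `best_run()` compares it against everything recorded before.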