Hacker News new | ask | show | jobs
by marcklingen 855 days ago
Fully agree - even as a founder of an ‘LLM observability company’. Observability does not need to be reinvented to get detailed traces/metrics/logs of the LLM part of an application.

LLM Observability usually means: prompts and completions, which model was used, errors and exceptions (rate limits, network errors), as well as metrics (latency, output speed, time to first token when streaming, USD/token and cost breakdowns). All of this is well suited to be captured in the existing observability stack. OpenLLMetry makes this really easy and interoperable - chapeau.

In my view, observability is not the core value that solutions like Baserun, Athina, LangSmith, Parea, Arize, Langfuse (my project) and many others solve for. Developing a useful LLM application requires iterative workflows and tinkering. That's what these solutions help with and augment.

There are specific problems to building an LLM application such as managing/versioning of prompts, running evaluations, blending multiple different evaluation sources, collecting datasets to test/benchmark an application, helping with fine-tuning models on high-quality production completions, debugging root causes of quality/latency/cost issues, ...

Most solutions either replicate logs (LLM I/O) or traces at first, as they are a necessary starting point to then build solutions for the other workflow problems. As the observability piece gets more standardized over time, I can see how integrating with the standard makes a ton of sense. Always happy to chat about this.

1 comments

Hey Marc :wave:

Would love to see you integrate and adopt this as soon as it makes sense to you. OpenTelemetry is a great and mature piece of technology and we should all be aligning around it now, while it’s still easy to do so.