by lmeyerov
701 days ago
As users of otel, we are looking at reusing it for our LLM stack. It is easy to instrument, so we don't need a new framework for that part. However, the more interesting part is the storage: imagine ingesting 100-page PDFs or 1M tweets, and doing many/big LLM map/reduce jobs with big (128K+) context. In observability land, we generally have small payloads, sample data, and retire data... and backends + pricing assume that. In LLMs, we instead might want to store everything, with some of it hot and the rest in the DWH. How have folks been dealing with these kinds of mismatches? E.g., ClickHouse backends for otel? Something else? Small stuff in otel and big stuff manually in a doc store / S3 JSON / Parquet?
The span then only needs to contain the most important data, such as the prompt template, the model that was used, token usage, etc. You can then split the metadata (spans and traces) and the large payloads (prompts + completions) across different data stores.
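A rough sketch of that split, in plain Python with only the standard library. Everything named here is an assumption for illustration: the attribute keys (`llm.model`, `llm.prompt.ref`, and so on) are made up and are not official otel semantic conventions, and a local directory stands in for whatever blob store (S3, a doc store, Parquet files) you would actually use. The idea is just that the big text goes to the blob store under a content hash, and the span carries only small fields plus that hash as a pointer.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def store_payload(store_dir: Path, text: str) -> str:
    """Write a large payload to the stand-in blob store; return its content-hash key."""
    key = hashlib.sha256(text.encode("utf-8")).hexdigest()
    path = store_dir / f"{key}.json"
    if not path.exists():  # content-addressed, so identical payloads are written once
        path.write_text(json.dumps({"text": text}), encoding="utf-8")
    return key


def load_payload(store_dir: Path, key: str) -> str:
    """Fetch a payload back from the stand-in blob store by its key."""
    return json.loads((store_dir / f"{key}.json").read_text(encoding="utf-8"))["text"]


def build_span_attributes(store_dir, model, prompt_template, prompt,
                          completion, input_tokens, output_tokens):
    """Return small span attributes; full prompt/completion go to the blob store.

    The attribute keys below are hypothetical, chosen for this example only.
    """
    return {
        "llm.model": model,
        "llm.prompt.template": prompt_template,
        "llm.usage.input_tokens": input_tokens,
        "llm.usage.output_tokens": output_tokens,
        # Pointers to the large payloads instead of the payloads themselves.
        "llm.prompt.ref": store_payload(store_dir, prompt),
        "llm.completion.ref": store_payload(store_dir, completion),
    }


if __name__ == "__main__":
    blob_dir = Path(tempfile.mkdtemp())  # stand-in for S3 / a doc store
    big_prompt = "Summarize: " + "lorem ipsum " * 50_000
    attrs = build_span_attributes(
        blob_dir, "some-model", "Summarize: {document}",
        big_prompt, "A short summary.", 128_000, 12,
    )
    print(attrs)  # small dict, safe to attach to a span
    print(len(load_payload(blob_dir, attrs["llm.prompt.ref"])))  # full prompt is still retrievable
```

With a real tracer you would set these key/value pairs as attributes on the span and leave retention, sampling, and pricing of the two stores to be tuned separately. Hashing the content also means a prompt that is reused across many map steps is stored once.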