Processing high volumes of unstructured data (text)… we’re using a STAG architecture:
- Continually generate targeted LLM micro summaries of every record (ticket, call, etc.)
- On a schedule, use layers of regex, semantic embeddings, and scoring enrichments to identify report rows (pivots on aggregates) worth attention
- Proactively explain each report row by identifying what’s unusual about it and LLM-summarizing a subset of the micro summaries
- Push the result to a webhook

Lack of JSON schema restriction is a significant barrier to entry for hooking LLMs up to a multi-step process. Another is preventing LLMs from adding intro or conclusion text.
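To make those two barriers concrete, here is a rough sketch of the kind of guard a pipeline step needs when the model can't be held to a schema: pull the first JSON object out of the reply (dropping any intro or conclusion text), then check its keys and types before passing it on. This is only an illustration, not how our pipeline or any particular API does it; the field names in `MICRO_SUMMARY_FIELDS` and the helper names are hypothetical.

```python
import json

# Hypothetical schema for one micro summary: field name -> required type.
MICRO_SUMMARY_FIELDS = {"record_id": str, "summary": str, "sentiment": str}


def extract_json_object(reply):
    """Return the first JSON object found in reply, skipping surrounding prose."""
    decoder = json.JSONDecoder()
    start = reply.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and
            # ignores whatever text follows it (e.g. a closing remark).
            value, _ = decoder.raw_decode(reply, start)
            if isinstance(value, dict):
                return value
        except json.JSONDecodeError:
            pass  # this "{" did not start valid JSON; try the next one
        start = reply.find("{", start + 1)
    raise ValueError("no JSON object found in model reply")


def check_fields(obj, fields):
    """Reject objects with missing, extra, or wrongly typed keys."""
    missing = sorted(set(fields) - set(obj))
    extra = sorted(set(obj) - set(fields))
    if missing or extra:
        raise ValueError(f"missing keys {missing}, unexpected keys {extra}")
    for key, expected in fields.items():
        if not isinstance(obj[key], expected):
            raise ValueError(f"{key!r} should be {expected.__name__}")
    return obj


def parse_micro_summary(reply):
    """Extract and validate one micro summary from a raw model reply."""
    return check_fields(extract_json_object(reply), MICRO_SUMMARY_FIELDS)
```

A reply like `Sure, here it is: {...} Hope that helps!` comes back as a plain dict, while a reply with a missing or misspelled key raises `ValueError`, so the step can retry instead of pushing bad data downstream.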
(Plug) I shipped a dedicated OpenAI-compatible API for this, jsonmode.com, a couple of weeks ago and just integrated Groq (they were nice enough to bump up the rate limits), so it's crazy fast. It's a WIP, but so far the results are very comparable to JSON output from frontier models, with some bonus features (web crawling, etc.).