by oersted
546 days ago
Frankly, just Python. LLM generation is just a function call, and fetching from a vector DB is just a function call. LLMs are hard to tame at scale, so the focus should be on tightly controlling the LLM inputs, making sure the model has the information it needs to be accurate, and having detailed observability over outputs and costs. For that last part, the new wave of AI observability tools can help (Helicone, LangSmith, W&B Weave...). Frameworks like LangChain obscure the exact inputs and outputs, as well as when the LLM is called. Fancy agentic patterns and one-size-fits-all RAG are expensive, and their general effectiveness is dubious. It's important to engineer the prompt tightly for every individual use case and to think of it as a low-level input-output call, just like coding a good function, rather than as a magical, abstract intelligent being. In practice, I prefer to keep the control and simplicity of vanilla Python so I can focus on the actually difficult part: prompting the LLM well.
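A minimal sketch of the "it's all just function calls" idea, using only the standard library. Everything here is hypothetical and for illustration: `fake_retrieve` and `fake_generate` stand in for whatever vector DB and LLM client you actually use, and `CallRecord`, `build_prompt`, and `answer` are made-up names, not any library's API.

```python
import time
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class CallRecord:
    """One LLM call: the exact prompt sent, the raw output, and the latency."""
    prompt: str
    output: str
    seconds: float


def build_prompt(question: str, passages: List[str]) -> str:
    # The prompt is assembled in plain sight, so the exact input is always known.
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
    )


def answer(
    question: str,
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str], str],
    log: List[CallRecord],
) -> str:
    passages = retrieve(question)             # retrieval is a function call
    prompt = build_prompt(question, passages)
    start = time.perf_counter()
    output = generate(prompt)                 # generation is a function call
    log.append(CallRecord(prompt, output, time.perf_counter() - start))
    return output


# Hypothetical stand-ins; swap in a real vector DB query and LLM client.
def fake_retrieve(question: str) -> List[str]:
    return ["Paris is the capital of France."]


def fake_generate(prompt: str) -> str:
    return "Paris"


if __name__ == "__main__":
    calls: List[CallRecord] = []
    print(answer("What is the capital of France?", fake_retrieve, fake_generate, calls))
    print(calls[0].prompt)  # inspect exactly what the model was given
```

Because `retrieve` and `generate` are passed in as ordinary callables, every prompt and output lands in `log`, where it can be inspected or forwarded to an observability tool, and each use case can get its own `build_prompt`.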