by CuriouslyC
265 days ago
The article doesn't really give helpful advice here, but please don't vibe this. Build evals from previous issues and your current tests. Use DSPy to optimize your prompts. Write down hypotheses about the value of different context packs, then run an eval matrix to see what actually works and what doesn't. Instrument your agents with OpenTelemetry (OTel) and stratify the failure cases to understand where your agents are breaking.
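To make the eval matrix idea concrete, here is a rough stdlib-only sketch. Everything in it is made up for illustration: the cases, the context packs, the prompt templates, and especially `run_agent`, which is a stub that just echoes the prompt so the script runs offline. It doesn't use DSPy or OTel; you'd swap in your real agent call and real cases pulled from past issues and tests.

```python
from collections import Counter
from itertools import product

# Hypothetical eval cases, e.g. distilled from past issues and existing tests.
# "expect" is a substring the agent's output must contain to count as a pass.
CASES = [
    {"id": "issue-101", "area": "pagination", "task": "Fix the last-page bug", "expect": "page + 1"},
    {"id": "issue-102", "area": "retry", "task": "Add retries to fetch", "expect": "max_retries"},
    {"id": "test-7", "area": "pagination", "task": "Handle empty result sets", "expect": "page + 1"},
]

# Hypothetical context packs: each one is a hypothesis about what the agent needs.
CONTEXT_PACKS = {
    "none": "",
    "api_docs": "paginate() uses page + 1 for the next page.",
    "api_docs+config": "paginate() uses page + 1 for the next page. fetch() honors max_retries.",
}

# Hypothetical prompt variants to cross with the context packs.
PROMPTS = {
    "terse": "{context}\nTask: {task}",
    "stepwise": "{context}\nWork step by step.\nTask: {task}",
}


def run_agent(prompt):
    """Stub agent: echoes the prompt. Replace with a real agent/LLM call."""
    return prompt


def run_matrix(agent=run_agent):
    """Run every prompt x context-pack cell and stratify failures by case area."""
    results = {}
    for (p_name, template), (c_name, context) in product(PROMPTS.items(), CONTEXT_PACKS.items()):
        failures = Counter()
        passed = 0
        for case in CASES:
            output = agent(template.format(context=context, task=case["task"]))
            if case["expect"] in output:
                passed += 1
            else:
                failures[case["area"]] += 1  # bucket the failure by area
        results[(p_name, c_name)] = {
            "pass_rate": passed / len(CASES),
            "failures": dict(failures),
        }
    return results


if __name__ == "__main__":
    for cell, stats in run_matrix().items():
        print(cell, stats)
```

The point is the shape: one row per (prompt, context pack) cell, a pass rate per cell, and failures bucketed by area so you can see where things break instead of eyeballing transcripts.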
|
Isn't it a programming language type thing?
Can you even integrate that into an existing codebase easily?