|
|
|
|
|
by taosx
85 days ago
|
|
The only thing I'm wondering is if they have eval frameworks (for lack of a better word). Their prompts don't seem to have changed for a while and I find greater success after testing and writing my own system prompts + modification to the harness to have the smallest most concise system prompt + dynamic prompt snippets per project. I feel that if you want to build a coding agent / harness the first thing you should do is to build an evaluation framework to track performance for coding by having your internal metrics and task performance, instead I see most coding agents just fiddle with adding features that don't improve the core ability of a coding agent. |
|
I considered creating a PR for that, but found that creating new agents instead worked fine for me.