by taosx 85 days ago
The only thing I'm wondering is whether they have eval frameworks (for lack of a better word). Their prompts don't seem to have changed for a while, and I've had more success testing and writing my own system prompts and modifying the harness so that it uses the smallest, most concise system prompt plus dynamic prompt snippets per project.

I feel that if you want to build a coding agent / harness, the first thing you should do is build an evaluation framework that tracks coding performance through your own internal metrics and task results. Instead, I see most coding agents just fiddle with adding features that don't improve the core ability of a coding agent.
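To make the idea concrete, here is a minimal sketch of what such an eval loop could look like. Everything in it is hypothetical: `Task`, `run_eval`, `compare`, and the stub agent are illustrative names, not part of opencode or any other harness. The agent is just a callable that takes a system prompt and a task prompt and returns text, so a real model call would be swapped in there.

```python
from dataclasses import dataclass
from typing import Callable

# An "agent" here is any callable (system_prompt, task_prompt) -> output text.
# This signature is an assumption made for the sketch.
Agent = Callable[[str, str], str]


@dataclass
class Task:
    name: str
    prompt: str
    check: Callable[[str], bool]  # returns True if the output passes


def run_eval(agent: Agent, system_prompt: str, tasks: list[Task]) -> dict[str, bool]:
    """Run every task once under one system prompt and record pass/fail."""
    return {t.name: t.check(agent(system_prompt, t.prompt)) for t in tasks}


def pass_rate(results: dict[str, bool]) -> float:
    """Fraction of tasks that passed (0.0 when there are no tasks)."""
    return sum(results.values()) / len(results) if results else 0.0


def compare(agent: Agent, prompts: dict[str, str], tasks: list[Task]) -> dict[str, float]:
    """Pass rate per system-prompt variant, so prompt changes can be tracked."""
    return {label: pass_rate(run_eval(agent, sp, tasks)) for label, sp in prompts.items()}


def stub_agent(system_prompt: str, task_prompt: str) -> str:
    """Stand-in for a real model call, only here so the sketch runs."""
    if "code only" in system_prompt:
        return "def add(a, b): return a + b"
    return ""


TASKS = [
    Task("add", "Write add(a, b).", lambda out: "def add" in out),
    Task("sub", "Write sub(a, b).", lambda out: "def sub" in out),
]

scores = compare(
    stub_agent,
    {"baseline": "You are helpful.", "concise": "Reply with code only."},
    TASKS,
)
```

In practice the checks would run tests against the agent's changes rather than match strings, and each task would be repeated several times, since a single run per prompt variant is too noisy to compare.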

2 comments

You can't write your system prompt in opencode, there's no API to override the default anthropic.txt as far as I'm aware.

I considered creating a PR for that, but found that creating new agents instead worked fine for me.

> You can't write your system prompt in opencode

I only started looking into OpenCode yesterday, but it seems you can override the system prompts by overriding the templates they are loaded from. For example, if you create `~/.opencode/agents/build.md`, that would be used instead of the default "Build" system prompt.

At least that's what I gathered from skimming the docs earlier. It might not actually work in practice, or might not override all of it, but that seems to be how it works.

I've forked it locally. To be honest, I haven't merged upstream in a while, as I haven't seen any commits that looked relevant to my usage or would improve it; they seem to be working on the web and desktop versions, which I don't use.

The changes I've made locally are:

- Added a discuss mode with almost no tools except read file, the ask tool, and web search (only based on heuristics), plus the ability to switch from discuss to plan mode.

Experiments:

- hashline: it doesn't bring that much benefit over the default with gpt-5.4.

- tried scribe [0]: it seems worth it since it saves context space, but in the worst case it fails by reading the whole file. Probably worth it, but I would need to experiment more with it and probably rewrite some parts.

The nice thing about opencode is that it uses sqlite, so you can run experiments and then go through past conversations in code, replaying and comparing them.
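As a rough sketch of that kind of comparison: the table and column names below (`message`, `session_id`, `role`, `content`) are made up for illustration and are not opencode's actual schema, so inspect the real database (for example with `.schema` in the sqlite3 shell) before adapting it. It loads two stored conversations and finds the first turn where they differ.

```python
import sqlite3


def load_transcript(conn: sqlite3.Connection, session_id: str) -> list[tuple[str, str]]:
    """Return (role, content) pairs for one session in insertion order.

    Assumes a hypothetical table: message(id, session_id, role, content).
    """
    return conn.execute(
        "SELECT role, content FROM message WHERE session_id = ? ORDER BY id",
        (session_id,),
    ).fetchall()


def first_divergence(a: list, b: list):
    """Index of the first turn where two transcripts differ, or None if identical."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    # One transcript is a prefix of the other: they diverge where the shorter ends.
    return None if len(a) == len(b) else min(len(a), len(b))
```

Usage would be along the lines of `first_divergence(load_transcript(conn, "run-a"), load_transcript(conn, "run-b"))`, where the session ids are whatever identifiers the real database uses.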

[0] https://github.com/sibyllinesoft/scribe

It's simple enough to make a plugin to override the system prompts & make it flexible per agent.

Have to watch out for other plugins trying to do the same, though.