| My suggestions: - like the sister comment says, use the best model available. For me that has been Opus, but YMMV. Some of my colleagues prefer the OpenAI models. - iterate on the plan until it looks solid. This is where you should invest your time. - watch the model closely and make sure it writes tests first, checks that they fail, and only then proceeds to implementation. - the model should add pieces one by one, ensuring each step works before proceeding. Commit each step so you can easily retry if you need to. Each addition will involve a new plan that you go back and forth on until you're happy with it. The planning usually gets easier as the project moves along. - this is sometimes controversial, but use the best language you can target. That can be Rust, Haskell, or Erlang, depending on the context. Strong types will make a big difference. They catch silly mistakes models are liable to make. Cursor is great for trying out the different models. If Opus is what you like, I have found Claude Code to be better value, and personally I prefer the CLI to the VS Code UI that Cursor builds on. It's not a panacea though. The CLI has its own issues, like occasionally slowing to a crawl. It still gets the work done. |
I spend a lot of time on plans, but unfortunately the gotchas are in the weeds, especially when it comes to complex systems. I don't trust these models with even marginally complex, non-standard architectures (my projects center around statecharts right now, and the semantics around those can get hairy).
I git commit after each feature/bugfix, so we're on the same page here. If a feature is too big, or is made up of more than one "big" change, I chunk up the work and commit in small batches until the feature is complete.
I'm using Go for my projects right now. I can try a more strongly typed language, but that means learning a whole new language along with its gotchas and architectural constraints.
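For reference, here is a minimal sketch of how far Go's own type system can be pushed before switching languages. Everything in it is hypothetical and for illustration only: the `State` and `Event` names, the three-state table, and the `Next` helper are assumptions, and a flat machine like this sidesteps the hierarchy and parallel-region semantics that make real statecharts hairy.

```go
package main

import "fmt"

// State and Event are distinct defined types (hypothetical names), so
// passing an Event constant where a State is expected fails to compile.
type State int

const (
	Idle State = iota
	Running
	Done
)

type Event int

const (
	Start Event = iota
	Finish
	Reset
)

// transition is the (state, event) key used by the table below.
type transition struct {
	from State
	on   Event
}

// table lists every allowed transition; anything absent is rejected.
var table = map[transition]State{
	{Idle, Start}:     Running,
	{Running, Finish}: Done,
	{Done, Reset}:     Idle,
}

// Next returns the state reached from s on event e, or an error if the
// table defines no such transition. On error the state is unchanged.
func Next(s State, e Event) (State, error) {
	to, ok := table[transition{s, e}]
	if !ok {
		return s, fmt.Errorf("no transition from state %d on event %d", s, e)
	}
	return to, nil
}
```

With this shape, `Next(Start, Idle)` is a compile error because the typed constants do not match the parameter types. Go does not check that the table is exhaustive, though, so a missing or wrong transition only shows up at runtime, which is where the tests-first advice has to carry the load.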
Right now I use claude-code-router and Claude Code on top of OpenRouter, so swapping models is trivial. I mostly use Grok-4.1 Fast or Kimi 2.5. Both of these choke less than Anthropic's own Sonnet (which is still more expensive than the two alternatives).