
Fair question. I was vague only to keep the comment from ballooning.

I work at a financial startup. The codebase is a mess and very much spaghettified. One rework, which forced us to migrate our data model from 1:1 users<->loans to M:N (many-to-many), touched ~40% of the codebase... multiple times. Huge churn. It has just crossed two months of work and is only now in its very final phases.

I know what I must do:

- Introduce and enforce structs for passing context and input shapes around, so as to stop fighting with NULLs, missing keys in maps and other maddening cases that inflate your line count for no other reason than programming languages lacking higher-order constructs for well-researched and mostly solved computer science problems (sigh; not going to rant here about that but it does tick me off how we are _all_ constantly reinventing the same wheels almost every day).

- Saga discipline: if step 6/9 in a pipeline fails, revert everything up to this point, even if it was touched by a 3rd party API.

- Compensation/undo steps, including flagging / logging those that cannot be undone (sadly some of our 3rd party APIs are like that).

- Introduce a universal runtime validator library that enforces contracts -- including conditional validation, e.g. "only validate field Z if field X is present and is a positive integer and if field Y is present and is a valid UUID".

- Introduce runtime contracts / invariant enforcement.

- Introduce our own dynamic workflow engine, piggybacking off of a few free and unencumbered solutions in the language of choice's ecosystem.
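To make the saga and compensation items above concrete, here is a minimal sketch in Python (not necessarily our language, just a readable one). Everything in it is hypothetical: `Step`, `SagaResult`, `run_saga` and the step names are made up for illustration, and a real version would also need to handle a compensation that itself fails.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Step:
    name: str
    action: Callable[[dict], None]
    # None marks a step that cannot be undone, e.g. an irreversible 3rd party call.
    compensate: Optional[Callable[[dict], None]] = None


@dataclass
class SagaResult:
    ok: bool
    failed_step: Optional[str] = None
    compensated: List[str] = field(default_factory=list)
    not_undoable: List[str] = field(default_factory=list)


def run_saga(steps: List[Step], ctx: dict) -> SagaResult:
    """Run steps in order; on failure, undo completed steps in reverse order."""
    done: List[Step] = []
    for step in steps:
        try:
            step.action(ctx)
        except Exception:
            result = SagaResult(ok=False, failed_step=step.name)
            for prev in reversed(done):
                if prev.compensate is None:
                    # Cannot be reverted: flag it so a human can follow up.
                    result.not_undoable.append(prev.name)
                    continue
                prev.compensate(ctx)
                result.compensated.append(prev.name)
            return result
        done.append(step)
    return SagaResult(ok=True)


def _fail(ctx: dict) -> None:
    raise RuntimeError("ledger write failed")


# Hypothetical three-step pipeline where the last step fails.
ctx = {"log": []}
steps = [
    Step("reserve_funds",
         lambda c: c["log"].append("reserve"),
         lambda c: c["log"].append("release")),
    Step("notify_partner", lambda c: c["log"].append("notify")),  # no undo available
    Step("write_ledger", _fail),
]
result = run_saga(steps, ctx)
```

In this run `reserve_funds` gets compensated, while `notify_partner` ends up in `not_undoable`, which is the list you would log or alert on.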

...And these are just off the top of my head after I slept only 4.5h and woke up due to the heat. And each one of these can take from 2 to 6 weeks _even_ with Opus driving all coding and me reviewing and keeping it behaving within my policies and coding standards.
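Of the items in the list, the conditional validation rule is the easiest to show in miniature. A rough Python sketch follows; `ConditionalRule`, `validate` and the lowercase field names `x`, `y`, `z` are all invented for illustration, and I am assuming that a missing Z counts as an error once the condition holds.

```python
import uuid
from dataclasses import dataclass
from typing import Any, Callable, List


def is_positive_int(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool) and value > 0


def is_uuid(value: Any) -> bool:
    if not isinstance(value, str):
        return False
    try:
        uuid.UUID(value)
    except ValueError:
        return False
    return True


@dataclass(frozen=True)
class ConditionalRule:
    target: str                    # field to validate
    check: Callable[[Any], bool]   # what "valid" means for that field
    when: Callable[[dict], bool]   # the rule applies only if this returns True


def validate(payload: dict, rules: List[ConditionalRule]) -> List[str]:
    """Return a list of error strings; an empty list means the payload passed."""
    errors = []
    for rule in rules:
        if not rule.when(payload):
            continue  # precondition not met, so the target is not validated at all
        if rule.target not in payload:
            errors.append(f"{rule.target}: missing")
        elif not rule.check(payload[rule.target]):
            errors.append(f"{rule.target}: invalid")
    return errors


# "Only validate z if x is a positive integer and y is a valid UUID."
z_rule = ConditionalRule(
    target="z",
    check=lambda v: isinstance(v, str) and v != "",
    when=lambda p: is_positive_int(p.get("x")) and is_uuid(p.get("y")),
)
```

Keeping `when` as a plain predicate over the whole payload means the conditions compose with ordinary `and` / `or` instead of needing a mini-DSL.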

Me & Claude are maintaining a TODO list that is no smaller than 150 items at this point (though in fairness, at least 75% of them are fairly small and not architectural like the ones above).

I believe I know how to architect this thing but business customers and the CEO keep coming back with feature requests which of course always take priority.

When Fable 5 was around, for a mere 4 workdays, I not only got ahead of my own schedule feature-work-wise but even had the bandwidth to start tackling a few other architectural decisions and tighten them up in `CLAUDE.md`. Fable even devised an opinionated AST linter for test discipline (disallow direct DB access in our tests, only go through the domain/context modules to do so). It helped me start turning the tide.
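For a flavor of what such a test-discipline linter can look like, here is a toy version using Python's standard `ast` module. It is not the linter Fable wrote. The module name `app.db` is a placeholder for "the direct DB access layer", and this sketch only looks at absolute imports (relative imports are not resolved).

```python
import ast
from typing import List, Tuple

# Hypothetical: modules that tests must not import directly.
FORBIDDEN_MODULES = ("app.db",)


def _is_forbidden(name: str) -> bool:
    return any(name == m or name.startswith(m + ".") for m in FORBIDDEN_MODULES)


def find_direct_db_imports(source: str, filename: str = "<test>") -> List[Tuple[int, str]]:
    """Return (line number, module name) for every forbidden import in source."""
    tree = ast.parse(source, filename=filename)
    violations = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            candidates = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            base = node.module or ""  # module is None for "from . import x"
            if _is_forbidden(base):
                candidates = [base]
            else:
                # Catches "from app import db", where the forbidden part is the alias.
                candidates = [f"{base}.{a.name}" if base else a.name for a in node.names]
        else:
            continue
        violations.extend((node.lineno, n) for n in candidates if _is_forbidden(n))
    return sorted(violations)


sample = (
    "import app.db\n"
    "from app import db\n"
    "from app.loans import create_loan\n"
    "from app.db.session import Session\n"
)
violations = find_direct_db_imports(sample)
```

Run over the test directory in CI or a pre-commit hook and fail on a non-empty result, and the rule stops depending on anyone (human or agent) remembering it.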

This all went out the window when I had to go back to Opus 4.8. It's still _very_ good, mind you, but at times it does feel like I am a special-education teacher. It forgets disciplines we discussed and codified probably 15-20 times at this point, forgets important project context, and attempts to reintroduce subtle bugs, among other things.

My next move, with or without Fable, is to continue the work it started: enrich the AST-based linters and convert the theoretical prompt-based guard-rails into actual LLM hooks and compiler / runtime-at-startup hooks, so the agent cannot ignore them.

I don't enjoy harness engineering, but the interesting and very positive effect has been that it helped me think more like an architect and less like a code monkey. I hugely appreciate that, and I only realized I had been missing it for years once it started happening again.

Hope that helps put things in context.