This seems to be another over-optimization for AI that many people are trying to get into. The LLMs improve, your setup is deprecated, and you've wasted time optimizing for a slight edge. TL;DR: you trade time for a slight edge.
I don't disagree, though harness engineering is a real discipline that even the best AI labs put their brightest minds on, and the loop itself doesn't get deprecated when models improve.