I believe there's a level of diminishing returns. Sure, SOTA will probably always benchmark better than local models. But do we need it? That's the question that the likes of OpenAI and Anthropic should be worried about.
The difference won't be in the individual tasks. It'll be in the scale of job they can take on and how you interact with the model. Think of pairing with a junior vs replacing a full delivery team: that's the sort of difference we'll be looking at. We'll be able to get closer to the latter by being more clever with harnesses, I reckon, but the frontier labs will run ahead because for any given harness trick they can lean harder on model smarts.
True, but my point is that if/when local models get to the point where they're capable of doing the "delivery team" work, what's next? What can these bigger SOTA models offer? And especially, what can they offer above and beyond what you might be able to get from the much cheaper models that the open models are based on?