And no, they're not as capable as SOTA models. Not by a long shot.
However, they can cut your token spend a lot if you route the low-hanging fruit to them: summaries, translations, stuff like that.
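For what it's worth, the routing idea can be sketched roughly like this. Everything here is made up for illustration: `LOCAL_TASKS`, `pick_backend`, and `route` are hypothetical names, and `local_fn` / `cloud_fn` stand in for whatever actually calls your local and cloud models.

```python
from typing import Callable

# Hypothetical labels for the "low-hanging fruit"; adjust to your own workload.
LOCAL_TASKS = {"summarize", "translate"}


def pick_backend(task: str) -> str:
    """Return "local" for simple tasks and "cloud" for everything else."""
    return "local" if task.strip().lower() in LOCAL_TASKS else "cloud"


def route(
    task: str,
    prompt: str,
    local_fn: Callable[[str], str],
    cloud_fn: Callable[[str], str],
) -> str:
    """Send the prompt to the local model when the task is simple enough."""
    backend = local_fn if pick_backend(task) == "local" else cloud_fn
    return backend(prompt)
```

Only the prompts that fall through to `cloud_fn` cost you paid tokens, so the savings depend entirely on how much of your traffic is the simple stuff.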
But looking at it, isn't it just an interface to cloud LLMs? The OP's question was about local models.