Why do we need two closed-source, API-only options?

It's limiting not to be able to call it through routers like LiteLLM, and to have to set up a new billing account.

Not to mention local: these are presumably small models, and I'd take 800 tokens/sec locally over 4000 tokens/sec with latency any day.
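
Rough math on that trade-off, as a sketch only: the 800 and 4000 tokens/sec figures are the ones above, but the 0.5 s round-trip latency is a made-up number for illustration, and the local path is assumed to have no added latency.

```python
def total_seconds(tokens: int, tokens_per_sec: float, latency_sec: float = 0.0) -> float:
    """Wall-clock time for one response: fixed latency plus generation time."""
    return latency_sec + tokens / tokens_per_sec


def break_even_tokens(slow_tps: float, fast_tps: float, fast_latency_sec: float) -> float:
    """Response length at which the faster-but-laggy option catches up.

    Solves tokens / slow_tps == fast_latency_sec + tokens / fast_tps.
    """
    return fast_latency_sec / (1.0 / slow_tps - 1.0 / fast_tps)


if __name__ == "__main__":
    ASSUMED_LATENCY_SEC = 0.5  # hypothetical API round trip, not a measured value

    for n in (100, 500, 2000):
        local = total_seconds(n, 800)
        remote = total_seconds(n, 4000, ASSUMED_LATENCY_SEC)
        print(f"{n} tokens: local {local:.3f}s, remote {remote:.3f}s")

    cutoff = break_even_tokens(800, 4000, ASSUMED_LATENCY_SEC)
    print(f"break-even at about {cutoff:.0f} tokens")
```

Under that assumed latency the slower local option wins for anything shorter than about 500 tokens; with a different latency the cutoff scales linearly (roughly 1000 tokens per second of latency for these two rates).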