Hacker News
by sneak 500 days ago
There are APIs that use a very small classifier model to estimate the complexity of each request, then route it to different APIs or models based on the result.

This way the cheap or local model gets used automatically, without the API client having to know anything about it, and the proxy sends requests out to an expensive big model only when necessary.
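
A rough sketch of the idea in Python, in case it helps. Everything here is made up for illustration: `classify_complexity` is a keyword/length heuristic standing in for the small classifier model, and `cheap_local_model` / `expensive_big_model` are stubs where real backends would be called. It is not how any particular product does it.

```python
from dataclasses import dataclass
from typing import Callable, Dict


def classify_complexity(prompt: str) -> str:
    """Hypothetical stand-in for the small classifier model.

    A real router would run a small model here; this heuristic only
    looks at prompt length and a few keywords.
    """
    hard_markers = ("prove", "refactor", "step by step", "analyze")
    lowered = prompt.lower()
    if len(prompt.split()) > 200 or any(m in lowered for m in hard_markers):
        return "complex"
    return "simple"


def cheap_local_model(prompt: str) -> str:
    # Stub: a real implementation would call a local or low-cost model.
    return f"[local] {prompt[:40]}"


def expensive_big_model(prompt: str) -> str:
    # Stub: a real implementation would call a large hosted model.
    return f"[big] {prompt[:40]}"


@dataclass
class RoutingProxy:
    """Single entry point for clients; the routing decision stays hidden."""

    classifier: Callable[[str], str]
    backends: Dict[str, Callable[[str], str]]

    def complete(self, prompt: str) -> str:
        label = self.classifier(prompt)   # cheap classification step
        backend = self.backends[label]    # pick the backend for that label
        return backend(prompt)


proxy = RoutingProxy(
    classifier=classify_complexity,
    backends={"simple": cheap_local_model, "complex": expensive_big_model},
)

print(proxy.complete("What is the capital of France?"))
print(proxy.complete("Prove that sqrt(2) is irrational."))
```

The client only ever calls `proxy.complete(...)`, so the classifier or the backends can be swapped out without touching client code.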