Hacker News
by unraveller 504 days ago
What's this stuff about the model catering to ‘80%’ of generative AI tasks? What model do they expect me to use the other 20% of the time, when my question needs reasoning smarts?
4 comments

There are APIs that use a very small model to classify the complexity of a request, then route it to different APIs or models based on that classifier's result.

This way you get cheap/local inference automatically, without the API client having to know anything about it, and the proxy sends requests out to an expensive big model only when necessary.
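To make the idea concrete, here's a rough sketch of what such a proxy could look like. Everything in it is made up for illustration: `cheap_model` and `expensive_model` are stand-ins for real model clients, and `complexity_score` is a keyword/length heuristic standing in for the small classifier model (a real router would presumably use a trained model there). The 0.25 threshold is arbitrary too.

```python
from dataclasses import dataclass
from typing import Callable


# Hypothetical backends: placeholders for real model clients.
def cheap_model(prompt: str) -> str:
    return f"[small] {prompt[:40]}"


def expensive_model(prompt: str) -> str:
    return f"[large] {prompt[:40]}"


# Assumed signals that a prompt needs reasoning; purely illustrative.
REASONING_HINTS = ("prove", "why", "step by step", "derive", "debug")


def complexity_score(prompt: str) -> float:
    """Toy stand-in for the small classifier model; returns a score in [0, 1]."""
    text = prompt.lower()
    hint_hits = sum(hint in text for hint in REASONING_HINTS)
    length_term = min(len(text.split()) / 200, 1.0)
    return min(0.3 * hint_hits + 0.5 * length_term, 1.0)


@dataclass
class Router:
    threshold: float = 0.25  # arbitrary cutoff, would need tuning
    small: Callable[[str], str] = cheap_model
    large: Callable[[str], str] = expensive_model

    def route(self, prompt: str) -> str:
        # The caller only ever sees route(); model choice stays inside the proxy.
        score = complexity_score(prompt)
        backend = self.large if score >= self.threshold else self.small
        return backend(prompt)


if __name__ == "__main__":
    router = Router()
    print(router.route("What is the capital of France?"))
    print(router.route("Prove step by step why the sum of two odd numbers is even"))
```

The client just calls `route()` and never picks a model itself, which is the whole point of doing it in the proxy. The hard part is obviously the classifier, which this sketch hand-waves.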

Crazy idea: a small super fast model whose only job is to decide which model to send your task to.
Not so crazy, it sorta exists: https://withmartian.com. So it's probably a good idea to pursue.
This already exists but I forgot the name. It’s an API proxy.
Take your pick based on your use cases and needs?
Mistral Large