by kevinsmith51
20 days ago
I would use a capability-based router so you can use a blend of OSS models, i.e. for each prompt, pick the least capable model that still has what the prompt needs (the appropriate tool-calling support, etc.). You can even include a frontier provider subscription in the mix: with routing, a $20/mo subscription gets you almost as many tokens, at very close to the same benchmark results, as a $200/mo subscription. It's easier with Claude's bearer token setup, but I have seen people do it with OpenAI subscriptions as well.
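For anyone wondering what "least capable model per prompt" could look like, here is a rough sketch, not any particular router's implementation. The model names, tier numbers, and capability tags are all made up for illustration; a real setup would fill the catalog from whatever models and subscriptions you actually have.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str                # hypothetical label, not a real model ID
    tier: int                # rough capability rank; lower = weaker and cheaper
    capabilities: frozenset  # hypothetical capability tags


# Assumed catalog: three OSS models plus one frontier subscription.
CATALOG = [
    Model("small-oss", 1, frozenset({"chat"})),
    Model("mid-oss", 2, frozenset({"chat", "tools"})),
    Model("large-oss", 3, frozenset({"chat", "tools", "long_context"})),
    Model("frontier-subscription", 4,
          frozenset({"chat", "tools", "long_context", "vision"})),
]


def route(required, min_tier=1, catalog=CATALOG):
    """Return the lowest-tier model that has every required capability."""
    needed = set(required)
    candidates = [
        m for m in catalog
        if m.tier >= min_tier and needed <= m.capabilities
    ]
    if not candidates:
        raise LookupError(f"no model supports {sorted(needed)}")
    return min(candidates, key=lambda m: m.tier)


if __name__ == "__main__":
    print(route({"chat"}).name)              # plain chat goes to the smallest model
    print(route({"chat", "tools"}).name)     # tool calls need at least the mid model
    print(route({"vision"}).name)            # only the frontier entry has vision
```

The `min_tier` argument is one way to force a harder prompt up the ladder even when a smaller model technically has the right capabilities. How you decide that a prompt is "hard" is the real work and is left out here.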