| Thanks! Those are all good questions. Let me respond to them one by one, > Can I just use arch for routing between LLMs Yes, you can use arch_config.yaml file to select between LLMs. In fact we have a demo on llm_routing [1] that you can try. Here how you can specify different LLMs in our config, llm_providers:
- name: gpt-4o-mini
access_key: $OPENAI_API_KEY
provider: openai
model: gpt-4o-mini
default: true
- name: gpt-3.5-turbo-0125
access_key: $OPENAI_API_KEY
provider: openai
model: gpt-3.5-turbo-0125
- name: gpt-4o
access_key: $OPENAI_API_KEY
provider: openai
model: gpt-4o
- name: ministral-3b
access_key: $MISTRAL_API_KEY
provider: mistral
model: ministral-3b-latest
> And what LLMs do you supportWe currently support mistral and openai. And for both of them we support streaming interface. We do expose openai complaint v1/chat interface so any chat UI that works with openai should work with us as well. We do ship demos with gradio sample application. > And what about key management? Do I manage access keys myself? None of your clients need to manage access keys. Upon receipt of request our filter will appropriate LLM from arch_config and pick relevant access_key and modify request with access_key from arch_config before sending request to upstream LLM [2]. [1] https://github.com/katanemo/archgw/tree/main/demos/llm_routi... [2] https://github.com/katanemo/archgw/blob/main/crates/llm_gate... |