| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by adilhafeez 583 days ago

Thanks! Those are all good questions. Let me respond to them one by one,

> Can I just use arch for routing between LLMs

Yes, you can use arch_config.yaml file to select between LLMs. In fact we have a demo on llm_routing [1] that you can try. Here how you can specify different LLMs in our config,

  llm_providers:
    - name: gpt-4o-mini
      access_key: $OPENAI_API_KEY
      provider: openai
      model: gpt-4o-mini
      default: true

    - name: gpt-3.5-turbo-0125
      access_key: $OPENAI_API_KEY
      provider: openai
      model: gpt-3.5-turbo-0125

    - name: gpt-4o
      access_key: $OPENAI_API_KEY
      provider: openai
      model: gpt-4o

    - name: ministral-3b
      access_key: $MISTRAL_API_KEY
      provider: mistral
      model: ministral-3b-latest

> And what LLMs do you support

We currently support mistral and openai. And for both of them we support streaming interface. We do expose openai complaint v1/chat interface so any chat UI that works with openai should work with us as well. We do ship demos with gradio sample application.

> And what about key management? Do I manage access keys myself?

None of your clients need to manage access keys. Upon receipt of request our filter will appropriate LLM from arch_config and pick relevant access_key and modify request with access_key from arch_config before sending request to upstream LLM [2].

[1] https://github.com/katanemo/archgw/tree/main/demos/llm_routi...

[2] https://github.com/katanemo/archgw/blob/main/crates/llm_gate...