| LiteLLM proxy is a pretty good project on its own. I am obviously biased because we are competitors. Here are my thoughts.

* LiteLLM is declarative and lets you define everything in YAML
* Bricks is not declarative and you control everything via API
* LiteLLM does not have a UI
* Bricks has a non-open-source UI
* LiteLLM is written in Python
* Bricks is written in Golang
* LiteLLM does not persist rate limits, so it can't accurately rate limit across distributed instances
* BricksLLM lets you create API keys with accurate rate limits and spend limits that work across distributed instances
* LiteLLM provides high-level spend metrics on API keys
* Bricks provides granular spend, request and latency metrics broken down by model and custom id
* LiteLLM is not compatible with the OpenAI SDK. You have to adopt the LiteLLM Python client
* Bricks is designed to be compatible with the OpenAI SDK
* LiteLLM only supports OpenAI completion and embedding
* Bricks supports almost all OpenAI endpoints except image and audio
* LiteLLM has exact request caching
* Bricks does not have caching for now
* LiteLLM has OpenTelemetry integration
* Bricks has statsd integration
* LiteLLM supports orchestration of API calls: when this API call fails, use this model or call this API endpoint instead
* Bricks does not support orchestration of API calls since I believe that it is something that the client should handle |
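
To illustrate the last point, client-side fallback can be quite small. This is only a minimal sketch under stated assumptions: `call_with_fallback`, `fake_send`, and the model names are hypothetical and are not part of either project. `fake_send` stands in for a real request so the example runs without network access.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def call_with_fallback(models: Sequence[str], send: Callable[[str], T]) -> T:
    """Try each model in order and return the first successful result.

    `send` is a hypothetical caller-supplied function that performs the
    request for one model and raises an exception on failure.
    """
    if not models:
        raise ValueError("at least one model is required")
    last_error = None
    for model in models:
        try:
            return send(model)
        except Exception as exc:  # real code should catch the SDK's specific errors
            last_error = exc
    raise RuntimeError(f"all models failed: {list(models)}") from last_error


# Stand-in for a real request (assumption: the primary model is down).
def fake_send(model: str) -> str:
    if model == "primary-model":
        raise ConnectionError("primary is unavailable")
    return f"response from {model}"


result = call_with_fallback(["primary-model", "backup-model"], fake_send)
print(result)
```

In practice `send` would probably wrap something like `client.chat.completions.create(model=model, ...)` from the OpenAI SDK, and would catch that SDK's error types rather than a bare `Exception`.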
```
import openai

client = openai.OpenAI(
    api_key="anything",  # proxy key - if set
    base_url="http://0.0.0.0:8000",  # proxy url
)

# request sent to model set on litellm proxy
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user", "content": "this is a test request, write a short poem"}
    ],
)

print(response)
```

Docs - https://docs.litellm.ai/docs/proxy/quick_start