by harlanlewis
759 days ago
I found myself making these types of comparisons over and over again to keep up with new models and pricing changes, so I made a Google Sheet for it with some charts to visualize outliers and trends in terms of model capability, throughput, and token cost. It's sorted by the Artificial Analysis Index [1], which incorporates both static benchmarks (MMLU) and dynamic crowdsourced Elo (Chatbot Arena). Hope someone finds it useful: https://docs.google.com/spreadsheets/d/1foc98Jtbi0-GUsNySddv...

Specific to Groq - the service is an incredible outlier for speed and cost, but even for quite small projects I've run into rate limits and similar errors. Great option for something like a bring-your-own-keys personal chatbot, but I wouldn't build a service on it. Groq Llama 3 70B and Groq Mixtral 8x7B are in the sheet for comparison but omitted from charts.

[1]: https://artificialanalysis.ai/leaderboards/providers