Hacker News
by alextttty 668 days ago
Thank you! Yeah, great source. Do they track throughput for open-source models and inference engines? That's the kind of data I want to find as well.
1 comment

For throughput data, you need to actually run prompts to gather it, which racks up costs fast, and the numbers can vary with input prompt length. The two sources I use are OpenRouter's provider breakdown [1] and Unify's runtime benchmarks [2].
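If you do want to measure it yourself, here is a rough sketch of the idea: time a generation call and divide output tokens by elapsed seconds, repeated over a few prompt lengths. Everything named here (`generate`, `measure_throughput`, `throughput_by_prompt_length`, the `filler` prompt) is hypothetical; `generate` stands in for whatever client call you use and is assumed to return the list of output tokens. It is not how OpenRouter or Unify compute their numbers.

```python
import time
from typing import Callable, Dict, Iterable, List


def measure_throughput(
    generate: Callable[[str], List[str]],
    prompt: str,
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Return output tokens per second for a single generation call.

    `generate` is a hypothetical stand-in for a real client call;
    it is assumed to return the generated tokens as a list.
    """
    start = clock()
    tokens = generate(prompt)
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return len(tokens) / elapsed


def throughput_by_prompt_length(
    generate: Callable[[str], List[str]],
    prompt_lengths: Iterable[int],
    filler: str = "word ",
    clock: Callable[[], float] = time.perf_counter,
) -> Dict[int, float]:
    """Measure throughput once per prompt length (in repeated filler words)."""
    return {
        n: measure_throughput(generate, filler * n, clock=clock)
        for n in prompt_lengths
    }
```

This times the whole call, so time to first token is folded into the result, and a single run per length is noisy. Every measurement is also a paid request, which is where the cost comes from.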

[1]: https://openrouter.ai/models/meta-llama/llama-3.1-70b-instru...

[2]: https://unify.ai/benchmarks/llama-3.1-70b-chat