by fitzn
810 days ago
What open source model are you using when you hit Groq? Last week I benchmarked some of my larger-context-window queries, and Groq's API took 1.6 seconds versus 1.8 to 2.2 seconds for OpenAI's GPT-3.5-turbo, so it wasn't much faster. I almost emailed their support to see if I was doing something wrong. I'd love to hear any details about your workload or the complexity of your queries.
In most cases, overall response time is dominated by output, since each output token is ~100x slower to produce than an input token is to process.
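
To make that concrete, here's a rough back-of-the-envelope sketch in Python. The function name `estimate_latency`, the per-token input cost, and the example token counts are all made up for illustration; only the ~100x output-to-input ratio comes from the comment above. Real providers will differ, so treat this as a toy model and not a measurement.

```python
def estimate_latency(input_tokens, output_tokens,
                     input_s_per_token=0.0001,  # assumed, not measured
                     output_ratio=100.0,        # the ~100x figure from the comment
                     overhead_s=0.0):           # network/queueing, ignored by default
    """Toy model: total time = overhead + input processing + output generation."""
    input_s = input_tokens * input_s_per_token
    output_s = output_tokens * input_s_per_token * output_ratio
    total_s = overhead_s + input_s + output_s
    return total_s, input_s, output_s


if __name__ == "__main__":
    # Hypothetical large-context query: lots of input, a short answer.
    total_s, input_s, output_s = estimate_latency(input_tokens=4000, output_tokens=200)
    print(f"input:  {input_s:.2f}s")
    print(f"output: {output_s:.2f}s")
    print(f"output share of total: {output_s / total_s:.0%}")
```

Under these assumed numbers, 200 output tokens cost about five times as much wall-clock time as 4000 input tokens, so a benchmark that compares providers on long prompts with short answers mostly measures output speed plus fixed overhead.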