by joelm
1125 days ago
Latency has been the biggest challenge for me. The blog post cites "two to 15+ seconds" for responses, but via the OpenAI API I've been seeing more like 45-60 seconds (using GPT-3.5-turbo or GPT-4 in chat mode). Note that this is with ~3500 tokens total. I've had to adapt our product's UI extensively to that latency. Maybe I should start showing funny messages while the user waits (like I've seen Porkbun do when you pay for domain names).
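For what it's worth, the "funny messages while waiting" idea could look roughly like the sketch below. This is only an illustration, not what our product does: `with_wait_messages`, `fake_slow_completion`, and the message strings are hypothetical names, and the slow API call is replaced by a short `asyncio.sleep` so the example runs on its own.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


async def with_wait_messages(
    task: Awaitable[T],
    messages: Sequence[str],
    show: Callable[[str], None],
    interval: float = 3.0,
) -> T:
    """Await `task`, passing the next message to `show` every `interval` seconds."""
    future = asyncio.ensure_future(task)
    i = 0
    while True:
        # Cycle through the messages for as long as the task is pending.
        show(messages[i % len(messages)])
        i += 1
        done, _ = await asyncio.wait({future}, timeout=interval)
        if done:
            return future.result()


async def fake_slow_completion() -> str:
    # Hypothetical stand-in for the real (slow) chat completion request.
    await asyncio.sleep(0.25)
    return "response text"


def demo() -> Tuple[str, List[str]]:
    shown: List[str] = []
    result = asyncio.run(
        with_wait_messages(
            fake_slow_completion(),
            ["Thinking...", "Still working...", "Almost there..."],
            shown.append,  # a real UI would update a status label here
            interval=0.1,
        )
    )
    return result, shown
```

In a real UI, `show` would update a spinner or status line instead of appending to a list, and the interval would be a few seconds rather than a fraction of one.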
...this is also why we can't wait for other vendors to get SOC I/II clearance and, I guess, to eventually fine-tune our own model, so we're not stuck in situations like this.