Hacker News
by bt1a, 127 days ago
This is most likely an inference-serving problem, a matter of capacity and latency, given that Opus X and the latest GPT models available in the API have always responded quickly and slowly, respectively.