Hacker News
by Alyx1337, 916 days ago
Thanks! There are a few ways to shave off the latency: hosting the models locally, using quantized or smaller models, and streaming data between steps instead of running the tasks sequentially.
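To illustrate the streaming point, here is a minimal sketch in Python. The two stages (`stage_one`, `stage_two`) and the `events` log are hypothetical stand-ins, not the actual pipeline discussed in this thread. The idea is that with generators the second stage can start on the first chunk right away, instead of waiting for the first stage to finish every chunk.

```python
from typing import Iterable, Iterator, List


def stage_one(chunks: Iterable[str], events: List[str]) -> Iterator[str]:
    # Hypothetical first step; logs each chunk as it is processed.
    for chunk in chunks:
        events.append(f"one:{chunk}")
        yield chunk.upper()


def stage_two(items: Iterable[str], events: List[str]) -> Iterator[str]:
    # Hypothetical second step; consumes items as they arrive.
    for item in items:
        events.append(f"two:{item}")
        yield f"out({item})"


def run_sequential(chunks: Iterable[str], events: List[str]) -> List[str]:
    # Finish stage one for every chunk before stage two starts.
    intermediate = list(stage_one(chunks, events))
    return list(stage_two(intermediate, events))


def run_streaming(chunks: Iterable[str], events: List[str]) -> List[str]:
    # Chain the generators so each chunk flows through both stages in turn.
    return list(stage_two(stage_one(chunks, events), events))


def work_before_first_output(events: List[str]) -> int:
    # Count how many steps were logged before stage two first ran.
    for index, event in enumerate(events):
        if event.startswith("two:"):
            return index
    return len(events)


if __name__ == "__main__":
    sequential_events: List[str] = []
    streaming_events: List[str] = []
    run_sequential(["a", "b", "c"], sequential_events)
    run_streaming(["a", "b", "c"], streaming_events)
    print("sequential:", work_before_first_output(sequential_events))
    print("streaming:", work_before_first_output(streaming_events))
```

Both versions do the same total work and produce the same output. What changes is how much has to happen before the first result comes out, which is usually what the perceived latency depends on.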