Hacker News
by
cloudking
911 days ago
Wonderful hack. The overall response latency is the only thing that hurts the UX; if you can get the response time down, it would be epic. Nice work.
1 comment
Alyx1337
911 days ago
Thanks! There are ways to shave off the latency: hosting locally, using quantized/smaller models, and streaming data between stages instead of running the tasks sequentially.
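To make the streaming point concrete, here is a rough Python sketch, not the project's actual code. The three stages (`transcribe`, `generate`, `synthesize`) are hypothetical stand-ins I made up for illustration; the thread does not say what the real pipeline looks like. It only counts how many stage calls happen before the first result is ready, which is a crude proxy for time to first response.

```python
from typing import Callable, Iterable, Iterator, List

# Hypothetical stages: each call stands in for one unit of work on one chunk.
def transcribe(chunk: str) -> str:
    return chunk.upper()

def generate(chunk: str) -> str:
    return chunk + "!"

def synthesize(chunk: str) -> str:
    return f"<{chunk}>"

Stage = Callable[[str], str]
STAGES: List[Stage] = [transcribe, generate, synthesize]

def run_sequential(chunks: Iterable[str], stages: List[Stage], log: List[str]) -> List[str]:
    """Each stage finishes every chunk before the next stage starts."""
    items = list(chunks)
    for stage in stages:
        processed = []
        for item in items:
            log.append(stage.__name__)  # record one unit of work
            processed.append(stage(item))
        items = processed
    return items

def run_streaming(chunks: Iterable[str], stages: List[Stage], log: List[str]) -> Iterator[str]:
    """Each chunk goes through all stages and is yielded as soon as it is done."""
    for item in chunks:
        for stage in stages:
            log.append(stage.__name__)  # record one unit of work
            item = stage(item)
        yield item

if __name__ == "__main__":
    chunks = ["a", "b", "c"]

    seq_log: List[str] = []
    run_sequential(chunks, STAGES, seq_log)
    print("sequential: stage calls before first output =", len(seq_log))

    stream_log: List[str] = []
    first = next(run_streaming(chunks, STAGES, stream_log))
    print("streaming: stage calls before first output =", len(stream_log), "first =", first)
```

With three chunks and three stages, the sequential version does all nine stage calls before anything is available, while the streaming version has its first result after three. This sketch only reorders the work; in a real system the stages would presumably also run concurrently, which would cut total time as well.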