Hacker News
user: george_123
created: 2019-12-02
karma: 9

submissions:

Loading Llama-2 70B 20x faster with Anyscale Endpoints
3 points | 0 comments
0 points | 0 comments
0 points | 0 comments
How continuous batching improves LLM inference throughput 23x
1 point | 0 comments
0 points | 0 comments
0 points | 0 comments
Ant Group – scaling to 1.37M QPS on Ray
3 points | 1 comment
0 points | 0 comments