Hacker News
user: george_123
created: 2019-12-02
karma: 9
submissions:
Loading Llama-2 70B 20x faster with Anyscale Endpoints
3 points
|
0 comments
0 points
|
0 comments
0 points
|
0 comments
How continuous batching improves LLM inference throughput 23x
1 point
|
0 comments
0 points
|
0 comments
0 points
|
0 comments
Ant Group – scaling to 1.37M QPS on Ray
3 points
|
1 comment
0 points
|
0 comments