user: zhwu
created: 2022-10-05
karma: 37
submissions:
A collection of reproducible LLM inference engine benchmarks: SGLang vs. vLLM
1 point | 0 comments
Efficient GPU Resource Management for ML Workloads Using SkyPilot, Kueue on GKE
2 points | 0 comments
New Recipe: Serving Llama-2 with vLLM's OpenAI-Compatible API Server
1 point | 0 comments
Train Your Own Vicuna on Llama-2
3 points | 0 comments
Guide on fine-tuning your own Vicuna on Llama-2
9 points | 0 comments
Serving LLM 24x Faster on the Cloud with vLLM and SkyPilot
12 points | 1 comment