Hacker News
user: zhwu
created: 2022-10-05
karma: 37

submissions:

0 points | 0 comments
A collection of reproducible LLM inference engine benchmarks: SGLang vs. vLLM
1 point | 0 comments
0 points | 0 comments
0 points | 0 comments
Efficient GPU Resource Management for ML Workloads Using SkyPilot, Kueue on GKE
2 points | 0 comments
0 points | 0 comments
New Recipe: Serving Llama-2 with VLLM's OpenAI-Compatible API Server
1 point | 0 comments
Train Your Own Vicuna on Llama-2
3 points | 0 comments
Guide on fine-tuning your own Vicuna on Llama-2
9 points | 0 comments
0 points | 0 comments
0 points | 0 comments
0 points | 0 comments
0 points | 0 comments
Serving LLM 24x Faster on the Cloud with VLLM and SkyPilot
12 points | 1 comment
0 points | 0 comments
0 points | 0 comments