User: zhwu | HN Mirror

user: zhwu
created: 2022-10-05
karma: 37

submissions:

VRAM Ghost Busting: Who You Gonna Close()?

4 points | 0 comments

0 points | 0 comments

A collection of reproducible LLM inference engine benchmarks: SGLang vs. vLLM

1 point | 0 comments

0 points | 0 comments

0 points | 0 comments

Efficient GPU Resource Management for ML Workloads Using SkyPilot, Kueue on GKE

2 points | 0 comments

0 points | 0 comments

New Recipe: Serving Llama-2 with VLLM's OpenAI-Compatible API Server

1 point | 0 comments

Train Your Own Vicuna on Llama-2

3 points | 0 comments

Guide on fine-tuning your own Vicuna on Llama-2

9 points | 0 comments

0 points | 0 comments

0 points | 0 comments

0 points | 0 comments

0 points | 0 comments

Serving LLM 24x Faster on the Cloud with VLLM and SkyPilot

12 points | 1 comment

0 points | 0 comments