


	by joaquincabezas 945 days ago
	Thanks a lot for the material, Varun. It's a neat presentation, and the exhaustive computations make it easy to follow. A question on the serving part: which would you recommend, vLLM, DeepSpeed, TensorRT-LLM... ? Thanks!

1 comment

Thanks!

vLLM for quick setup, TRT-LLM for best performance. Both are available on https://baseten.co/.