Hacker News
by
joaquincabezas
945 days ago
Thanks a lot for the material, Varun. Neat presentation, with exhaustive computations that make it easy to follow. Question on the serving part: vLLM, DeepSpeed, TensorRT-LLM...? Thanks!
varunshenoy
945 days ago
Thanks!
vLLM for quick setup, TRT-LLM for best performance. Both available on https://baseten.co/.