


	by philipkiely 321 days ago
	TRT-LLM has its challenges from a DX perspective, and yeah, for multimodal we still use vLLM pretty often. But for the kind of traffic we are trying to serve -- high volume and latency sensitive -- TRT-LLM consistently wins head-to-head in our benchmarking, and we have invested a ton of dev work in the tooling around it.