by tough 311 days ago
but unlike CUDA, there are no custom kernels for inference in the vLLM repo...

I think