Hacker News
vLLM: An Efficient Inference Engine for Large Language Models [pdf] (www2.eecs.berkeley.edu)
2 points by ankitg12 12 days ago