Hacker News new | ask | show | jobs
by arnaudsm 475 days ago
It's the easiest to setup, but you can get 2x-6x faster with TGI and vLLM depending on the scenario.
1 comments

vllm isn't even hard to setup!

I find it so funny that HN is sitting in the stoneage with LLM inference.

Meanwhile I'm here with sillytavern hooked to my own vllm server, getting crazy fast performance on my models and having a complete suite of tools for using LLMs.

Most folks on here have never heard of sillytavern, or oobabooga, or any of the other projects for LLM UI/UX (LM-studio). It's insanity that there hasn't been someone like ADOBE building a pro/prosumer UI for LLMs yet.