Hacker News new | ask | show | jobs
by sosodev 85 days ago
Most people are using something in the llama family for inference. Llama server is my go to. Unsloth guides describe how to configure inference for your model of choice.