This agrees with my own experience. I have a 4070 Super, which of course is nothing to brag about, but the tokens per second (tps) I get with a quantized 27B model is miserable. I could go down to 12B or even smaller, but that would sacrifice quality. I could also upgrade my gear, but I realize that however much I spend, the experience is not going to be as smooth as off-the-shelf LLM products, and it is definitely not worth the cost.

Of course it is nice to have an LLM running locally where nobody gets to see your data. But I don't see the point in spending thousands of dollars to do that.