


	by 1ba9115454 544 days ago
	I can't imagine this setup will get more than 1 token per second. I would love to see DeepSeek running on-premises with a decent TPS.


It says 4.25 TPS in the first para.

Honest mistake. Some people think HN is just a series of short tweets and haven’t realized they are links yet!

It's the modern way. Why read when you can just imagine facts straight out of your own brain?

I agree but also found your comment funny in the context of LLMs. People love getting facts straight out of their models.

4.25 TPS is enough for a lot of use cases.

That's still pretty slow, considering there's that "thinking" phase.
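For a rough sense of what the thinking phase costs at that rate, here is a back-of-the-envelope sketch. The 4.25 TPS figure is the one quoted in this thread; the token counts are made-up illustrations, not measurements, and the estimate assumes a steady decode rate and ignores prompt processing time.

```python
def response_seconds(thinking_tokens: int, answer_tokens: int, tps: float) -> float:
    """Seconds to generate thinking plus answer tokens at a steady decode rate."""
    if tps <= 0:
        raise ValueError("tps must be positive")
    return (thinking_tokens + answer_tokens) / tps


TPS = 4.25  # rate quoted in the thread

# Hypothetical (thinking, answer) token counts, chosen only for illustration.
for thinking, answer in [(0, 300), (1000, 300), (4000, 300)]:
    seconds = response_seconds(thinking, answer, TPS)
    print(f"{thinking:>5} thinking + {answer} answer tokens: {seconds / 60:.1f} min")
```

The point is only that wait time scales linearly with the number of thinking tokens, so a long reasoning trace dominates the total at a few tokens per second.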

True, but 4.25 is the number we all want to know.

You can get 1 t/s on a Raspberry Pi.

That has nothing to do with the full 671B model; the Ollama models are distilled Qwen2.5.

I appreciate both of these comments, thank you both.