HN Mirror



	by jazzyjackson 593 days ago
	I'm returning my 96GB M2 Max. It can run unquantized Llama 3.3 70B, but tokens per second is slow as molasses, and I still couldn't find any use for it. I just kept going back to Perplexity when I actually needed to find an answer to something.

1 comment

Interesting. You're using the FP8 version, I'm guessing? How many tokens/s are you getting, and with which software? MLX?
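
For context on why FP8 is a reasonable guess, here is a rough back-of-the-envelope sketch. It counts weights only and assumes a nominal 70e9 parameters; the KV cache, activations, and whatever memory the OS keeps for itself are ignored, so real headroom is smaller than this suggests. The function name and the list of precisions are just illustrative.

```python
def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Approximate memory for the model weights alone, in decimal GB."""
    return n_params * bits_per_param / 8 / 1e9


N_PARAMS = 70e9          # assumption: nominal "70B" parameter count
UNIFIED_MEMORY_GB = 96   # machine size quoted in the parent comment

for label, bits in [("FP16/BF16", 16), ("FP8", 8), ("4-bit", 4)]:
    gb = weight_memory_gb(N_PARAMS, bits)
    verdict = "fits" if gb < UNIFIED_MEMORY_GB else "does not fit"
    print(f"{label}: ~{gb:.0f} GB of weights, {verdict} in {UNIFIED_MEMORY_GB} GB")
```

By this estimate 16-bit weights alone need about 140 GB, while 8-bit weights need about 70 GB, which is the largest of the three that would fit on a 96GB machine.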