by seemaze 61 days ago
I squeeze Qwen3.5-122B-A10B at Q6 into 128GB. It's a great model.
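For a rough sense of why that fits, here is a back-of-the-envelope sketch. It assumes "Q6" means a llama.cpp-style Q6_K quantization at about 6.56 bits per weight, and that the "122B" in the model name is the total parameter count. Neither is stated in the thread, and the estimate ignores KV cache and runtime overhead.

```python
def weight_footprint_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in decimal gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumptions (not from the thread): 122e9 total parameters,
# ~6.5625 bits per weight for a Q6_K-style quantization.
weights_gb = weight_footprint_gb(122e9, 6.5625)
headroom_gb = 128 - weights_gb  # what is left for KV cache, OS, and everything else

print(f"weights ~{weights_gb:.0f} GB, headroom ~{headroom_gb:.0f} GB of 128 GB")
```

Under those assumptions the weights come to roughly 100 GB, which is why "squeeze" is the right word: the rest has to cover context and the operating system.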

Wow, what kind of hardware do you have? Mac Studio, DGX Spark, Strix Halo? How fast is it?
Strix Halo. I'm seeing performance in line with these results[0].

I'm interested in investigating the claimed gains from the lemonade-sdk port of Apple MLX inference[1].

[0] https://kyuz0.github.io/amd-strix-halo-toolboxes/

[1] https://github.com/lemonade-sdk/lemonade/issues/1642