by zellyn 423 days ago
Weird to give MacBook Pro specs and omit RAM. Or did I miss it somehow? That's one of the most important factors.

Using a 7B model on an M2 Max also isn’t quite the most impressive way to locally run an LLM. Why not use QwQ-32 and let it give some commercial non-reasoning models a run for their money?
Exactly. You want to come close to maxing out your RAM for model+context. I've run Gemma on a 64GB M1 and it was pretty okay, although that was before the Quantization-Aware Training version released last week, so it might be even better now.
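To put rough numbers on "model+context", here is a back-of-the-envelope sketch. It assumes the weights take `n_params * bits_per_weight / 8` bytes and that the KV cache stores one key and one value vector per layer, per KV head, per token. The function names and the example config (32B parameters, 4-bit weights, 64 layers, 8 KV heads, head dimension 128, 32k context) are made up for illustration and are not taken from any particular model card. Real runtimes add overhead on top of this, so treat the result as a lower bound.

```python
GIB = 1024 ** 3  # bytes per GiB


def weight_bytes(n_params: int, bits_per_weight: int) -> int:
    """Approximate size of the quantized weights in bytes."""
    return n_params * bits_per_weight // 8


def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   context_len: int, bytes_per_value: int = 2) -> int:
    """Approximate KV cache size in bytes.

    The leading 2 accounts for storing both keys and values.
    bytes_per_value=2 assumes an fp16 cache.
    """
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_value


def total_gib(n_params: int, bits_per_weight: int, n_layers: int,
              n_kv_heads: int, head_dim: int, context_len: int) -> float:
    """Weights plus KV cache, in GiB (ignores runtime overhead)."""
    total = weight_bytes(n_params, bits_per_weight) + kv_cache_bytes(
        n_layers, n_kv_heads, head_dim, context_len)
    return total / GIB


if __name__ == "__main__":
    # Hypothetical config, for illustration only.
    estimate = total_gib(
        n_params=32_000_000_000, bits_per_weight=4,
        n_layers=64, n_kv_heads=8, head_dim=128, context_len=32_768)
    print(f"Estimated weights + KV cache: {estimate:.1f} GiB")
```

Compare the estimate against your machine's RAM, leaving headroom for the OS and whatever else is running.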
Thanks for calling that out. It was 32GB. I updated the post as well.