by greyskull 58 days ago
Thank you for all this, I'll give it a shot. Out of curiosity, are there any resources that sort of spell this out already? i.e., not requiring a comment like this to navigate.

> nothing you can run locally, on that machine anyways, is going to compare with Opus

Definitely not expecting that. I just wanted to find a setup that people are content with: a coding harness plus a model that is usable locally.

What does your setup look like? Model, harness, etc.

2 comments

Not that I'm aware of. It's kind of like building a PC or a bicycle - you're putting mostly-standardized parts together rather than starting from first principles, but there are so many permutations that you can either use a single known-good configuration or immerse yourself in forums and tinker until you can fit things together yourself. Plus both the inference engines and models are of course moving really fast.

I use Opus 4.7 in Claude Code lol, plus Zed (as a text editor, not a harness). Open-weights models that I can run are for me not useful for multi-turn ("agentic") tasks. I do use Qwen 3.6 for one-off tasks like "write a function to pretty-print this weird data structure" or "explain this config file," and Gemma 4 26B for non-coding tasks like "create a timestamped table of contents from this podcast transcript."

I asked Opus through Claude Code to set up the best local model that fits my hardware, and that worked well for me. I could run Qwen 74B or something at 0.7 tok/s on CPU with my 64GB of DDR5. Pretty cool. Useful for overnight stuff. (This actually worked; it's actually usable for asking questions.)
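
For a rough sense of what 0.7 tok/s means in practice, here is a back-of-the-envelope sketch in Python. Only the 0.7 tok/s rate comes from the comment above; the 8-hour night and the 500-token answer are assumptions picked for illustration, and real throughput will vary with prompt length and context size.

```python
def tokens_generated(tokens_per_second: float, hours: float) -> int:
    """Tokens produced at a steady generation rate over the given number of hours."""
    return round(tokens_per_second * hours * 3600)


def minutes_needed(num_tokens: int, tokens_per_second: float) -> float:
    """Minutes needed to generate num_tokens at a steady rate."""
    return num_tokens / tokens_per_second / 60


RATE = 0.7          # tok/s, the figure quoted in the comment
NIGHT_HOURS = 8     # assumption: length of an "overnight" run
ANSWER_TOKENS = 500 # assumption: size of a medium-length answer

print(f"Overnight output: ~{tokens_generated(RATE, NIGHT_HOURS)} tokens")
print(f"One {ANSWER_TOKENS}-token answer: ~{minutes_needed(ANSWER_TOKENS, RATE):.0f} minutes")
```

Under those assumptions that is about 20,000 tokens per night, or roughly 12 minutes per medium-length answer, which matches the "fine for overnight jobs and one-off questions, not for interactive agentic loops" impression.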