by ThunderSizzle 19 days ago
I added an R9700 32GB to my 10+ year old desktop that had a 980 4GB card in it, for a grand total of $1350 or so. Compared to what I was spending on GHCP, the payoff period was 33 months, but when GHCP announced their price increase, it basically became a 3-month payoff at minimum (so yes, GHCP did roughly a 10x price increase for non-parallel agentic workflows).
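For anyone checking the payoff math, here is a minimal sketch. It uses only the figures quoted above ($1350, 33 months, 3 months). The monthly spend values are back-calculated from those figures, not numbers the poster stated, and the function names are made up for illustration.

```python
def payoff_months(hardware_cost: float, monthly_spend: float) -> float:
    """Months until the hardware has cost as much as the subscription would have."""
    if monthly_spend <= 0:
        raise ValueError("monthly_spend must be positive")
    return hardware_cost / monthly_spend


def implied_monthly_spend(hardware_cost: float, months: float) -> float:
    """Monthly spend implied by a given payoff period (inverse of payoff_months)."""
    if months <= 0:
        raise ValueError("months must be positive")
    return hardware_cost / months


HARDWARE_COST = 1350.0  # approximate total quoted in the comment

# Implied (not stated) monthly spend before and after the price change.
old_spend = implied_monthly_spend(HARDWARE_COST, 33)
new_spend = implied_monthly_spend(HARDWARE_COST, 3)

# Ratio between the two, i.e. the effective price increase.
increase = new_spend / old_spend

print(f"implied spend before: ${old_spend:.2f}/month")
print(f"implied spend after:  ${new_spend:.2f}/month")
print(f"effective increase:   {increase:.1f}x")
```

Going from a 33-month to a 3-month payoff implies about an 11x jump in monthly spend, which lines up with the "10x" claim.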

I can easily run Qwen3.6 35B-A3B at Q5_K_M with a 260k+ context window and some VRAM to spare. It runs at probably 80 tps. It took me quite a while to find the

Compared to GHCP Claude Sonnet 4.5 or 4.6, I have full parity. The wall clock time is faster for agentic workflows, and rule following is about on par.

With either, doing something kind of novel or obscure takes more hand-holding than just generating a GUI or CRUD app. For example, building an actual program that performs a complicated process correctly requires quite a bit of hand-holding before the model is properly helpful.

Sure, it isn't Opus or something, but I think with the right harness, it can probably get close. I think most of the issues these days come from the harnesses being lacking.

3 comments

What is GHCP in this context? Glasgow Haskell Compiler Platform? Google Hostage Computer Program?
GitHub Copilot. It was one of the best values around in terms of cheap LLM access since each prompt was basically 4 cents (more or less), no matter how much it would do or how many tokens it used. A simple "Proceed" prompt that was telling the agent to execute a sophisticated plan could burn a lot of time without needing any direct intervention by the user, but as of June 1st, they switched to metered billing, meaning each token in/out has a cost now.
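A rough sketch of why the switch matters for long agentic runs. The 4 cents per prompt comes from the comment above. Everything else here is a hypothetical placeholder: the per-token rates are NOT GitHub's actual prices, and the token counts for a single "Proceed" run are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MeteredPrice:
    """Hypothetical metered rates in USD per million tokens (placeholders only)."""
    usd_per_million_input: float
    usd_per_million_output: float


def flat_cost(prompts: int, usd_per_prompt: float = 0.04) -> float:
    """Old model: each prompt costs the same, regardless of tokens used."""
    return prompts * usd_per_prompt


def metered_cost(input_tokens: int, output_tokens: int, price: MeteredPrice) -> float:
    """New model: every token in and out is billed."""
    total = (input_tokens * price.usd_per_million_input
             + output_tokens * price.usd_per_million_output)
    return total / 1_000_000


# Placeholder rates, assumed purely for the example.
price = MeteredPrice(usd_per_million_input=3.0, usd_per_million_output=15.0)

# One "Proceed" prompt that kicks off a long agent run (made-up token counts).
flat = flat_cost(prompts=1)
metered = metered_cost(input_tokens=2_000_000, output_tokens=100_000, price=price)

print(f"flat per-prompt billing: ${flat:.2f}")
print(f"metered billing:         ${metered:.2f}")
```

Under flat billing the cost depends only on how many prompts you send, so one prompt that triggers hours of agent work is the cheapest possible case. Under metered billing that same run is the most expensive case.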

The change was expected to come sooner or later, but it was a nice cheap road for my small hobby stuff. When they announced the price changes, I started to explore alternatives, and with the news of Qwen3.6 35B being both and having quality, I figured it was worth trying out. Self-hosting made the most sense to me, since it meant I was free from being a forever-renter.

And when you had a tool call that asked the user for the next step, you could easily run a whole day on 4c. Guess how people got $5k worth of tokens for $100 spent.
GitHub Copilot, as far as I can tell. Though I like yours.
Between that and the Arc Pro B70, those are the 32GB cards that are actually affordable and worth getting.

I’ve got both (a single R9700 and dual B70s), and they handle just about anything I throw at them; the latter shows a visible improvement when the model is well-cached.

Can you give some context on how you are getting both of those to work? I am guessing Vulkan. Did you face any pain during integration? I am planning to add an R9700 to my 5070 Ti, and my only concern is whether Vulkan would be able to do the heavy lifting.
My 980 is currently unpowered. I have not tried to integrate them yet. It's on my to-do list, but the first time they were both powered on, the system had booting issues, and I didn't want to care at the time, since the 980 was probably going to be idle 99% of the time anyway.

I'll probably try to figure that problem out in about a month. Worst case is I move it to another even older desktop to replace the 9800 GTX+ inside of that one.