by 5d8767c68926 1385 days ago
>A lot of people are undervolting their RTX GPUs because for only about a ~3% performance loss you get about 10C lower temps, which translates to far less fan noise

Bah, this is brilliant. I just upgraded a 1070 to a 3070 and am flabbergasted at how much heat it dumps into my room. One of the reasons I did not go with the 3080 was the ~100 watt lower draw.

Do you know of any good tooling to assess the impact of undervolting or is it a manual guess-and-check process?

5 comments

Trial and error. You need to dial in the right point on the voltage/clock frequency curve for your workloads, AKA "just play some games and look at the results." Just use whatever overclocking software your hardware vendor ships, and modify the default curve it has. I use MSI Afterburner and just set a flat clock frequency (plateau) at a certain voltage level to undervolt. I think for Nvidia GPUs there's a way to modify the curve with the default tooling, but third-party tools like Afterburner can also do it.

You can get great results pretty fast this way. My Mini-ITX build is about as thermally compact as possible given the parts (3080 + Ryzen 5600X, NZXT H1), and I'm pushing my PSU to the absolute limit at stock settings, so undervolting is important for safe power margins since the 3080 can reach ~360W in my testing. I think 30 minutes of tweaking got me something like an 80W power drop for only about a 10% FPS loss in Red Dead Redemption 2 @ 4k60fps; I never breach 300W now, which is within my personal safety margins, and I can run everything at native 4k.
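To put those numbers in perspective, here's a quick back-of-the-envelope sketch. It just plugs in the rough figures from this comment (~360W peak, ~80W saved, ~10% FPS lost); the function name is made up for illustration, and the inputs are ballpark figures, not measurements.

```python
def perf_per_watt_gain(stock_watts, watts_saved, fps_loss_fraction):
    """Relative FPS-per-watt after undervolting (1.0 means no change)."""
    new_watts = stock_watts - watts_saved
    relative_fps = 1.0 - fps_loss_fraction      # e.g. 0.90 of stock FPS
    relative_power = new_watts / stock_watts    # e.g. 280/360 of stock power
    return relative_fps / relative_power

# Rough figures from the comment: ~360W peak, ~80W saved, ~10% FPS lost.
gain = perf_per_watt_gain(360, 80, 0.10)
print(f"new peak draw: {360 - 80} W, FPS per watt: {gain:.2f}x stock")
```

So under those assumptions the card does roughly 16% more work per watt, which is why the trade tends to feel worth it.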

Some tools like Afterburner have an "Overclock Scanner" feature that will run benchmarks and repeatedly try to dial these settings in for you, but it really is easier to just modify the curve manually and test your specific workloads.
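If you want numbers rather than eyeballing an overlay while you test, one option is to poll `nvidia-smi` during a gaming session and summarize afterwards. This is only a sketch: it assumes an Nvidia card with `nvidia-smi` on your PATH, the helper names are made up, and some cards report `[N/A]` for power draw, which this parser does not handle.

```python
import csv
import io
import subprocess

# Field names accepted by `nvidia-smi --query-gpu`.
FIELDS = ["power.draw", "temperature.gpu", "clocks.sm"]

def parse_sample(line):
    """Turn one CSV line such as '287.41, 68, 1815' into a dict of floats."""
    values = next(csv.reader(io.StringIO(line), skipinitialspace=True))
    return {name: float(v) for name, v in zip(FIELDS, values)}

def read_gpu_sample():
    """Query GPU 0 once; requires the Nvidia driver's nvidia-smi tool."""
    out = subprocess.run(
        ["nvidia-smi", "--id=0", f"--query-gpu={','.join(FIELDS)}",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_sample(out.strip().splitlines()[0])

def summarize(samples):
    """Peak power, peak temperature, and mean SM clock over a run."""
    return {
        "peak_watts": max(s["power.draw"] for s in samples),
        "peak_temp_c": max(s["temperature.gpu"] for s in samples),
        "mean_sm_mhz": sum(s["clocks.sm"] for s in samples) / len(samples),
    }
```

Call `read_gpu_sample()` about once a second while you play, collect the results in a list, and compare `summarize(...)` between the stock curve and your undervolted one.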

I just built a Ryzen 5600G system (without a discrete video card atm) and you can set either temperature or power consumption limits in the BIOS and it will underclock itself (actually turbo boost less) until it obeys your limits.

Perhaps I'll wait with the video card until they give me the option to do the same there...
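For what it's worth, on Nvidia cards the driver's `nvidia-smi` tool already has a board power limit switch (`-pl`, in watts), and the card then boosts less to stay under it. A rough sketch of building that command is below; the helper name and the example wattage range are made up, and the real allowed range depends on the card.

```python
def power_limit_command(target_watts, min_watts, max_watts, gpu_index=0):
    """Build an nvidia-smi call that caps board power, clamped to the allowed range."""
    watts = max(min_watts, min(target_watts, max_watts))
    return ["nvidia-smi", "-i", str(gpu_index), "-pl", str(watts)]

# Example with made-up limits; the real range can be read with
# `nvidia-smi --query-gpu=power.min_limit,power.max_limit --format=csv`.
cmd = power_limit_command(target_watts=280, min_watts=100, max_watts=370)
print(" ".join(cmd))
# To apply it (needs admin/root rights):
#   import subprocess; subprocess.run(cmd, check=True)
```

A power limit is a blunter tool than editing the voltage/frequency curve, but it takes one command instead of half an hour of tweaking.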

Make sure to also cap your FPS or use Vsync. No point pumping out 100fps when you only have a 60Hz TV, etc.
This is the correct answer to tackle power draw. Use Vsync/Adaptive VSync for fixed refresh monitors, or FreeSync/G-Sync for variable refresh monitors.

For variable refresh rate monitors, it's best to use a framerate limiter as well: either in-game or in the Nvidia control panel. Set the cap at least a few fps lower than your monitor's max refresh rate. Even better, aim for a 90-100 fps cap, beyond which diminishing returns kick in and power bills continue to creep up.
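As a rough sketch of that rule of thumb: stay a few fps under the panel's max refresh, and don't bother going past roughly 100 fps. The 3 fps margin and the function name are my own assumptions; "a few fps" is as precise as the advice gets.

```python
def suggested_fps_cap(max_refresh_hz, margin_fps=3, ceiling_fps=100):
    """Cap a few fps below max refresh, but never above the ~100 fps ceiling."""
    below_refresh = max_refresh_hz - margin_fps
    return min(below_refresh, ceiling_fps)

for hz in (60, 75, 144, 240):
    print(f"{hz} Hz panel -> cap at {suggested_fps_cap(hz)} fps")
```

With these defaults a 75Hz panel ends up capped at 72 fps, while 144Hz and 240Hz panels both land on the 100 fps ceiling.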

Just use MSI Afterburner and do some tests. I also usually set up a fan curve where the fan always runs faster than default to keep the temps lower.
i use prime95 for cpus and msi kombustor for gpus. if they can run for a while without errors i keep my settings, otherwise i increase power/voltage and try again
prime95 isn't a very good test anymore. With the changeover from blend to smallfft, it doesn't test the frontend or the memory controller or any of the other parts of the CPU very well anymore: it loads the kernel into instruction cache once and then just slams the AVX units as hard as it can.

so not only does this not test the rest of the cpu at all (meaning other parts of the CPU can be unstable at those frequencies and you won't find out, because only the AVX units are being exercised), it also doesn't test frequency/power state changes at all, so you can run into situations where as soon as you close prime95 and it drops to a lower p-state, it'll crash.
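To illustrate the state-change point: a load that alternates short bursts with idle gaps gives the CPU a chance to move between power states, which a constant full-tilt load never does. This is a toy sketch, not a real stress test: it only loads one core, the names are made up, and whether the gaps actually trigger p-state transitions depends on your OS power settings. It hashes the same buffer over and over, so any differing result would point to a computation error.

```python
import hashlib
import time

def busy_work(seconds):
    """Hash a fixed buffer repeatedly for `seconds`; return the distinct digests seen."""
    data = b"x" * 65536
    digests = set()
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        digests.add(hashlib.sha256(data).hexdigest())
    return digests

def burst_idle_cycles(cycles, busy_s=2.0, idle_s=2.0):
    """Alternate load and idle; True if every burst produced the same digest."""
    seen = set()
    for _ in range(cycles):
        seen |= busy_work(busy_s)   # load phase: clocks ramp up
        time.sleep(idle_s)          # idle phase: clocks can drop again
    return len(seen) == 1

if __name__ == "__main__":
    print("consistent results:", burst_idle_cycles(cycles=5))
```

In practice, just using the machine normally (games, compiles, idling at the desktop) covers the same transitions, which is why testing your real workloads matters.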

gpus have run into similar things with furmark and kombustor and other power-virus tests... actually the GPUs themselves will detect when those are running and throttle down, so the tests no longer even do the thing they're supposed to. but gpus also change power/frequency states under real-world workloads, just like CPUs, and they don't under furmark/kombustor. this actually caused a crisis at the ampere launch... all the testing had been done with a "pre-release bios" that only allowed these sorts of power/thermal testing, and it turned out that while the chips might be stable at max p-state, they weren't stable when they shifted back to a lower p-state, or from a lower p-state back to maximum. That was the whole "POSCAP vs MLCC" thing.

prime95 and furmark were very very popular 10 years ago but that's where they belong, they don't do the job anymore these days.