In this case, the 2-slot RTX 6000 consumes 300 W whereas the "nerfed" 3.5-slot 4090 can draw 450 W.
So I don't think the nerfing here was to lower power consumption. It's just market segmentation to extract maximum $$$$ from ML workloads.
NVIDIA have always been pretty open about this stuff - they have EULA terms saying the GeForce drivers can't be used in data centres, software features like virtual GPUs that are only available on certain cards, difficult cooling that makes it hard to put several cards into the same case, awkward product lifecycles, contracts with server builders not to put gaming GPUs into workstations or servers, removal of NVLink, and so on.
I didn’t say they don’t do artificial segmentation. I just noted that, in this case, it might have an upside for the user. There might also be some binning involved: maybe the parts failed as A300 parts.