| I also recently decided to buy a datacenter GPU and slap it into a system. Some notes from my experience that the author doesn't mention in their article: Decommissioned NVIDIA V100s and AMD MI50s are fairly cheap, $200 for 16gb and $400-500 for 32gb, for local experimentation. They are also very old. There's an enthusiast community keeping these two cards alive and working with current platforms and models. Nitpick, but the V100 doesn't support bfloat16. The performance hit is not a big deal if you're fiddling with local models, but the card is on it's way out in terms of hardware features. The MI50 does support bf16, but not the current edition of AMD ROCm. Vulkan support is good and the MI50 works with most major platforms (llama.cpp, vllm, etc.), but it's not without some pain points like manual recompilation. Fortunately the open source community has already paid most of your way. The cooling requirements for these cards cannot be understated. A consumer grade GPU may throttle if in a small case without additional fans, but if given the same treatment a datacenter GPU will overheat itself idling. You will need to buy, at least, a bunch of decent 120mm fans to prevent this or invest in some water cooling. I ultimately went with an AMD MI100 32GB ($950). I'm an AMD fan, current ROCm editions support it, and it was low-fuss to get things working. I'm debating getting a second so I can try out bigger models like qwen3-coder-next. |
There's a cottage industry of 3D-printed fan-shrouds for data center GPUs - 120mm are often the sweet spot for quietness and practicality. The shoud smugly fits the GPUs intake, so it gets all the airflow from the attached fan(s), whose speed curves can be attached to GPU temperature.