I inherited a few R420s from work, and the coolest thing about them is the iDRAC. I'm not an SRE or anything of the sort, so I don't get to see stuff like that much, but the utility of the iDRAC is fantastic.

The machines each have 192GB of RAM, so I thought I'd set them up as LLM inference machines. I figured that with that much RAM, I could load just about any model.

Then I discovered how slow the CPUs on these older machines are. They were utterly slow. I have a machine I bought from Costco a few years ago that was under $1k and came with an RTX 3060 with 12GB of GPU RAM. That machine can run 13B models at around 20+ tokens per second (I actually don't know the exact figure: I stream the text and cap it at 9 tokens per second so I can actually read it).

The R420? Its tokens per second were in the 0.005 to 0.01 range.
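
For a sense of scale, here is a rough back-of-the-envelope sketch of what those rates mean in wall-clock time. The 200-token reply length is an assumed figure for illustration only, and `seconds_to_generate` is a hypothetical helper; the rates are the approximate ones quoted above.

```python
def seconds_to_generate(num_tokens: int, tokens_per_second: float) -> float:
    """Seconds needed to emit num_tokens at a steady generation rate."""
    return num_tokens / tokens_per_second


# Assumed reply length for illustration; not a figure from the post.
REPLY_TOKENS = 200

# Approximate rates quoted in the post.
rates = [
    ("RTX 3060 desktop (~20 tok/s)", 20.0),
    ("R420, upper end (0.01 tok/s)", 0.01),
    ("R420, lower end (0.005 tok/s)", 0.005),
]

for label, rate in rates:
    secs = seconds_to_generate(REPLY_TOKENS, rate)
    print(f"{label}: {secs:,.0f} s ({secs / 3600:.2f} h)")
```

Under those assumptions, a reply that takes seconds on the GPU machine would take several hours on the R420.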

So, yeah, not a good CPU for that sort of task. For other stuff, sure. I thought I'd set up a small file server with one instead, but the fans are so jet-engine loud that it's intolerable to have in any part of the house, even when managing fan speeds with software.