by nabilt 881 days ago
Thanks, good to know. I may need to request a 3A 1U in that case.

I didn't know iDRAC could measure real-time power usage. Pretty amazing.


I inherited a few R420s from work, and the coolest thing about them is the iDRAC. I'm not an SRE or anything of the sort, so I don't get to see stuff like that much, but the utility of the iDRAC is fantastic.

The machines each have 192GB of RAM, so I thought I'd set them up as LLM inference machines. I figured that with that much RAM, I could load just about any model.

Then I discovered how slow the CPUs in these older machines are. It was utterly slow. I have a machine I bought from Costco a few years ago that was under $1k and came with an RTX 3060 with 12GB of GPU RAM. That machine can run at 20+ tokens per second on 13B models (I actually don't know exactly - I stream the text, and cap it at 9 tokens per second so I can actually read it).

The R420? Its tokens per second were in the 0.005 to 0.01 range.

So, yeah, not a good CPU for that sort of task. For other stuff, sure. I thought I'd set up a small file server with one instead, but the fans are so jet-engine loud that it's intolerable to have in any part of the house, even when managing fan speeds with software.

CPUs are slow compared to GPUs for lots of tasks. Comparing a 10-year-old CPU to a 4-year-old GPU only makes that comparison "more offensive." That said, pair your R420 with something like an RTX A2000 and you'll have a much fairer fight.
Does an RTX A2000 even fit? I'd consider trying to find a pair if so. It's hard looking at the two machines I have with 192GB each doing nothing.