by jsn 5347 days ago

From http://www.theregister.co.uk/2011/11/01/hp_redstone_calxeda_... :

The sales pitch for the Redstone systems, says Santeler, is that a half rack of Redstone machines and their external switches implementing 1,600 server nodes has 41 cables, burns 9.9 kilowatts, and costs $1.2m.

A more traditional x86-based cluster doing the same amount of work would only require 400 two-socket Xeon servers, but it would take up 10 racks of space, have 1,600 cables, burn 91 kilowatts, and cost $3.3m.

Hmm, let's see. That's about 7-8 grand per Xeon server, something like an HP ProLiant DL360R07 (2 x 6-core Xeons at 2.66GHz). That's 3 times as many cores as Redstone, each clocked 2.66 times higher and doing more instructions per clock tick, too. And that's without hyperthreading.

Am I missing something big, or is the Redstone solution neither cost-effective nor energy-efficient?
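
For reference, a rough Python sketch of the per-node arithmetic behind the quoted figures. It assumes each quoted total is spread evenly across the nodes (the totals may also cover switches and racks, so the per-server price is an upper bound); the names `redstone`, `xeon` and `per_node` are illustrative only.

```python
# Figures as quoted from The Register article above.
redstone = {"nodes": 1600, "cables": 41, "kw": 9.9, "cost_usd": 1_200_000}
xeon = {"nodes": 400, "cables": 1600, "kw": 91.0, "cost_usd": 3_300_000}


def per_node(cluster):
    """Divide the quoted totals evenly across nodes (a simplifying assumption)."""
    return {
        "usd_per_node": cluster["cost_usd"] / cluster["nodes"],
        "watts_per_node": cluster["kw"] * 1000 / cluster["nodes"],
    }


redstone_unit = per_node(redstone)
xeon_unit = per_node(xeon)

print(f"Redstone: ${redstone_unit['usd_per_node']:.0f}, {redstone_unit['watts_per_node']:.1f} W per node")
print(f"Xeon:     ${xeon_unit['usd_per_node']:.0f}, {xeon_unit['watts_per_node']:.1f} W per server")
```

On these numbers the Xeon cluster works out to $8,250 and 227.5 W per server, and the Redstone half rack to $750 and roughly 6.2 W per node. This says nothing about how much work each node does, which is the point the question turns on.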

3 comments

tmurray 5347 days ago

You assume the application is compute limited and that the extra performance on the Xeon translates into extra performance on a given application. That's probably not a good assumption for this kind of workload.

jsn 5346 days ago

Why, for embarrassingly parallel workloads (like the ones they mention) it's a totally reasonable assumption. And for something not so parallel, a gazillion ARM nodes is all but useless.

easp 5346 days ago

The article mentions Hadoop, big data crunching, web serving and web caching. They may or may not be embarrassingly parallel, but that doesn't mean any of them are typically compute bound.

Look, today's multichip, multicore servers tend to be unbalanced for a lot of workloads. Their massive compute performance often burns power waiting for main memory, disk or network.

miratrix 5346 days ago

You're going to be I/O bound (network or disk), memory bound, or compute bound. It's hard to imagine the Redstone systems besting Xeon based servers in any of the three.

tmurray 5346 days ago

It depends entirely on where your bottlenecks are. If the bottleneck is entirely within your node, then this isn't going to be compelling. If you're doing something that's very light on the resources within your node (serving static content, etc) and your bottleneck is some other system somewhere else, then these sorts of machines could be compelling purely from a space/power POV.

jsn 5346 days ago

If your nodes are not bound on some local resource, you might as well just run them in virtualization containers on Xeon. The setup will be even more flexible than with (less powerful) ARMs.

ricardobeat 5346 days ago

But not nearly as space/power-efficient.

easp 5346 days ago

If your workload runs on one or two Xeon servers, it probably isn't worth considering something like this. If your workload runs on racks of Xeon servers, it might be.

Then the question is, which hardware delivers the right balance of CPU, memory and IO bandwidth for the lowest capital and operating costs.

Also for what it is worth, each card has 60Gbps of general IO bandwidth, and another 48Gbps of SATA disk bandwidth.

ricardobeat 5346 days ago

Even if you triple the number of Redstone machines, you'll still use just ~30% of the energy and 7.5% of the cabling.

And every 4 ARM cores have their own memory channels and I/O ports, vs. every 6-12 cores on the Xeon [corrected] (point being that CPU speed is not the only variable here).
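
The tripling claim can be checked with a short Python sketch. It uses only the power and cable counts quoted from the article, and assumes both scale linearly with the number of Redstone half racks, which is a simplification; the variable names are illustrative.

```python
# Quoted figures: one Redstone half rack vs. the 400-server Xeon cluster.
redstone_kw, redstone_cables = 9.9, 41
xeon_kw, xeon_cables = 91.0, 1600

scale = 3  # "triple the number of Redstone machines"

# Assumes power and cabling grow linearly with the number of half racks.
energy_fraction = scale * redstone_kw / xeon_kw
cable_fraction = scale * redstone_cables / xeon_cables

print(f"energy: {energy_fraction:.1%} of the Xeon cluster")
print(f"cables: {cable_fraction:.1%} of the Xeon cluster")
```

Under that assumption, three half racks draw 29.7 kW against 91 kW (about 33%) and need 123 cables against 1,600 (about 7.7%), close to the rounded figures given here.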

wmf 5346 days ago

The Calxeda chip is quad-core, so there's still sharing.

ricardobeat 5346 days ago

My bad. The tray picture shows 36 boards; I didn't pay much attention and thought those were 72 single-core nodes.

wmf 5346 days ago

By my calculations the Redstone config has 6400 cores and the traditional one has 4800 cores. But discussing such vague claims is pretty pointless anyway.
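
A minimal Python sketch of how those core counts can be reached. It assumes every one of the 1,600 Redstone nodes is a quad-core Calxeda chip and every Xeon server has two 6-core sockets, as stated elsewhere in this thread; the variable names are illustrative.

```python
redstone_nodes = 1600
xeon_servers = 400

cores_per_calxeda_node = 4   # assumption: quad-core chip, one chip per node
sockets_per_xeon_server = 2
cores_per_xeon_socket = 6    # assumption: 6-core Xeons, as in the original post

redstone_cores = redstone_nodes * cores_per_calxeda_node
xeon_cores = xeon_servers * sockets_per_xeon_server * cores_per_xeon_socket

# The "3 times as many cores" figure in the original post follows only if
# each Redstone node is counted as a single core.
ratio_if_single_core_nodes = xeon_cores / redstone_nodes

print(redstone_cores, xeon_cores, ratio_if_single_core_nodes)
```

The two readings differ by the factor of four between counting nodes and counting cores: 6,400 vs. 4,800 cores on one reading, 1,600 vs. 4,800 on the other.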

jsn 5346 days ago

The original Calxeda reference design from last year was a 2U rack-mounted chassis that crammed 120 processors (and hence server nodes)

also:

HP can cram three rows of these [4 CPU --jsn] ARM boards, with six per row, for a total of 72 server nodes

From that I conclude that in their calculations 1 CPU == 1 server.

tmurray 5346 days ago

Each CPU is a separate node in this configuration--separate DRAM, IO, etc.
