by microtonal 5079 days ago
But the shiny object that has my attention at present consists of low-voltage ARM-type chips running on tiny inexpensive systems that can be stacked together to do all kinds of interesting things for a fraction of the power my Intel Xeon uses

This sentiment is repeated very often, but has anyone actually done the math (for the case where you do temporarily need a lot of processing power)? E.g., the following post estimates the power use of a Raspberry Pi at around 2W:

http://www.raspberrypi.org/phpBB3/viewtopic.php?f=2&t=60...

A recent Xeon or Core i7 is many times faster, and has the advantage of providing shared-memory parallelism (as opposed to a cluster of Pis, where you have to distribute work over a 100 Mbit network).

Also, if he wants to save power, he shouldn't use a Xeon. Intel Core mobile CPUs draw a relatively small amount of power as well. E.g., last time I measured my Mac Mini, it used 12W during normal use. And it's actually a usable desktop machine, in contrast to the Raspberry Pi.
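To make "doing the math" concrete, here is a rough sketch of the break-even point using only the two figures quoted above (about 2W for the Pi, 12W for the Mac Mini). Energy per task is power times run time, so the higher-power machine comes out ahead whenever it finishes the job faster than the ratio of the two power draws. The 600-second task length is a made-up number for illustration, and the 12W figure is for normal use rather than full load, so treat the result as a ballpark.

```python
# Rough break-even sketch: energy per task = power (W) * run time (s).
PI_WATTS = 2.0     # Raspberry Pi estimate from the linked post
MINI_WATTS = 12.0  # Mac Mini measurement during normal use (not full load)


def energy_joules(watts, seconds):
    """Energy used by a machine drawing `watts` for `seconds`."""
    return watts * seconds


def break_even_speedup(slow_watts, fast_watts):
    """How many times faster the higher-power machine must be
    to spend the same energy on one task as the lower-power one."""
    return fast_watts / slow_watts


if __name__ == "__main__":
    pi_seconds = 600.0  # hypothetical task length on the Pi
    speedup = break_even_speedup(PI_WATTS, MINI_WATTS)
    mini_seconds = pi_seconds / speedup
    print("break-even speedup:", speedup)
    print("Pi energy (J):", energy_joules(PI_WATTS, pi_seconds))
    print("Mini energy (J):", energy_joules(MINI_WATTS, mini_seconds))
```

With these numbers the Mac Mini only has to be 6 times faster than one Pi to use the same energy per task; anything beyond that and it wins on energy as well as on time.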

1 comment

For power usage, a model like the one used by Parallax's Propeller microcontroller might be interesting. The Propeller has 8 cores. The entire chip can be clocked up and down, like a modern x86 chip. But it's also possible to put cores to sleep individually, which reduces power consumption even further.

I'm not sure how good an example the Raspberry Pi is, simply because it includes features like a GPU. A computer probably only needs one of most of those things, so it would make more sense to look at the power requirements of a bare ARM chip than of a complete computer built around it.

-- change of subject --

What I've been running into (hard and repeatedly over the last few days) is that parallelizing workstation-end tasks without killing performance in the process is hard. Shared-memory parallelism in particular is painful, because with too many cooks in the kitchen they end up spending more time trying not to pour boiling water on each other than making food. For example, every time you hit a memory barrier, the cores working against that memory need to stop and consult the L3 cache or, worse yet, main memory. That introduces an enormous stall (of course, if you're in a situation where non-trivial parallelizing is worth the effort, any stall feels enormous), so it needs to be avoided as much as possible... which tends not to be easy if you're doing shared-memory parallelism. If it were trivial, you'd probably have been able to get away with shared-nothing.
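A minimal sketch of the two shapes being contrasted here, with made-up function names. In the first, every worker touches the same counter and has to take a lock on each update; in the second, each worker accumulates privately and the results are combined once at the end. This only illustrates the structure: CPython's global interpreter lock means neither version gets a real multi-core speedup, so don't read timing conclusions into it.

```python
import threading


def count_shared(n_workers, n_items):
    """Shared-memory style: all workers update one counter under one lock,
    so they synchronize on every single increment."""
    total = 0
    lock = threading.Lock()

    def worker():
        nonlocal total
        for _ in range(n_items):
            with lock:  # every increment is a synchronization point
                total += 1

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total


def count_partitioned(n_workers, n_items):
    """Shared-nothing style: each worker counts privately and writes
    its result to its own slot exactly once."""
    partials = [0] * n_workers

    def worker(index):
        local = 0
        for _ in range(n_items):
            local += 1  # no coordination with other workers
        partials[index] = local

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(partials)  # single combine step after all workers finish
```

Both return the same total. The difference is how often the workers have to coordinate: once per item versus once per worker.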

Now the "lots of tiny cores" approach gets more interesting when you can get away with a more shared-nothing approach like what the article suggests. But it comes at a big cost, which is that you're going to take a massive hit on the kind of performance you can get on tasks for which parallelism is infeasible, or for which you don't have any programmers who are good enough at parallelization to do it (effectively the same thing). In those situations, you're going to be stuck watching one lone core play "Little Engine that Could" while all the other cores are dozing off like the lazy bums they are.
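One standard way to put numbers on that trade-off is Amdahl's law, which I'm bringing in here as a framing of my own. The sketch below compares one fast core against many slow ones for a task where some fraction of the work can't be parallelized. The core count (16) and relative speed (0.1) are hypothetical values picked for illustration, not measurements of any real chip.

```python
def run_time(serial_fraction, cores, core_speed):
    """Relative time to finish one unit of work (Amdahl's law).

    serial_fraction: share of the work that cannot be parallelized (0..1)
    cores:           number of cores available
    core_speed:      speed of each core relative to the baseline core (1.0)
    """
    parallel_fraction = 1.0 - serial_fraction
    return (serial_fraction + parallel_fraction / cores) / core_speed


if __name__ == "__main__":
    # Hypothetical comparison: 1 baseline core vs 16 cores at 1/10 the speed.
    for serial in (0.0, 0.1, 0.5, 1.0):
        big = run_time(serial, cores=1, core_speed=1.0)
        small = run_time(serial, cores=16, core_speed=0.1)
        print(f"serial={serial:.1f}  one fast core={big:.3f}  16 slow cores={small:.3f}")
```

Under these assumptions the pile of small cores only wins when the work is almost perfectly parallel. Once a noticeable share of the task is serial, the single fast core finishes first.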

Meanwhile it solves a problem that I'm not convinced really exists. Time-multiplexing relatively beefy CPUs is pretty much a solved problem. Less so if you need real-time, but for everyday use there's really not much need to segregate processes to different cores when pre-emptive multitasking has been around on consumer systems for decades.