HN Mirror

by ChuckMcM 4409 days ago

Interesting, I kind of expected AMD to be first out the gate with one.

From: http://www.zdnet.com/applied-micro-canonical-claim-the-first...

"The X-Gene is an ARMv8 64-bit Server-on-a-Chip package running at up to 2.4GHz. It combines 10/40 Gigabit mixed signal I/O with what AMCC calls an enterprise-class memory subsystem. Compared to x86 architectures, AMCC claims that it delivers four-times the processor density while using less than 50 percent of the power and delivering comparable-to-better overall performance."

The picture shows multiple cores, but not how many. What struck me, though, is that they are pitching "density and less power". Presumably they mean you can put twice as many of these servers in the same power footprint that x86 machines currently occupy and get 8x (4x * 2) the computing power.
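
A rough sketch of that arithmetic, assuming the vendor's 4x density and 50 percent power claims compose multiplicatively within a fixed power budget (the function and parameter names are illustrative, not from AMCC):

```python
def relative_compute(density_gain: float, power_fraction: float) -> float:
    """Compute available in a fixed power budget, relative to x86 = 1.0.

    density_gain:   claimed processor density versus x86 (4.0 in the quote)
    power_fraction: claimed power draw versus x86 (0.5 in the quote)
    """
    # Half the power per server means twice as many servers per power budget.
    servers_per_power_budget = 1.0 / power_fraction
    return density_gain * servers_per_power_budget


print(relative_compute(4.0, 0.5))  # the 4x * 2 figure from the comment
```

This only restates the marketing numbers; it says nothing about whether per-core performance is actually comparable.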

In case you are wondering, that is a 'super computer' pitch. It tickles the pain points of these arrays of super computers, but sadly it does not hit the 'web services' pain points. I'd love it if they said 1TB of ECC-protected RAM, dual 10G Ethernet, and 32 full-speed independent 6Gb/s SATA channels on each server unit. That would help me make a more responsive web infrastructure.

5 comments

reitzensteinm 4409 days ago

These comparisons (from the quote) often come up with ARM vs x86, but they're always comparing apples and oranges. Yes, you can fit an ARM core with 1/4 the performance in, say, 1/16 the power envelope, but you can do that with x86 cores as well.

The cores we've got are still heavily optimized for single-thread performance, because that's what the market demands. Even embarrassingly parallel domains like web serving pay for slower cores with increased latency, and Amdahl's law is a problem for most real-world programs. The penalty paid for this is high: diminishing returns from caches and complex architectures, and (roughly quadratic) power increases from higher clock speeds.
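
To make the Amdahl's law point concrete, here is a minimal sketch. The 95 percent parallel fraction is an assumption chosen for illustration, not a measurement of any real workload:

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup on `cores` cores when only `parallel_fraction`
    of the work can be spread across them (Amdahl's law)."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Assumed workload: 95% parallel, 5% serial.
for cores in (4, 72, 288):
    print(cores, round(amdahl_speedup(0.95, cores), 1))
```

With 5 percent serial work the speedup can never exceed 20x no matter how many cores are added, which is why single-thread performance still matters.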

Ignoring the obvious ARM versus Atom low-end comparison, Intel is readying Knights Landing, their 2015 Xeon Phi product. It's going to include 72 Atom cores, with 288 threads, and be able to be socketed in standard Xeon motherboards.

That's 1152 Atom threads on one commodity(ish) server (and the first time I've ever used a calculator to determine the thread count of a server), and it looks a hell of a lot like the picture being painted by many ARM vendors: many lightweight cores, great power efficiency and peak performance, and most likely not well suited to latency-sensitive tasks.

I don't think it's about ARM versus x86 at all; it's more about a new strategy for computation. Intel can't transition its traditional x86 processors over to the new model, so they're starting to make Atom a first-class Xeon product (both with low-power standard Atom chips already released, and the Xeon Phi).

The market will shift the other way too - we're going to see ARM processors from multiple vendors that rival at least AMD's x86 offerings in terms of performance per core. And power consumption, too.

After that, there'll be products on both ends of the spectrum from both architectures, and then maybe we can start to move past the false dichotomy of ARM having efficiency and x86 having performance.

Quequau 4409 days ago

I'm not sure it's accurate to compare the Knights Landing Xeon Phi with 64-bit ARM in this way. The focus on vector compute capability makes the Phi very different from just a bunch of Atom cores on one die.

gonzo 4409 days ago

Knights Landing will be built using up to 72 Airmont (Atom) cores with four threads per core.

Not 1152 threads, 288. 4 sockets -> 1152 threads.

reitzensteinm 4409 days ago

I think you may have misread; my 1152 figure was per server, not per socket.
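
For anyone following along, the two figures in this subthread reconcile as below. The core and thread counts come from the comments above; the four-socket server is the configuration gonzo's reply assumes:

```python
CORES_PER_SOCKET = 72     # Knights Landing cores, per the comments above
THREADS_PER_CORE = 4      # four hardware threads per core
SOCKETS_PER_SERVER = 4    # assumed four-socket server, as in the reply

threads_per_socket = CORES_PER_SOCKET * THREADS_PER_CORE
threads_per_server = threads_per_socket * SOCKETS_PER_SERVER

print(threads_per_socket)  # 288
print(threads_per_server)  # 1152
```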

justizin 4409 days ago

That's not necessarily true. You don't need 10G Ethernet to most nodes in practice, and you don't typically need dual unless you are trying to run active/passive. Smaller-footprint nodes like this often are, themselves, the failure unit.

Most workloads don't need much storage except on the DB end. App servers, and especially things like memcache, are already run on large non-x86 tiers. Facebook, notably, uses Tilera machines for memcache, last I heard.

A TB of RAM, sure, but it's a bit greedy to say nothing less than that is of much use, when 128GB of RAM in a single machine is a relatively recent advancement.

Also note: these are development kits. This is presumably to begin targeting ARM and make sure your code runs reliably. Anything you can do with a TB of RAM and 32 full-speed SATA channels you can develop on 4GB of RAM with one drive, as Every_Linux_Box_Ever has proven.

ChuckMcM 4408 days ago

I don't disagree, but will share a bit of my reasoning with you.

Two network ports. This takes the top of rack switch out of the picture in terms of failure zones. If you're using 64 port switches, having one drop dead is annoying in that it takes out a disproportionate share of the other resource.

Agreed that they don't have to be 10G; 1G minimum. There's a funny story about the product guys at BigCorp saying 100Mb was OK, and the platform-networking folks arguing for gigabit. Needless to say, the improvement when moving to gigabit was much better than 10x, because the product guys were not thinking clearly about cross-traffic loads.

Storage distributed to the cloud is much, much more effective than storage in a pod. Amazon is kind of figuring this out; Google figured it out about 10 years ago. Once you get to that point you realize you can push computation into the storage, and that not only gives you resiliency in the face of module failure, it makes recovery from disaster faster (something the Joyent people might have a better appreciation for at this point).

32 full-speed SATA ports give you 3,200 magnetic IOPS and 25,000 silicon IOPS. The things you cannot do on one drive, with at best 250 IOPS and insufficient cache space, are legion.
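
A back-of-the-envelope sketch of those numbers. The per-port rates are simply what the totals above imply when divided across 32 ports; they are not measured drive specs:

```python
SATA_PORTS = 32
MAGNETIC_IOPS_TOTAL = 3_200      # aggregate figure from the comment
SILICON_IOPS_TOTAL = 25_000      # aggregate figure from the comment
SINGLE_DRIVE_BEST_CASE = 250     # "at best" figure for one drive


def per_port(total_iops: float, ports: int = SATA_PORTS) -> float:
    """IOPS each port must sustain to reach the aggregate total."""
    return total_iops / ports


print(per_port(MAGNETIC_IOPS_TOTAL))                 # implied per spinning disk
print(per_port(SILICON_IOPS_TOTAL))                  # implied per SSD
print(MAGNETIC_IOPS_TOTAL / SINGLE_DRIVE_BEST_CASE)  # gain over one good drive
```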

sliverstorm 4409 days ago

AMD is on the way:

http://techreport.com/news/26419/amd-demos-seattle-its-first...

From the sound of things the X-Gene will be available for general retail sooner, but they are both coming soon.

drudru11 4408 days ago

I think they would do just fine for certain roles in a large scale internet service. I don't think you need disks on every node.

dragontamer 4409 days ago

AMD can't be the "first one out the gate" when the Boston Viridis by Calxeda came out years ago.

http://www.boston.co.uk/solutions/viridis/default.aspx

AMD, however, with its partnership with SeaMicro, is going to be a force to be reckoned with. AMD has experience in the datacenter, while all these other ARM companies don't.

trsohmers 4409 days ago

But that was 32-bit ARMv7 Cortex-A9 based, shipped in 2013 with 2011/2012 specs, and cost twice as much as a comparable Intel-based system.

And that is why Calxeda is now dead.

Also: AMD doesn't just have a partnership with SeaMicro, they bought them back in 2012.

dragontamer 4407 days ago

I know. My point is that the title of "First ARM Server" has already been taken. And frankly, "First" is a dubious advantage.

For ARM systems to be competitive, they must deliver power, performance, and cost-efficiency beyond Intel Xeons (including the new Intel Atom line of Xeons). Otherwise, x86 dominance in the datacenter will continue.

These ARM servers are interesting and all... but honestly, a Dell PowerEdge seems to deliver much more at a lower cost. Brand-new PowerEdge rack servers are as low as $700 (R220), and PowerEdge Xeons start at $1000.

So what is a fair price for an X-Gene? If it's anything more than $400, I'm thinking the old Intel solution will be superior. (Even a Celeron will outperform an ARM.)
