by iam 4760 days ago
Can someone please shed some light on this relatively recent APU race? I understand it seems like a good idea for OpenCL, but when they increase their APU performance while actually stagnating their CPU performance I am just completely lost (as is the case with the i7-4770K).

What is the market for desktop CPUs where the desktop buyers don't also have dedicated video cards, which can do a job almost an order of magnitude better than the APU?

4 comments

>What is the market for desktop CPUs where the desktop buyers don't also have dedicated video cards, which can do a job almost an order of magnitude better than the APU?

Small form factors, like the iMac or Mac mini. Apple is pretty influential in Intel's roadmap nowadays. See also how the next-gen mobile chipsets are getting more powerful onboard GPUs to drive Retina displays.

The NUC (Next Unit of Computing) buzz is cool for robotics too.
I hadn't seen that. There are lots of research-oriented robotics projects that use Apple Mac Minis for compute power. You just bolt on a Mini and you have a standard environment to run computer vision, learning, or advanced path planning, communicating with the robot's own computers by UDP, for example. The NUC could be a nice intermediate point between the Mini and full integration with the robot's own computers.
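As a rough illustration of the UDP link described above, here is a minimal sketch using only Python's standard `socket` and `json` modules. The helper names (`make_receiver`, `send_command`, `receive_command`) and the JSON message layout are assumptions made for the example, not anything from a particular robot stack; a real setup would also have to deal with lost or reordered datagrams.

```python
import json
import socket


def make_receiver(host="127.0.0.1", port=0):
    """Open a UDP socket on the robot side (port 0 lets the OS pick a free port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    sock.settimeout(2.0)  # do not block forever if nothing arrives
    return sock


def send_command(addr, command):
    """Send one command dict as a single JSON datagram from the compute box."""
    payload = json.dumps(command).encode("utf-8")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, addr)


def receive_command(sock):
    """Read one datagram and decode it back into a dict."""
    data, _sender = sock.recvfrom(4096)
    return json.loads(data.decode("utf-8"))
```

On the compute box you would call `send_command((robot_ip, robot_port), {...})` after each planning step, while the robot's own computer loops on `receive_command`.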
The NUC is nice. I measure mine at less than 10 watts for "smash the CPU" tasks. It has a small form factor and is a good alternative to ridiculously expensive options like car PCs. I found it's also voltage tolerant: it's rated at 19 V, but you can run it at much less (down to 15 V).

EDIT: it does go up to nearly 20 watts, depending on the setup, especially if you have HDMI plugged in. Technically I think you can get more out of a NUC than a Mini in terms of raw MIPS. They're PCs, not toys like the RPi.

Thanks for the info. For the robotics use case I mentioned, having a high-performance system is important because the algorithms are demanding, so an RPi would not be very good.
GPUs yield a higher core density and are more efficient in certain situations. This appears to be where the arms race is in HPC these days. Every CPU maker is trying to be the first to cross the finish line with a general-purpose GPU solution (with a traditional CPU strapped to its back to run the OS). Rumor has it that NVIDIA will have an ARM platform soon. I'm sure Intel and AMD are afraid of this. They want to have something competitive in the pipeline before this happens.

The way the market is going these days, consumer devices will soon almost all be based around low-power, small-footprint solutions (e.g. Atom, mobile, ARM). Heck, it probably won't be too long before we start seeing some serious SoC solutions (what could a Raspberry Pi-type device do in 2 years with a 4- or 6-core ARM at its core?). So that more or less kills margins in the consumer market. Or, well, at least it becomes a race to the bottom. What's left is enterprise and "the cloud", where efficiency and IOPS still rule. For these companies, CPU speed isn't a bottleneck; it's mostly core and I/O density. They simply want to run more jobs in a smaller space, not necessarily run the jobs faster. So engineering teams are starting to look very closely at GPUs as a means to get a huge improvement in core density for certain situations, at the cost of having to rethink some of the software stack. Putting the GPU on the same die as the CPU makes a lot of sense from this perspective.

It'll be interesting to watch. If NVIDIA does have an ARM platform in the works, they may win this race, though Intel does have a huge advantage in being able to run multiple foundries at once. Part of me wants to believe that Intel is capable of doing much more with their APU solutions, but is simply waiting until the future of the market is clearer. Once that happens, they may be able to beat everyone else simply by timing the market on newer designs (like how Intel beat AMD after screwing up with Itanium and the P4).

Note: Having a low-power SoC-type solution is another side effect of all this, but I don't think that's the primary motivator for Intel's efforts here. Maybe for the Atoms, but not this chip.

NVIDIA's ARM platform isn't a secret. They announced Project Denver years ago. Their latest roadmap has it scheduled for 2015. The only unknown is how wide a range of power+performance they will be targeting. Will they start by going after phones, tablets, workstations, servers, HPC?
We should be leveraging the APU compute engine in OpenGL and OpenCL tasks to get more performance out of the system (maybe not GL, since it took a decade for ATI/AMD and NVIDIA to properly support just having 2 of the exact same card work in parallel).

In practice, with the advent of compute shaders and OpenCL, there is very little intense serial work that can't be done in parallel. My workflow nowadays is (using Python as an example):

Slow? (assuming we know why it is slow and it isn't just poor algorithmic complexity, such as an exponential runtime where a quadratic one would do) -> Put it in C++. Still slow? -> Parallelize into tasks and put them in a work thread pool (assuming there is a lot of this kind of work happening; otherwise just have numcores / X threads do the work). Still slow? -> Port it over to OpenCL (or, if memory copies are unnecessary, OpenGL compute shaders) and keep the old implementation for backwards compatibility with systems lacking them.
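A minimal sketch of the middle step of that workflow (split the work into tasks, hand them to a pool, keep the serial version around as a fallback), using Python's standard `concurrent.futures`. The names `score_chunk`, `split`, `run_serial`, and `run_pooled` are hypothetical and the workload is a stand-in. Note the assumption baked in here: for CPU-bound pure-Python code the GIL limits what threads can do, so in practice the hot loop would be the C++ extension from the previous step (releasing the GIL), or the pool would be a `ProcessPoolExecutor`.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def score_chunk(chunk):
    """Stand-in for the hot loop (hypothetical workload: sum of squares)."""
    return sum(x * x for x in chunk)


def split(data, parts):
    """Split data into at most `parts` contiguous chunks."""
    size = max(1, -(-len(data) // parts))  # ceiling division, never zero
    return [data[i:i + size] for i in range(0, len(data), size)]


def run_serial(data):
    """Original single-threaded path, kept as the fallback implementation."""
    return score_chunk(data)


def run_pooled(data, workers=None):
    """Same result as run_serial, computed as one task per chunk on a thread pool."""
    workers = workers or os.cpu_count() or 1
    chunks = split(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(score_chunk, chunks))
```

Keeping `run_serial` next to `run_pooled` mirrors the "keep the old implementation" advice: the two can be checked against each other, and the serial path still works where the faster one is unavailable.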

It's not really a desktop processor; it's a laptop processor that's being reused for the desktop market. Gamers might prefer a GT0 configuration but I guess the market isn't large enough to justify it.