by aidenn0 1616 days ago
> A classic example is Itanium: it is only a historical footnote today, but Itanium’s explicit parallelism and focus on scalability once made it look like the future of CPUs. The problem was never the hardware itself—it was difficult compilation and backward compatibility with the x86 software ecosystem that doomed Itanium.

The problem with the Itanium was the hardware itself. Finding sufficient ILP on general-purpose workloads for a VLIW like Itanium is an unsolved problem in compiler design. Saying the problem with Itanium was software would be like entering a drag racer in Formula 1 and saying the problem was that the drivers weren't good enough at steering.


There are a lot of important numerical algorithms which would have really benefited if Itanium had gone through iteration and growth. A mainstream VLIW could've had its place, and it's trivial to find parallelism in FFTs, SVDs, matrix multiplies, and so on.

To me, there is a spectrum of parallelism on the desktop:

    multi-server,
    multi-process,
    multi-threaded (shared mem),
    <Itanium would go here>,
    SIMD instructions

Yeah, Itanium might have required assembly to exercise that niche, and maybe new programming languages would've come about. There has to be some middle ground between Verilog/VHDL and C, right? Maybe a CUDA-like language could've done the trick (it certainly works for GPUs).

I think it's a shame Itanium failed, and I think it failed for the wrong reasons. At the time, I remember everyone criticizing it for not running legacy x86 applications very well. As though word processor, spreadsheet, and presentation software wasn't fast enough. Saying legacy apps in existing languages don't make it easy to find the ILP seems like a slight generalization of that.

The AMD64 ISA (which is what really killed Itanium) was a blessing and a curse. It made x86 just better enough to not be awful, but it killed desktop/server alternatives for at least 25 years. Maybe ARM will make inroads, but it isn't that much better either.

> There are a lot of important numerical algorithms which would have really benefited if Itanium had gone through iteration and growth. A mainstream VLIW could've had its place, and it's trivial to find parallelism in FFTs, SVDs, matrix multiplies, and so on.

DSPs (which have great perf/watt for the numerical algorithms you mention) have used VLIW for decades, so of course there is a place for it. GPUs have moved in for all of those operations at this point though. The bet with Itanium was that compilers could be made sufficiently smart to make VLIW work for non-numeric workloads, and that bet failed to pay off. Intel and HP had hundreds of smart people trying to solve the "software problem" of Itanium and they did not succeed.
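To make the numeric vs. non-numeric contrast concrete, here is a toy sketch, not how any real IA-64 compiler worked. It assumes unit-latency operations and a hypothetical machine that can issue `width` operations per cycle, then greedily packs operations into cycles while respecting dependencies. The function name `schedule_length` and the two example workloads are made up for illustration.

```python
def schedule_length(deps, width):
    """Greedy list scheduling on a toy VLIW.

    deps[i] lists the earlier operations (indices < i) whose results
    operation i needs. Every operation is assumed to take one cycle.
    Returns how many cycles the packed schedule occupies.
    """
    cycle_of = []   # cycle assigned to each operation
    used = {}       # cycle -> number of issue slots already filled
    for needed in deps:
        # Earliest cycle at which all inputs are available.
        cycle = max((cycle_of[j] + 1 for j in needed), default=0)
        # Slide forward until a cycle with a free slot is found.
        while used.get(cycle, 0) >= width:
            cycle += 1
        used[cycle] = used.get(cycle, 0) + 1
        cycle_of.append(cycle)
    return max(cycle_of) + 1 if cycle_of else 0


# Twelve independent multiplies, as in one slice of a matrix multiply.
independent = [[] for _ in range(12)]

# Twelve operations that each need the previous result, as in walking
# a linked list (p = p.next, twelve times).
chain = [[i - 1] if i else [] for i in range(12)]

print(schedule_length(independent, 6))  # 2
print(schedule_length(chain, 6))        # 12
```

With six issue slots the independent work fills the machine and finishes in 2 cycles, while the dependent chain takes 12 cycles no matter how wide the machine is. Real general-purpose code sits somewhere in between, and the scheduler only sees the dependencies it can prove at compile time.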

> I think it's a shame Itanium failed, and I think it failed for the wrong reasons. At the time, I remember everyone criticizing it for not running legacy x86 applications very well. As though word processor, spreadsheet, and presentation software wasn't fast enough. Saying legacy apps in existing languages don't make it easy to find the ILP seems like a slight generalization of that.

Desktop applications are a red herring given that Itanium was targeted primarily at the workstation and server market. There was also a bad-timing issue, as it arrived at about the same time that PC hardware was displacing dedicated workstations and server hardware.

Yeah, there's a whole other universe of specialized chips for special purposes, and I used PCI-style DSP cards when I could. I just think a standard VLIW on the desktop/server would've been useful for the stuff I'm interested in.

GPUs can definitely carry that load, but I avoided them in my career because I could rarely guarantee that my customer's computers would have a sufficient GPU. In the world where I worked, x86 and AMD64 became standard - I could always count on that. It had to be a pretty special project for my customers to let me dictate that a dedicated rack of specific hardware was required.

> Intel and HP had hundreds of smart people trying to solve the "software problem" of Itanium and they did not succeed.

Yeah, but that's tied up in the market too. A big name customer screaming, "But I don't want to retrain my programmers, it has to work with Java/C++" would certainly sway them from a Verilog or CUDA style language. Hell, even OpenCL and CUDA have to look like C++. Double hell, the FPGA folks have been trying to make a C++-like language for decades so that they can increase their market. That doesn't mean another possibility couldn't exist for Itanium.

It's very clear that Itanium is dead. Maybe I'm just saying the market was foolish, and you're saying Intel/HP couldn't satisfy the market.

> Intel and HP had hundreds of smart people trying to solve the "software problem" of Itanium and they did not succeed.

I've also heard a contrary story that Intel and HP simply assumed the compilers would show up, or at least failed to put in sufficient effort to advance the industry. I'm curious if you have any sources. I've always wondered what the true story was, though the two need not be mutually exclusive.

It would seem foolhardy for Intel and HP not to heavily invest in compiler research given the stakes. OTOH, the norm seems to be for hardware vendors to suck at deliberately building and evolving software ecosystems around their hardware, especially as commodity hardware and open source software became ubiquitous. And "sufficient effort" is definitely a matter of opinion.

By way of example, early examples of polyhedral compilation go back to the 1990s, but it wasn't until the 2010s that implementations shipped in GCC and clang, long after Itanium failed. I doubt it would have saved Itanium, but I would have expected to see such contributions earlier and coming directly from Intel and HP. But maybe my expectations are too high.

The Intel C compiler was already well established as a top IA-32 compiler by the late 90s (prior to any IA-64 release). This article[1] from 1999 assumes Intel is responsible for the compiler. My recollection is that the primary focus on 3rd party software was getting systems software ported.

I don't think Intel was banking on 3rd parties making compilers. A lot of 32-bit architectures not named "68000" from the 80s/early 90s suffered from poor first-party compilers and a lack of good 3rd party compiler support; in 1980 an optimizing compiler was not considered an important part of a microprocessor's ecosystem, but by the time IA-64 came around the importance was fairly well understood by hardware vendors. Given the quality of the first-party IA-32 compilers, I think Intel (and everyone else) expected that the first-party IA-64 compilers would be good.

Certainly by the release of Merced (and likely well before), compiler engineers internal to Intel were aware of how hard it was to codegen for IA-64. Certainly during the time period that Intel was pushing IA-64, they had an insatiable desire for compiler developers with advanced degrees.

1: https://www.cnet.com/news/intels-merced-chip-may-slip-furthe...

I am way out of my depth here, but wouldn’t a machine code to machine code JIT compiler solve the problem of underutilization of Itanium? (I remember reading a paper on an x86->x86 JIT compiler as well that could provide some speedup)

If such complex branch prediction and pipelining can be done in hardware alone, much more clever (and patchable!) optimizations can be done in software, or I would think so. So while mainstream languages may not be able to use Itanium’s architecture efficiently at compile time, a separate program could reorder instructions to make use of some instruction level parallelism, couldn’t it?

Also, it’s worth noting that OoO brings more than just ILP/scheduling, it also brings MLP and dynamism. Take for instance latency hiding a cache miss or a mispredicted branch. Stuff like this is impossible to know in advance, no matter how much you redesign your language to expose ILP.
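A toy model of that latency-hiding point, under loud assumptions: one instruction issues per cycle, the in-order machine stalls at the first instruction whose inputs are not ready, and the out-of-order machine may pick the oldest ready instruction from a small window. The function `run`, the six-instruction program, and the 20-cycle miss are all hypothetical numbers chosen for illustration, not measurements of any real CPU.

```python
def run(program, latencies, out_of_order, window=8):
    """Return the cycle at which the last result becomes available.

    program[i]   lists the earlier instructions that i depends on.
    latencies[i] is the latency instruction i actually has at run time.
    """
    done_at = {}                       # instruction -> cycle its result is ready
    pending = list(range(len(program)))
    cycle = 0
    while pending:
        # In-order issue may only look at the oldest pending instruction.
        candidates = pending[:window] if out_of_order else pending[:1]
        for i in candidates:
            if all(j in done_at and done_at[j] <= cycle for j in program[i]):
                done_at[i] = cycle + latencies[i]
                pending.remove(i)
                break                  # at most one issue per cycle
        cycle += 1
    return max(done_at.values())


# load A; use A; load B; use B; load C; use C
program = [[], [0], [], [2], [], [4]]
# Load A happens to miss the cache (20 cycles); B and C hit (2 cycles).
latencies = [20, 1, 2, 1, 2, 1]

print(run(program, latencies, out_of_order=False))  # 27
print(run(program, latencies, out_of_order=True))   # 21
```

In this sketch the out-of-order machine does the B and C work underneath the miss on A, so the total is just the miss plus the one dependent use. A compiler can hoist loads to get a similar overlap for one particular miss pattern, but it has to commit to an order before knowing which load will actually miss.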

> A mainstream VLIW could've had its place, and it's trivial to find parallelism in FFTs, SVDs, matrix multiplies, and so on.

There are already DSPs for this purpose, but typical server workloads don't generally use those algorithms. Perhaps Itanium would have made a good DSP but it wasn't really aimed at that market.

> There are already DSPs for this purpose

I should've been more clear: Most open source projects, or my projects for the customers I used to have, can't/couldn't rely on a DSP chip or card being installed. If Itanium had gone mainstream, I could've counted on its VLIW instructions.

We can /almost/ count on a GPU nowadays, but programming in CUDA ties you to NVIDIA, and OpenCL doesn't seem to have taken off the same way.

> Perhaps Itanium would have made a good DSP but it wasn't really aimed at that market

I suspect there are a lot of FFTs, SVDs, and large matrix multiplies in software now. Deep learning, convolutional nets, image and audio algorithms, TikTok "filters", and so on. Of course there was almost none of that on desktops in the late 90s.

> I should've been more clear: Most open source projects, or my projects for the customers I used to have, can't/couldn't rely on a DSP chip or card being installed. If Itanium had gone mainstream, I could've counted on its VLIW instructions.

So to sum up: you can't convince customers to buy special hardware and neither could HP/Intel?

> So to sum up: <snarky shit reply>

I wish it was possible to have a discussion that wasn't about who could get the best zinger in to burn the other person. This isn't Reddit, and you aren't in high school.

I agree my response was snarky, I disagree it was shit. I wasn't looking for a "sick burn" sort of reaction.

Here's a slightly longer and more boring version of what I posted:

Itanium came out at the tail-end of a long movement from special-purpose to commodity hardware; servers and workstations were moving from 68k/MIPS/Sparc to PC-based hardware. It was a DSP that ran general-purpose loads "okay" when most people were looking for a general-purpose CPU that ran DSP type loads "okay" (i.e. the various SIMD extensions to x86 and POWER).

Anything that starts with "If Itanium had gone mainstream" is a counterfactual. Maybe it would have delayed GPGPU as the performance advantage of programmable shaders over running on CPU would have been smaller and maybe without AMD's competition, it would have allowed Intel to keep bus-speeds lower for longer.

My original point that Itanium was a failure to deliver the hardware people wanted rather than the failure of software to appear on said hardware stands.

Itanium got maximum penetration in HPC, so people were aware of this. The challenge is that GPUs and DSPs (many are VLIW) are even better at parallel workloads.