Hacker News
by hajile 1491 days ago
If your assertion had any weight at all, EPIC would have taken over.

> Compilers can (and do) reorder instructions to extract as much parallelism as possible. Further, SIMD has forced most compilers down a path of figuring out how to parallelize, at the instruction level, the processing of data.

Peephole optimizations are literally just local rewrite rules and are very limited in what they can accomplish, and we still haven't found an even moderately reliable way to optimize larger pieces of a program. Auto-vectorization is still so bad that even unskilled devs can probably do a better job by hand.
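
To make "just rewrite rules" concrete, here is a toy sketch of a peephole pass. The `(op, dst, src)` tuple format and the three rules are made up for illustration and don't correspond to any real compiler's IR. Every rule looks at one or two adjacent instructions and nothing else, which is why this kind of pass can't discover parallelism across a larger region.

```python
# Toy peephole pass over a hypothetical (op, dst, src) instruction format.
# Each rule inspects at most two adjacent instructions and rewrites locally.

def peephole(instrs):
    out = []
    for ins in instrs:
        op, dst, src = ins
        # Rule 1 (strength reduction): "mul x, 2" becomes "shl x, 1".
        if op == "mul" and src == 2:
            ins = ("shl", dst, 1)
        # Rule 2: "add x, 0" does nothing, so drop it.
        elif op == "add" and src == 0:
            continue
        # Rule 3: "mov a, b" directly followed by "mov b, a" makes the
        # second move redundant, so drop it.
        if (out and ins[0] == "mov" and out[-1][0] == "mov"
                and out[-1][1] == ins[2] and out[-1][2] == ins[1]):
            continue
        out.append(ins)
    return out


example = [
    ("mov", "a", "b"),
    ("mov", "b", "a"),   # removed by rule 3
    ("mul", "a", 2),     # rewritten by rule 1
    ("add", "a", 0),     # removed by rule 2
]
optimized = peephole(example)
```

Note that none of the rules needs to know anything about the rest of the program, which is both why they are cheap and why they are limited.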

> Further, most CPUs now-a-days are doing instruction reordering to try and extract as much instruction level parallelism out as possible.

This is true and proves my point rather than yours. If the compiler could do the job, then the VLIW output would be faster and would not require OoO execution. It's telling that the fastest versions of Itanium were the ones that took the incoming VLIW instruction bundles and ripped them apart into a traditional OoO instruction window, effectively negating the whole idea while preserving the externally facing ISA.

> Figuring out what instructions can be run in parallel is a data dependency problem, one that compilers have been solving for years.

If they solved it years ago, then why do we get such MASSIVE ILP boosts from bigger instruction windows? Why is 2-3 instructions per cycle the most throughput we can get from in-order designs?
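
A rough way to see the window effect is a toy issue simulator. Everything here is an assumption for illustration: the `(deps, latency)` program format, the 10-cycle load latency, and a "window" modeled simply as the N oldest un-issued instructions (a real reorder buffer also tracks issued-but-unretired instructions). It is a sketch of the idea, not a model of any real core.

```python
# Toy issue model. A program is a list of (deps, latency) entries, where deps
# are indices of earlier instructions whose results are needed.

def make_program(n_pairs=8, load_latency=10):
    """Hypothetical workload: independent (slow load, dependent add) pairs."""
    program = []
    for _ in range(n_pairs):
        load_index = len(program)
        program.append(([], load_latency))   # load with a long latency
        program.append(([load_index], 1))    # add that consumes the load
    return program


def simulate(program, window, width, in_order=False):
    """Return the cycle at which the last result becomes available."""
    done_at = {}                             # index -> cycle result is ready
    pending = list(range(len(program)))      # un-issued, in program order
    cycle = 0
    while pending:
        slots = width
        for i in list(pending[:window]):     # only the window is visible
            if slots == 0:
                break
            deps, latency = program[i]
            ready = all(d in done_at and done_at[d] <= cycle for d in deps)
            if ready:
                done_at[i] = cycle + latency
                pending.remove(i)
                slots -= 1
            elif in_order:
                break                        # in-order: stall behind it
        cycle += 1
    return max(done_at.values())


program = make_program()
in_order_cycles = simulate(program, window=16, width=2, in_order=True)
small_window_cycles = simulate(program, window=4, width=2)
large_window_cycles = simulate(program, window=16, width=2)
```

With the same issue width, the in-order run stalls behind every load, while the out-of-order runs finish sooner as the window grows. A compiler could hoist the loads in a workload this simple, since the latencies are fixed and known; the hardware window earns its keep when latencies like cache misses are not known at compile time.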