by fifilura
355 days ago
I am not familiar with the Itanium story, and I don't know who to ask. But it seems to me that this would be a safe space to experiment, with heuristics and pragmas as a fallback, because with the right approach the results would mostly be better than doing nothing. And you could do it at runtime, when you know the size of the input. What about applying the logic only in places where you can see that the loop will end? I believe query planners in, for example, Trino/BigQuery do this already?
https://en.wikipedia.org/wiki/Itanium#Market_reception
> A compiler must attempt to find valid combinations of instructions that can be executed at the same time, effectively performing the instruction scheduling that conventional superscalar processors must do in hardware at runtime.
> In an interview, Donald Knuth said "The Itanium approach...was supposed to be so terrific—until it turned out that the wished-for compilers were basically impossible to write."[222]
There were lots of people involved, but the gist of it is that this kind of automatic scheduling is not really possible to do statically at compile time, and it is too much overhead to attempt generically at run time. In this case it was instruction-level parallelism, but the same fundamental issues apply to this kind of automatic loop parallelization idea.
> I believe query planners in for example Trino/BigQuery do this already?
That is completely different from trying it on every for loop in a program. They do it at the broader task level to determine how much parallelism will help.
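For what it's worth, the "decide at runtime once you know the input size" heuristic from upthread could look roughly like the sketch below. Everything in it is made up for illustration: `PARALLEL_THRESHOLD`, `choose_strategy` and `adaptive_map` are hypothetical names, and the cutoff value is a guess that would have to be measured per workload. It only shows the dispatch decision. In CPython, threads will not speed up CPU-bound work anyway, and the cost of making this check on every loop is exactly the overhead being discussed.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical cutoff: below this many items, assume thread overhead
# outweighs any gain. A real value would have to be measured.
PARALLEL_THRESHOLD = 10_000


def choose_strategy(n_items, threshold=PARALLEL_THRESHOLD):
    """Pick "serial" or "parallel" from the input size known at run time."""
    return "parallel" if n_items >= threshold else "serial"


def adaptive_map(fn, items, threshold=PARALLEL_THRESHOLD, workers=4):
    """Apply fn to every item, going parallel only for large inputs."""
    items = list(items)  # materialize so the size is known
    if choose_strategy(len(items), threshold) == "serial":
        return [fn(x) for x in items]
    # Executor.map returns results in input order, same as the serial path.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fn, items))
```

Calling `adaptive_map(f, data)` then behaves like a plain list comprehension for small inputs and only pays for the thread pool above the cutoff, which is closer to the task-level decision a query planner makes than to rewriting every for loop.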