by anonymous_union 829 days ago
right yea, modern CPUs seem to have all sorts of abstract optimizations. it's kind of strange how we meet in the middle with hardware ISA manufacturers. they do all sorts of tricks on their side to make code go faster, and we try to generate machine code that we think will go faster, but neither side (compiler writers and ISA developers) works with the other. i bet there is low-hanging fruit here.

There sort of is, but isn't that why stuff like VLIW exists? So that the compiler can optimize the hell out of the machine code, and the CPU doesn't have to do its own OoO nonsense?
VLIW is what the Unison project (the context behind this post) was attacking. https://unison-code.github.io/ My best understanding is that exposing the details of the CPU to the compiler via a more complex ISA has not worked that well. Possibly because the compilation problem just gets too hard, but maybe also for momentum and ecosystem reasons. I think another lesson is that it is unwise to make your ISA expose microarchitecture, because microarchitecture changes quite a bit between generations and between individual CPUs. Things like delay slots seem like misfeatures now.
And it's worth noting VLIW survived longer in GPUs—where there isn't any expectation of ISA stability, which avoids that problem.
And, perhaps also relevantly, your compute kernel probably fits in iCache, even with VLIW instructions.
Not to be confused with the Unison language, I guess. https://www.unison-lang.org