Hacker News
by philzook 829 days ago
I think one of the biggest problems is even stating what the solution from an optimal compiler would be. How does one model a program or a CPU? There almost certainly isn't one perfect model for all situations. CPUs are extremely complicated despite the surface simplicity of assembly. Naive notions of registers or instruction ordering don't really exist at the hardware level. But also because of this, perfect compilation isn't that crucial. The CPU kind of JITs for you. The only objective function that seems pretty clear to me is code size.
3 comments

Maybe we should just make our ISAs be memory-memory, and let the chips map memory addresses onto their hundreds of registers internally, serving as an L0 cache of a sort: after all, we the programmers don't manage L1/L2/L3 caches manually, do we? The CPUs do it transparently for us.
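To make the "registers as an L0 cache" idea concrete, here is a toy sketch. Everything in it is an assumption for illustration: the class name RegisterFileCache, the LRU eviction policy, and the word-sized values are made up, and this is not how any real chip is documented to work. It only models a fixed pool of registers that transparently holds the most recently used memory addresses.

```python
from collections import OrderedDict


class RegisterFileCache:
    """Toy model (hypothetical): physical registers acting as an LRU 'L0 cache'."""

    def __init__(self, num_registers: int):
        if num_registers <= 0:
            raise ValueError("need at least one register")
        self.num_registers = num_registers
        self.resident = OrderedDict()  # address -> value, least recently used first
        self.memory = {}               # backing store; absent addresses read as 0
        self.hits = 0
        self.misses = 0

    def _touch(self, address: int) -> None:
        """Make sure `address` is held in a register, evicting the LRU entry if full."""
        if address in self.resident:
            self.hits += 1
            self.resident.move_to_end(address)
            return
        self.misses += 1
        if len(self.resident) >= self.num_registers:
            victim, value = self.resident.popitem(last=False)
            self.memory[victim] = value  # write the evicted value back to memory
        self.resident[address] = self.memory.get(address, 0)

    def load(self, address: int) -> int:
        self._touch(address)
        return self.resident[address]

    def store(self, address: int, value: int) -> None:
        self._touch(address)
        self.resident[address] = value


# The "program" only names memory addresses; register allocation is invisible to it.
rf = RegisterFileCache(num_registers=2)
rf.store(0x10, 1)
rf.store(0x18, 2)
rf.load(0x10)      # hit: 0x10 is still resident
rf.store(0x20, 3)  # miss: evicts 0x18, the least recently used address
print(rf.hits, rf.misses)
```

The point of the sketch is just that the code being run never mentions a register; whether hardware could do this mapping cheaply enough is exactly the open question.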
right yea, modern CPUs seem to have all sorts of abstract optimizations. it's kind of strange how we meet in the middle with hardware ISA manufacturers. they do all sorts of tricks on their side to make code go faster, and we try to generate machine code that we think will go faster, but neither side (compiler writers and ISA developers) works with the other. i bet there is low-hanging fruit here.
There sort of is, but isn't that why stuff like VLIW exists? So that the compiler can optimize the hell out of the machine code, and the CPU doesn't have to do its own OoO nonsense?
VLIW is what the Unison project (the context behind this post) was attacking. https://unison-code.github.io/ My best understanding is that exposing the details of the CPU to the compiler via a more complex ISA has not worked that well. Possibly because the compilation problem just gets too hard, but maybe also for momentum and ecosystem reasons. I think another lesson is that it is unwise to make your ISA expose microarchitecture, because microarchitecture changes quite a bit between generations and individual CPUs. Things like delay slots seem like misfeatures now.
And it's worth noting VLIW survived longer in GPUs, where there isn't any expectation of ISA stability, which avoids that problem.
And, perhaps also relevantly, your compute kernel probably fits in iCache, even with VLIW instructions.
Not to be confused with the Unison language, I guess. https://www.unison-lang.org
interesting that code size is important to you. hey everyone, write shorter programs :)
I do like short programs :). Code size does matter for instruction cache. In my application in particular, post hoc binary patching, a single byte can make a big difference in difficulty of patching.
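A rough sketch of why a single byte matters for in-place patching. The helper name patch_in_place and the choice of x86 (single-byte NOP 0x90) are assumptions for illustration, not a description of any particular patching tool. If the replacement fits in the bytes it overwrites, you pad with NOPs and you're done; if it's even one byte too long, you need a detour of some kind, which is much harder.

```python
NOP = b"\x90"  # x86 single-byte NOP; assumes an x86 target


def patch_in_place(code: bytes, offset: int, old_len: int, new_bytes: bytes) -> bytes:
    """Overwrite `old_len` bytes at `offset` with `new_bytes`, padding with NOPs.

    Raises ValueError when the replacement does not fit in the slot, which is
    the case where a patcher would have to fall back to a detour/trampoline.
    """
    if offset < 0 or old_len < 0 or offset + old_len > len(code):
        raise ValueError("patch region is outside the code buffer")
    if len(new_bytes) > old_len:
        extra = len(new_bytes) - old_len
        raise ValueError(f"patch is {extra} byte(s) too large to apply in place")
    padding = NOP * (old_len - len(new_bytes))
    return code[:offset] + new_bytes + padding + code[offset + old_len:]


# Original: mov eax, 1 (B8 01 00 00 00) followed by ret (C3).
original = bytes.fromhex("b801000000c3")

# xor eax, eax (31 C0) is 2 bytes, so it fits in the 5-byte slot with 3 NOPs.
# (Size illustration only: xor also changes flags, unlike mov.)
patched = patch_in_place(original, 0, 5, bytes.fromhex("31c0"))
print(patched.hex())

# A 10-byte mov rax, imm64 (48 B8 + 8 bytes) does not fit and is rejected.
try:
    patch_in_place(original, 0, 5, bytes.fromhex("48b80100000000000000"))
except ValueError as err:
    print("cannot patch in place:", err)
```

The layout of everything after the patch stays untouched only in the first case, which is why smaller generated code makes patching so much easier.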