|
|
|
|
|
by pyrolistical
572 days ago
|
|
I wish hardware exposed an api that allowed us to submit a tree of instructions so the hardware doesn’t need figure out which instructions are independent. Lots of this kind of work can be done during compilation but cannot be communicated to hardware due to code being linear |
|
There is an argument that today's compilers are finally good enough for VLIW to go mainstream, but good luck convincing anyone in today's market to go for it.
------
A big problem with VLIW is that it's impossible to predict L1, L2, L3 or DRAM access. Meaning all loads/stores are impossible to schedule by the compiler.
NVidia has interesting barriers that get compiled into its SASS (a level lower than PTX assembly). These barriers seem to allow the compiler to assist in the dependency management process but ultimately still require a decoder in the NVidia core final level before execution.