Using the Intel VTune tools you can see how each execution port is utilized, so in theory you could change your code to mix instructions for better utilization, beyond what the CPU's own reordering can achieve. I can see some analogy there with building a VLIW instruction group.
There's a crazy number of performance counters you can look at (the perf tool can read them too; just try running "perf list" to view the available counters).
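To make the "mix instructions" idea a bit more concrete, here is a rough sketch in C. It is only an illustration, not something taken from VTune or from any Intel guide: the function names sum_single_chain and sum_four_chains are made up for this example. The first version funnels every add through one accumulator, so the adds form a single dependency chain. The second keeps four independent accumulators, which gives the out-of-order scheduler independent work it could spread across ports. Bear in mind that an optimizing compiler may unroll or vectorize either loop, so the generated code can differ from what the source suggests.

```c
#include <stddef.h>
#include <stdint.h>

/* One accumulator: each add depends on the previous one, so the adds
   form a single serial chain no matter how many ALU ports exist. */
uint64_t sum_single_chain(const uint32_t *data, size_t n) {
    uint64_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc += data[i];
    }
    return acc;
}

/* Four independent accumulators: the adds within one iteration do not
   depend on each other, so the scheduler is free to issue them in
   parallel where the hardware allows it. */
uint64_t sum_four_chains(const uint32_t *data, size_t n) {
    uint64_t a0 = 0, a1 = 0, a2 = 0, a3 = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        a0 += data[i];
        a1 += data[i + 1];
        a2 += data[i + 2];
        a3 += data[i + 3];
    }
    for (; i < n; i++) { /* leftover elements when n is not a multiple of 4 */
        a0 += data[i];
    }
    /* Unsigned addition is associative, so the result matches the
       single-chain version exactly. */
    return (a0 + a1) + (a2 + a3);
}
```

If you call both from a small benchmark loop and run the binary under "perf stat" (or look at the port view in VTune), you can check whether the split actually changes anything on your machine. The exact counter names vary by CPU, which is where "perf list" comes in.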
https://en.wikichip.org/wiki/intel/microarchitectures/skylak...