|
|
|
|
|
by Gladdyu
1898 days ago
|
|
The difference is that software compilation is a fairly local optimization process - allowing you to change/inline one bit of generated code without significantly affecting the performance of the rest of the generated code - address space is cheap. On the other hand, especially when the fpga approaches capacity, one change could cause a significant portion of the design to have to be rerouted as space is limited causing significant performance differences or not meeting the timing requirements anymore. |
|
Not really.
The uop cache, and L1 code cache, of modern chips is rather small. You can often grossly increase performance locally by loop-unrolling, but if that causes the "hot path" to no longer fit in uop-cache (or L1 cache), then you've lost a chunk of global performance vs a small local-gain in performance.
Global vs local optimization is just a tough subject in general. Even on CPUs (which is probably easier than FPGAs)