by dnautics
2683 days ago
> The compiler has to make static decisions.

Pretty sure I mentioned JITting in my comment.

> My intuition says software can't beat the speed of a superscalar OOO CPU anymore

How good is good enough? I mean, we have distributed TensorFlow, which is basically on-the-fly compilation that can reorganize your computational graph around nodes with GPUs separated by network latency, or Julia, where you can drop in a GPUArray as a datatype and move computation to the GPU without changing your code. If we go to something a bit more baroque, Java is within a factor of 1.5 of C/C++ these days.

Could you hand-roll a better solution? Probably. Would it be worth it? Doubtful.