by adgjlsfhk1
1531 days ago
If you only vectorize the linear algebra, you leave performance on the table. Vectorizing fused operations reduces the number of passes over memory. Also, knowing the sizes (which are only chosen at runtime) is necessary to make optimal decisions.
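
To make the "memory passes" point concrete, here is a minimal sketch in plain Python. The function names (`unfused`, `fused`) and the particular computation (sum of squares of `a*x + y`) are made up for illustration and are not from any library. The unfused version walks the data three times and materializes two temporaries; the fused version does the same arithmetic in a single pass. Interpreter overhead dominates in pure Python, so treat this as a picture of the access pattern rather than a benchmark.

```python
def unfused(a, x, y):
    """Three separate passes, each reading a full-length sequence."""
    # Pass 1: scale x, writing a full-length temporary.
    scaled = [a * xi for xi in x]
    # Pass 2: add y, writing a second full-length temporary.
    shifted = [si + yi for si, yi in zip(scaled, y)]
    # Pass 3: reduce the second temporary to a scalar.
    return sum(v * v for v in shifted)


def fused(a, x, y):
    """One pass: each element is loaded once and no temporaries are stored."""
    total = 0.0
    for xi, yi in zip(x, y):
        v = a * xi + yi  # scale and shift while the element is at hand
        total += v * v   # reduce immediately instead of storing v
    return total
```

In a compiled setting, the fused loop is the one you would want the vectorizer to see, since it touches each input element once and never writes the intermediates back to memory.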