by ekelsen
377 days ago
I looked at the softmax kernel, and the cast it does from a float* to a float4* is extremely brittle: it's trivial to break by offsetting the input pointer slightly. A kernel for a standard library very likely could not employ a trick like this, which relies on the alignment of its input pointers. Certainly not without a fallback.
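To make the fallback point concrete, here is a minimal host-side C++ sketch, not the kernel's actual code. It assumes 16-byte alignment is what the float4 path needs; `Vec4`, `is_aligned16`, `sum_floats`, and `Path` are hypothetical names made up for illustration. The wide path is taken only when the pointer passes the alignment check, and everything else goes through a plain scalar loop.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for CUDA's float4: four packed floats, 16-byte aligned.
struct alignas(16) Vec4 { float x, y, z, w; };

// Records which code path was used, so callers can observe the fallback.
enum class Path { Vectorized, Scalar };

// True if p sits on a 16-byte boundary, i.e. a Vec4-wide access at p is legal.
inline bool is_aligned16(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(Vec4) == 0;
}

// Sums n floats. Takes the 4-wide path only when data is 16-byte aligned;
// otherwise falls back to a scalar loop instead of assuming alignment.
inline float sum_floats(const float* data, std::size_t n, Path* taken = nullptr) {
    assert(n == 0 || data != nullptr);
    float total = 0.0f;
    std::size_t i = 0;
    const bool wide = is_aligned16(data);
    if (wide) {
        for (; i + 4 <= n; i += 4) {
            Vec4 v;
            // memcpy keeps this sketch free of aliasing issues on the host;
            // a GPU kernel would issue a single float4 load here instead.
            std::memcpy(&v, data + i, sizeof v);
            total += v.x + v.y + v.z + v.w;
        }
    }
    // Scalar loop: handles the tail, or the whole input when unaligned.
    for (; i < n; ++i) total += data[i];
    if (taken) *taken = wide ? Path::Vectorized : Path::Scalar;
    return total;
}
```

Passing `buf + 1` for a 16-byte-aligned `buf` is exactly the "offset the input slightly" case: the check fails and the scalar path runs, where an unconditional cast would have done a misaligned access.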