by unignorant
387 days ago
yeah, it seems likely the underlying task here (one reasoning step away) was: replace as many fp32 operations as possible in this kernel with fp16. i'm not sure exactly how challenging a port like that is, but intuitively it seems a bit less impressive. maybe this intuition is wrong, but it would be great for the work to address it explicitly if so!