by lumost
832 days ago
Yeah, debugging would be a pain, but in the context of inference/training it's unnecessary. Some set of ops does require high precision: if I L2-normalize a tensor, I really need it to be normalized. But matmul/addition? Maybe there is wiggle room. The big challenge would be whether any gains could compete with Nvidia's economy of scale.
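A rough way to poke at that intuition, as a sketch rather than a benchmark: below, imprecise hardware is stood in for by rounding every intermediate result to IEEE half precision. That stand-in is an assumption (the comment doesn't say what kind of imprecision is meant), and the helper names `to_fp16`, `l2_normalize`, and `dot` are made up for illustration. It uses only the Python standard library.

```python
import math
import random
import struct


def to_fp16(x: float) -> float:
    """Round a float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def exact(x: float) -> float:
    """Identity 'rounding': keep Python's native double precision."""
    return x


def l2_normalize(v, rnd):
    """L2-normalize v, applying rnd after every arithmetic step."""
    acc = 0.0
    for x in v:
        acc = rnd(acc + rnd(x * x))
    norm = rnd(math.sqrt(acc))
    return [rnd(x / norm) for x in v]


def dot(a, b, rnd):
    """Dot product of a and b, applying rnd after every multiply and add."""
    acc = 0.0
    for x, y in zip(a, b):
        acc = rnd(acc + rnd(x * y))
    return acc


def true_norm(v):
    """L2 norm measured in double precision with a compensated sum."""
    return math.sqrt(math.fsum(x * x for x in v))


random.seed(0)
v = [random.uniform(-1.0, 1.0) for _ in range(256)]
w = [random.uniform(-1.0, 1.0) for _ in range(256)]

# How far from unit length is the "normalized" vector in each regime?
norm_exact = true_norm(l2_normalize(v, exact))
norm_lowp = true_norm(l2_normalize(v, to_fp16))

# How far does the low-precision dot product drift from the reference?
dot_exact = dot(v, w, exact)
dot_lowp = dot(v, w, to_fp16)

print(f"|norm - 1|, double precision: {abs(norm_exact - 1.0):.3e}")
print(f"|norm - 1|, fp16 rounding:    {abs(norm_lowp - 1.0):.3e}")
print(f"dot product abs error (fp16): {abs(dot_lowp - dot_exact):.3e}")
```

Whether the fp16 deviations count as "wiggle room" depends on what consumes the result downstream, so the numbers this prints are only a starting point for that argument.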