|
|
|
|
|
by danielhanchen
807 days ago
|
|
Oh yes you need all the metadata, but because it's 2 numbers the scale and zero_point, I think the movement of singular digits are super fast to the GPU registers - cannot confirm though. It's like in cuBLAS you do alphaAB + beta*C, and alpha and beta are both scalars which can be on the CPU, and moved to the GPU in nanaseconds. I'm unsure though since I haven't tested it |
|