by adrian_b
473 days ago
One big server CPU can have a computational capability similar to a mid-range desktop NVIDIA GPU. For ML/AI applications, a consumer GPU has much better performance per dollar. Nevertheless, when much more memory is needed than a desktop GPU provides, a dual-socket server can have higher memory bandwidth than most desktop GPUs, i.e. more than an RTX 4090, and an FP32 computational capability that could exceed that of an RTX 4080, though it would be slower for low-precision data, where the NVIDIA tensor cores can be used.
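To make that comparison concrete, here is a rough back-of-the-envelope sketch of how the theoretical peaks are usually estimated. The function names and all the input figures (12 memory channels per socket at 6000 MT/s, 128 cores per socket at 3.0 GHz, 64 FP32 operations per cycle per core) are my own illustrative assumptions, not the specs of any particular product, and sustained numbers will be lower than these peaks.

```python
def peak_memory_bandwidth_gbs(sockets, channels_per_socket,
                              megatransfers_per_s, bytes_per_transfer=8):
    """Theoretical peak DRAM bandwidth in GB/s.

    Each 64-bit memory channel moves 8 bytes per transfer, so the peak is
    sockets * channels * MT/s * 8 bytes, converted from MB/s to GB/s.
    """
    return (sockets * channels_per_socket * megatransfers_per_s
            * bytes_per_transfer) / 1000.0


def peak_fp32_tflops(sockets, cores_per_socket, clock_ghz,
                     flops_per_cycle_per_core):
    """Theoretical peak FP32 throughput in TFLOP/s.

    flops_per_cycle_per_core is an assumption about the core design, e.g.
    2 FMA pipes * 16 FP32 lanes (512-bit vectors) * 2 ops per FMA = 64.
    """
    return (sockets * cores_per_socket * clock_ghz
            * flops_per_cycle_per_core) / 1000.0


# Hypothetical dual-socket server (illustrative numbers only).
bandwidth = peak_memory_bandwidth_gbs(sockets=2, channels_per_socket=12,
                                      megatransfers_per_s=6000)
compute = peak_fp32_tflops(sockets=2, cores_per_socket=128,
                           clock_ghz=3.0, flops_per_cycle_per_core=64)

print(f"peak memory bandwidth: {bandwidth:.1f} GB/s")
print(f"peak FP32 throughput:  {compute:.3f} TFLOP/s")
```

Plugging in the published channel count, memory speed, core count and sustained all-core clock of a real server, then comparing against the GPU vendor's spec sheet, gives the kind of comparison made above.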
INT8, INT4, FP8 and soon FP4