by p1esk 923 days ago
This is for GPUs, not CPUs. GPUs do have lower-precision ALUs that do math on fewer bits. Not 2 bits, though: I believe modern Nvidia cards support 1-, 4- and 8-bit computation.

But even without such support, shrinking the model is still a benefit: bigger models can fit in GPU memory, which eliminates costly CPU/GPU data transfers.
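
To put rough numbers on the memory point, here is a minimal sketch in plain Python. It is not how a real GPU kernel stores or unpacks weights; the helper names (`pack_2bit`, `unpack_2bit`, `model_bytes`) and the 7-billion-parameter figure are made up for illustration. It only shows that four 2-bit values fit in one byte and what that does to the raw weight footprint.

```python
def pack_2bit(values):
    """Pack integers in the range 0..3 four per byte, lowest bits first."""
    if any(not 0 <= v <= 3 for v in values):
        raise ValueError("each value must fit in 2 bits (0..3)")
    packed = bytearray((len(values) + 3) // 4)  # round up to whole bytes
    for i, v in enumerate(values):
        packed[i // 4] |= v << (2 * (i % 4))
    return bytes(packed)


def unpack_2bit(packed, count):
    """Recover `count` 2-bit values from bytes produced by pack_2bit."""
    return [(packed[i // 4] >> (2 * (i % 4))) & 0b11 for i in range(count)]


def model_bytes(num_params, bits_per_weight):
    """Raw weight storage in bytes, ignoring scales, zero points and padding."""
    return (num_params * bits_per_weight + 7) // 8


# Hypothetical model size, chosen only to make the arithmetic concrete.
NUM_PARAMS = 7_000_000_000

weights = [0, 1, 2, 3, 3, 2, 1, 0, 1]
packed = pack_2bit(weights)
restored = unpack_2bit(packed, len(weights))

for bits in (16, 8, 4, 2):
    print(f"{bits:>2}-bit weights: {model_bytes(NUM_PARAMS, bits) / 1e9:.2f} GB")
```

Without 2-bit ALUs the weights still have to be unpacked to a supported width before the math happens, so the saving in this sketch is purely in storage and transfer, not in compute.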