Hacker News
by memossy 823 days ago
We use TPU v4s, v5es, and v5ps, mostly v5ps. int8 training is very stable on them (versus the horror that is fp8 stability).