by Stem0037 381 days ago
I wonder how much of this overhead (like the 250µs for activations/consistency on B200) could be further chipped away with even finer-grained control or different sync primitives.