Hacker News
by upbeat_general 629 days ago
scaled_dot_product_attention isn’t CUDA-specific; it even works on TPUs.
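For anyone wondering why that is plausible: assuming this refers to PyTorch's `torch.nn.functional.scaled_dot_product_attention`, the operation underneath is just softmax(QK^T / sqrt(d)) V, and nothing in that math depends on a particular device. Here is a rough plain-Python sketch of the computation. The name `sdpa_reference` is made up for illustration, and this is not the library's implementation (no masking, dropout, batching, or fused kernels).

```python
import math

def sdpa_reference(q, k, v):
    """Hypothetical reference: q is (L, d), k is (S, d), v is (S, dv), as nested lists."""
    d = len(q[0])
    scale = 1.0 / math.sqrt(d)
    out = []
    for q_row in q:
        # Scaled dot product between this query and every key.
        scores = [scale * sum(a * b for a, b in zip(q_row, k_row)) for k_row in k]
        # Numerically stable softmax over the keys.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Attention output is the weighted sum of the value rows.
        out.append([
            sum(w * v_row[j] for w, v_row in zip(weights, v))
            for j in range(len(v[0]))
        ])
    return out
```

A backend only has to supply some implementation of this computation, fused or not, which is why the same call can be dispatched to hardware other than NVIDIA GPUs.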