by sunshinesfbay 691 days ago
It would appear that Flash-3 (FlashAttention-3) already exists for PyTorch, going by this joint blog post from Nvidia, Together.ai and Princeton about enabling it: https://pytorch.org/blog/flashattention-3/
1 comment

Right - my point about "follows the same path" mostly revolves around llama.cpp's latency in adopting it.