by RandyOrion 68 days ago
Check out Fig. 6 in this paper; it compares the proposed method with PyTorch's native FSDP offload method.