Hacker News
by xxr3376 1849 days ago
MegEngine (a deep learning framework) now implements DTR (Dynamic Tensor Rematerialization). You can train a model about 3x larger by trading a little speed for a lot of GPU memory.
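
For anyone unfamiliar with the idea behind that trade-off, here is a rough toy sketch of rematerialization in plain Python. It is not MegEngine's API or its implementation: `LazyTensor`, `Runtime`, and `budget` are hypothetical names made up for illustration, and eviction here is plain least-recently-used rather than the cost-aware heuristic DTR uses. Each tensor remembers the function and inputs that produced it, so its value can be dropped when memory runs out and recomputed later on demand.

```python
class LazyTensor:
    """Toy tensor that remembers how to recompute itself (hypothetical, not MegEngine)."""

    def __init__(self, fn, parents=(), size=1):
        self.fn = fn              # function that produces the value from parent values
        self.parents = parents    # inputs needed to (re)compute this tensor
        self.size = size          # pretend memory footprint
        self.value = None         # None means "not resident in memory"
        self.pins = 0             # >0 while a child is being computed from this tensor
        self.last_used = 0
        self.computed_once = False


class Runtime:
    """Keeps resident tensors under a memory budget, evicting by LRU (a simplification)."""

    def __init__(self, budget):
        self.budget = budget
        self.used = 0
        self.clock = 0
        self.resident = []
        self.recomputes = 0

    def _make_room(self, size):
        # Evict least-recently-used, unpinned tensors until `size` fits.
        while self.used + size > self.budget:
            victims = [t for t in self.resident if t.pins == 0]
            if not victims:
                raise MemoryError("budget too small for this computation")
            victim = min(victims, key=lambda t: t.last_used)
            victim.value = None
            self.resident.remove(victim)
            self.used -= victim.size

    def get(self, t):
        if t.value is None:
            args = []
            for p in t.parents:
                args.append(self.get(p))  # may recursively rematerialize p
                p.pins += 1               # keep inputs resident while t is computed
            self._make_room(t.size)
            t.value = t.fn(*args)
            for p in t.parents:
                p.pins -= 1
            self.used += t.size
            self.resident.append(t)
            if t.computed_once:
                self.recomputes += 1      # this was a rematerialization
            t.computed_once = True
        self.clock += 1
        t.last_used = self.clock
        return t.value


# A chain x0 = 1, x_i = x_{i-1} + 1, run with room for only 2 tensors at a time.
rt = Runtime(budget=2)
xs = [LazyTensor(lambda: 1)]
for _ in range(5):
    xs.append(LazyTensor(lambda v: v + 1, parents=(xs[-1],)))

out = rt.get(xs[5])    # forward pass: earlier tensors get evicted along the way
again = rt.get(xs[1])  # x1 was evicted, so x0 and x1 are recomputed on demand
```

The forward pass never holds more than two tensors, and asking for an evicted one later costs extra compute instead of extra memory, which is the speed-for-memory trade described above.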