Hacker News
Training a 3x larger model on the same GPU cards
(github.com)
2 points by xxr3376 1848 days ago | 1 comment
xxr3376 1848 days ago
MegEngine (a deep learning framework) now implements DTR (Dynamic Tensor Rematerialization). You can train a 3x larger model by trading a little speed for a lot of GPU memory.
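For anyone wondering what the speed-for-memory trade looks like in practice, here is a rough toy sketch of the rematerialization idea in plain Python. It is not MegEngine's code or API: `Pool`, `Tensor`, and `run_demo` are hypothetical names made up for illustration, sizes are abstract units rather than bytes, and eviction uses plain least-recently-used instead of the cost-aware heuristic a real DTR implementation would use. Each tensor remembers how it was produced, so when the pool is over budget a value can be dropped and recomputed later from its parent.

```python
from typing import Callable, List, Optional, Tuple


class Pool:
    """Toy memory pool: tracks resident tensors against a fixed budget."""

    def __init__(self, budget: int) -> None:
        self.budget = budget
        self.resident: List["Tensor"] = []
        self.clock = 0        # logical time, used for LRU ordering
        self.recomputes = 0   # how often an evicted value was rebuilt
        self.peak = 0         # highest memory use observed

    def used(self) -> int:
        return sum(t.size for t in self.resident)

    def make_room(self, size: int) -> None:
        # Evict least-recently-used tensors until `size` more units fit.
        while self.used() + size > self.budget:
            victims = [t for t in self.resident
                       if t.parent is not None and not t.pinned]
            if not victims:
                raise MemoryError("budget too small to rematerialize")
            victim = min(victims, key=lambda t: t.last_use)
            victim.value = None            # drop the data, keep the recipe
            self.resident.remove(victim)

    def admit(self, t: "Tensor") -> None:
        self.resident.append(t)
        self.peak = max(self.peak, self.used())


class Tensor:
    """A value plus the recipe (parent and fn) needed to rebuild it."""

    def __init__(self, pool: Pool, size: int, value: Optional[int] = None,
                 parent: Optional["Tensor"] = None,
                 fn: Optional[Callable[[int], int]] = None) -> None:
        self.pool, self.size = pool, size
        self.value, self.parent, self.fn = value, parent, fn
        self.last_use = 0
        self.pinned = False
        self.computed_before = False
        if value is not None:              # source tensors are never evicted
            pool.make_room(size)
            pool.admit(self)

    def get(self) -> int:
        if self.value is None:
            x = self.parent.get()          # may rematerialize recursively
            self.parent.pinned = True      # keep the parent while allocating
            try:
                self.pool.make_room(self.size)
            finally:
                self.parent.pinned = False
            self.value = self.fn(x)
            if self.computed_before:
                self.pool.recomputes += 1  # the extra compute we pay
            self.computed_before = True
            self.pool.admit(self)
        self.pool.clock += 1
        self.last_use = self.pool.clock
        return self.value


def run_demo(n_layers: int = 8, budget: int = 4) -> Tuple[List[int], Pool]:
    pool = Pool(budget)
    prev = Tensor(pool, size=1, value=0)   # network input
    layers: List[Tensor] = []
    for _ in range(n_layers):
        prev = Tensor(pool, size=1, parent=prev, fn=lambda v: v + 1)
        layers.append(prev)
    layers[-1].get()                               # "forward" pass
    values = [t.get() for t in reversed(layers)]   # "backward"-style reads
    return values, pool


if __name__ == "__main__":
    values, pool = run_demo()
    print(values, "peak:", pool.peak, "recomputes:", pool.recomputes)
```

In this toy run the chain has 9 values (input plus 8 layers) but the pool never holds more than 4 at once; the price is that some activations get computed more than once when they are read back in reverse order.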