Hacker News
by Veedrac, 2361 days ago
GPT-2 uses attention, which is very memory-hungry to train, so it probably won't work well. But I agree with your overall point.
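
To put a rough number on the memory point, here is a back-of-the-envelope sketch. It assumes plain attention that keeps the full seq_len x seq_len score matrix for every head in every layer for the backward pass, fp32 storage, and the shape of the smallest GPT-2 (12 layers, 12 heads, 1024-token context). The function name is hypothetical, and the estimate ignores every other activation, so treat it as a lower bound for illustration only.

```python
def attention_score_bytes(batch, heads, seq_len, layers, bytes_per_elem=4):
    """Bytes needed to keep one (seq_len x seq_len) score matrix
    per head, per layer, per batch element for backprop.
    Assumption: naive attention, fp32 (4 bytes per element)."""
    return batch * heads * seq_len * seq_len * layers * bytes_per_elem


# Assumed shape: smallest GPT-2 (12 layers, 12 heads), batch size 1.
for seq_len in (256, 512, 1024):
    mib = attention_score_bytes(1, 12, seq_len, 12) / 2**20
    print(f"seq_len={seq_len}: {mib:.0f} MiB")
```

The cost grows with the square of the sequence length, so doubling the context quadruples this term, and it scales linearly with batch size on top of that.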