by avereveard 623 days ago
Well yes, but actually no, I guess: the benefit of transformers at the time was that they were more stable during training, which made it possible to train larger and larger networks on larger and larger datasets.