Hacker News
by billconan 7 days ago
I do not understand.
How is this different from building smaller transformer layers, where each layer just denoises less?