by zuzun 1125 days ago
If I understand correctly, you are only attending to preceding tokens in your paper. Can the constant bias matrix be made symmetric for unmasked tasks?