Hacker News
by
oneseven
1098 days ago
It seems like learned positional encodings would still prevent you from fine-tuning on a larger context size, though, so maybe using ALiBi is still relevant (although I have not read that paper).
1 comment
jimsimmons
1098 days ago
You can collapse all positions beyond a certain length into a single bucket, as T5 does.
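A rough sketch of the bucketing idea, in case it helps. This is a simplified clamp rather than T5's actual scheme (which, as far as I recall, also spaces the farther buckets logarithmically), and the function name and the max_distance default are made up for illustration:

```python
def clamp_relative_position(query_pos: int, key_pos: int, max_distance: int = 128) -> int:
    """Map a (query, key) position pair to a bucket index in [0, 2 * max_distance].

    Hypothetical helper: every offset at or beyond max_distance, in either
    direction, lands in the same end bucket, so the learned table never needs
    more than 2 * max_distance + 1 entries regardless of sequence length.
    """
    relative = key_pos - query_pos
    # Clamp the offset so far-away positions share one bucket per direction.
    relative = max(-max_distance, min(max_distance, relative))
    # Shift into a non-negative index suitable for an embedding lookup.
    return relative + max_distance


# Offsets of 500 and 5000 tokens fall into the same bucket.
same_bucket = clamp_relative_position(0, 500) == clamp_relative_position(0, 5000)
```

Since the table size depends only on max_distance, a model trained on short sequences can be fine-tuned on longer ones without adding new position parameters.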