Hacker News
by
mhartz
979 days ago
Can someone help me understand Figure 2? Why does the newest token appear at the beginning of the sequence rather than next to its neighboring token?
nivekkevin
979 days ago
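It's a rolling buffer, so the token at index i just gets upserted into slot i % 4 in this case, overwriting whatever was stored there. That's why the newest token can land at the beginning of the buffer instead of next to its neighbor.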
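A minimal sketch of that write pattern, assuming a buffer of size 4 as in this case. The token list and the helper names `write_token` and `chronological` are made up for illustration and are not taken from the paper or any library:

```python
WINDOW = 4  # buffer size, matching the "% 4" case above


def write_token(buffer, index, token):
    """Store the token with absolute index `index` in slot index % len(buffer)."""
    buffer[index % len(buffer)] = token


def chronological(buffer, count):
    """Return the stored tokens oldest-first, given `count` tokens written so far."""
    size = len(buffer)
    if count <= size:
        return buffer[:count]
    start = count % size  # slot holding the oldest surviving token
    return buffer[start:] + buffer[:start]


# Hypothetical input sequence, for illustration only.
tokens = ["The", "cat", "sat", "on", "the", "mat"]
buffer = [None] * WINDOW
for i, tok in enumerate(tokens):
    write_token(buffer, i, tok)

print(buffer)                              # ['the', 'mat', 'sat', 'on']
print(chronological(buffer, len(tokens)))  # ['sat', 'on', 'the', 'mat']
```

The fifth token (index 4) wraps around to slot 0, so the physical layout no longer matches sequence order, but the order can still be recovered from the running count.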
mhartz
979 days ago
Thanks, so does that mean position within the buffer is irrelevant?
nivekkevin
979 days ago
It does feel that way. The position within the buffer eventually loses its meaning as more and more data gets crunched by the training process; in the end it feels like it's just a context of the past 4 tokens.