by spmurrayzzz
431 days ago
Ah, got it. Yeah, then I'd focus on learning how RoPE works first. That will at least help you understand why retrieval in current long-context implementations is so limited. A colleague from a Discord I spend time in threw together this video a year or so ago; it might be helpful as a first watch before a deep dive: https://www.youtube.com/watch?v=IZYx2YFzVNc It covers positional encoding as a general concept first, then goes into rotary embeddings.
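If it helps to have something concrete next to the video, here is a rough sketch of the core idea behind rotary embeddings. It is not taken from the video or from any particular library. The function names (`rope`, `dot`), the toy vectors, and the interleaved pairing of dimensions are my own assumptions for illustration; real implementations often pair dimensions differently and operate on batched tensors.

```python
import math

def rope(vec, pos, base=10000.0):
    """Rotate each consecutive pair of `vec` by an angle that depends on `pos`.

    Hypothetical helper for illustration: pairs are (0,1), (2,3), ...
    and pair starting at index i uses frequency base ** (-i / d).
    """
    d = len(vec)
    assert d % 2 == 0, "dimension must be even"
    out = []
    for i in range(0, d, 2):
        theta = pos * base ** (-i / d)  # lower pairs rotate faster
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])  # 2D rotation
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Toy query and key vectors (made-up numbers).
q = [0.3, -1.2, 0.7, 0.5]
k = [1.1, 0.4, -0.6, 0.9]

# Same relative offset (3), very different absolute positions.
score_a = dot(rope(q, 5), rope(k, 2))
score_b = dot(rope(q, 105), rope(k, 102))
```

The thing to notice is that `score_a` and `score_b` agree up to floating-point error: after the rotation, the query-key dot product depends only on the offset between the two positions, not on where they sit in the sequence. That relative-position property is the part worth having in your head before digging into how long-context extensions rescale the rotation angles.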