Hacker News new | ask | show | jobs
by rvanbruggen 814 days ago
LLMs rely on Rotary Position Embeddings (RoPE) to understand the relative position of words within a sequence. Each word in the sequence is assigned a unique embedding based on its position. This embedding is calculated using a combination of sine and cosine functions, incorporating its distance from the beginning and end of the sequence. However, standard RoPE struggles with longer sequences than those encountered during training.