| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by rvanbruggen 814 days ago
	LLMs rely on Rotary Position Embeddings (RoPE) to understand the relative position of words within a sequence. Each word in the sequence is assigned a unique embedding based on its position. This embedding is calculated using a combination of sine and cosine functions, incorporating its distance from the beginning and end of the sequence. However, standard RoPE struggles with longer sequences than those encountered during training.