|
|
|
|
|
by machinationu
171 days ago
|
|
Q, K and V are a way of filtering the relevant aspects for the task at hand from the token embeddings. "he was red" - maybe color, maybe angry, the "red" token embedding carries both, but only one aspect is relevant for some particular prompt. https://ngrok.com/blog/prompt-caching/ |
|