by makerdiety
853 days ago
Maybe low-level processor hacks are being used? Something like the equivalent of finding the eigenvalues of a matrix. I'm more familiar with mathematical concepts than with CPUs, so I don't know the proper name for these processor bit-hacking tricks. But I think that may be the general idea behind data compression for LLMs/transformer models on CPUs. After all, notice how the improvements only come in powers of two: 128k tokens, then 2048k tokens. There's probably an implementation-dependent CPU optimization hack in there somewhere.
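For what it's worth, the power-of-two pattern is easy to check with the classic single-set-bit trick. This is only a minimal sketch: it assumes "k" means 1024, and the function name `is_power_of_two` is made up for illustration. It shows that both sizes are powers of two, not that any CPU-specific optimization is behind them.

```python
def is_power_of_two(n: int) -> bool:
    # A positive integer is a power of two exactly when it has one set bit,
    # so clearing its lowest set bit with n & (n - 1) leaves zero.
    return n > 0 and (n & (n - 1)) == 0


# Assumption: "k" means 1024 here, so 128k = 128 * 1024 and 2048k = 2048 * 1024.
context_sizes = {"128k": 128 * 1024, "2048k": 2048 * 1024}

for label, size in context_sizes.items():
    # bit_length() - 1 is the exponent when size is an exact power of two.
    print(label, size, is_power_of_two(size), size.bit_length() - 1)
```

If "k" is read as 1000 instead, neither 128,000 nor 2,048,000 is an exact power of two, so the pattern depends on that reading.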