Hacker News
by makerdiety 853 days ago
Maybe computer processor hacks are used? Like, it's the equivalent of finding the eigenvalues of a matrix.

I'm not as familiar with CPUs as I am with mathematical concepts, and I don't know what the processor bit-hacking tricks are called. But that's maybe the general idea behind data compression for LLMs/transformer models on CPUs, I think.

After all, notice how data compression improvements are only multiples of two: 128k tokens and 2048k tokens. There's an implementation-dependent CPU optimization hack going on in there somewhere.

1 comment

Such optimizations generally don't change the time complexity.
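
To illustrate the point, here is a minimal sketch, not anything taken from a real LLM implementation. It assumes a toy workload (doubling every value in a list) and a hypothetical all-pairs pass over tokens as the stand-in for the expensive part. The function names are made up for this example. The bit trick swaps a multiply for a shift, but both versions still take one step per element, and the all-pairs pass still grows with the square of the input length.

```python
def scale_with_multiply(values):
    """Double every value with ordinary multiplication; return (result, steps)."""
    out = []
    steps = 0
    for v in values:
        out.append(v * 2)
        steps += 1
    return out, steps


def scale_with_shift(values):
    """Double every value with a left shift (a classic bit trick); return (result, steps)."""
    out = []
    steps = 0
    for v in values:
        out.append(v << 1)
        steps += 1
    return out, steps


def pairwise_steps(n):
    """Steps taken by a hypothetical pass that visits every pair of n tokens."""
    return n * n


if __name__ == "__main__":
    data = list(range(8))
    # Same result and same step count: only the cost of each step differs.
    print(scale_with_multiply(data) == scale_with_shift(data))
    # Four times the tokens means sixteen times the pairs, bit trick or not.
    print(pairwise_steps(2048) // pairwise_steps(512))
```

A faster instruction makes each step cheaper, which is a constant-factor win. It does not reduce how many steps the algorithm needs as the input grows.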