Hacker News
by fxtentacle 620 days ago
Yes. In the end, LLMs are a sequence of matrix multiplications, and since they don't loop internally, every output token gets the same number of internal processing steps, no matter what the input is. Only the input length matters, because some steps can be skipped when the input buffer is not full.
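To make that concrete, here is a rough toy sketch in plain Python. It is not a real transformer: `make_model`, `forward_cost`, the sizes `N_LAYERS`, `D_MODEL`, `VOCAB`, and the averaging step standing in for attention are all made up for illustration. It just counts arithmetic steps for one output token and shows that the count depends on how many tokens are in the buffer, not on which tokens they are.

```python
import random

N_LAYERS = 4  # hypothetical depth
D_MODEL = 8   # hypothetical hidden width
VOCAB = 16    # hypothetical vocabulary size


def make_model(seed=0):
    """Build random embeddings and one square weight matrix per layer."""
    rng = random.Random(seed)
    embed = [[rng.uniform(-1, 1) for _ in range(D_MODEL)] for _ in range(VOCAB)]
    layers = [
        [[rng.uniform(-1, 1) for _ in range(D_MODEL)] for _ in range(D_MODEL)]
        for _ in range(N_LAYERS)
    ]
    return embed, layers


def forward_cost(token_ids, embed, layers):
    """Toy forward pass for one output token.

    Returns (final hidden vector, number of arithmetic steps counted).
    """
    ops = 0
    context = [embed[t] for t in token_ids]
    h = context[-1]
    for w in layers:  # fixed trip count: no data-dependent loop, no early exit
        # Attention stand-in (an assumption): average the occupied positions.
        # This is the only part whose cost grows with the input length.
        mixed = [0.0] * D_MODEL
        for vec in context:
            for i in range(D_MODEL):
                mixed[i] += vec[i] / len(context)
                ops += 1
        # Dense layer: one matrix-vector product followed by ReLU.
        x = [a + b for a, b in zip(h, mixed)]
        h = []
        for row in w:
            acc = 0.0
            for wi, xi in zip(row, x):
                acc += wi * xi
                ops += 1
            h.append(max(acc, 0.0))
    return h, ops


embed, layers = make_model()
_, cost_a = forward_cost([1, 2, 3], embed, layers)
_, cost_b = forward_cost([9, 0, 15], embed, layers)       # different tokens, same length
_, cost_long = forward_cost([1, 2, 3, 4, 5], embed, layers)  # longer input
print(cost_a == cost_b, cost_long > cost_a)  # prints: True True
```

In this toy the count works out to `N_LAYERS * (len(token_ids) * D_MODEL + D_MODEL * D_MODEL)`, so the token values never enter into it; only the length does.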