Hacker News
by jimmySixDOF 924 days ago
And that's why the speedup is proportional to context length: it starts near parity at short contexts and then, theoretically, you'd see 100x at 100k tokens.
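To put rough numbers on that: if the speedup really is linear in context length and hits 100x at 100k tokens, parity works out to about 1k tokens. A back-of-the-envelope sketch, where `estimated_speedup`, the `parity_tokens` default, and the floor at 1x are my own assumptions rather than anything measured:

```python
def estimated_speedup(context_tokens: int, parity_tokens: int = 1_000) -> float:
    """Toy linear model: ~1x at parity_tokens, scaling with context length.

    parity_tokens=1_000 is inferred from "100x at 100k tokens" under strict
    proportionality; it is an assumption, not a benchmark result.
    """
    # Assumed floor at 1x: below the parity point, treat the two as equal.
    return max(1.0, context_tokens / parity_tokens)


for n in (1_000, 10_000, 100_000):
    print(f"{n:>7} tokens -> ~{estimated_speedup(n):.0f}x")
```

Real numbers would depend on constant factors and hardware, so treat this as the theoretical ceiling only.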