Hacker News
7x speed improvement for LLaMA in less than 10 lines of code (github.com)
2 points by hack_ml 1071 days ago
1 comment

Is that 5s per token?