by BoiledCabbage 284 days ago
The paper appears to list the 100x speed-up as time to first token. As I understand it, that doesn't imply a 100x gain in throughput. Is there more listed in the paper itself?
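To make the distinction concrete, here is a rough back-of-the-envelope sketch. Every number in it is made up for illustration and none comes from the paper; it also assumes a simple model where a request costs the time to first token plus a fixed per-token decode time for the remaining tokens.

```python
def total_latency(ttft_s, n_tokens, per_token_s):
    """End-to-end time for one request: first token, then the rest one by one."""
    return ttft_s + (n_tokens - 1) * per_token_s


def throughput(ttft_s, n_tokens, per_token_s):
    """Output tokens per second for a single request."""
    return n_tokens / total_latency(ttft_s, n_tokens, per_token_s)


# Hypothetical numbers, not from the paper: a 100x cut in time to first token
# (10 s -> 0.1 s) with the per-token decode speed left unchanged.
baseline = throughput(ttft_s=10.0, n_tokens=500, per_token_s=0.02)
faster = throughput(ttft_s=0.1, n_tokens=500, per_token_s=0.02)

ttft_speedup = 10.0 / 0.1
throughput_speedup = faster / baseline

print(f"TTFT speed-up:       {ttft_speedup:.0f}x")
print(f"Throughput speed-up: {throughput_speedup:.2f}x")
```

With these made-up inputs the throughput gain comes out at roughly 2x, not 100x, because decode time dominates once the first token has arrived. The real ratio depends on prompt length, output length, and batching, none of which this sketch models.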