by freediver 656 days ago
The problem with this is that tok/sec does not tell you what the time to first token is. I've seen cases (with Groq) where it is large for large prompts, nullifying the advantage of faster tok/sec.
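To make the arithmetic concrete, here is a rough sketch that treats end-to-end latency as time to first token plus output tokens divided by tok/sec. The function name `total_latency` and all the numbers are made up for illustration; they are not measurements of Groq or any other provider.

```python
def total_latency(ttft_s: float, output_tokens: int, tok_per_s: float) -> float:
    """End-to-end seconds: wait for the first token, then stream the rest."""
    return ttft_s + output_tokens / tok_per_s


# Hypothetical numbers, for illustration only (not benchmarks).
# Provider A: very fast decode, but slow to start on a large prompt.
fast_decode = total_latency(ttft_s=4.0, output_tokens=200, tok_per_s=500.0)
# Provider B: much slower decode, but starts quickly.
slow_decode = total_latency(ttft_s=0.5, output_tokens=200, tok_per_s=80.0)

print(f"fast decode, slow start: {fast_decode:.2f} s")
print(f"slow decode, fast start: {slow_decode:.2f} s")
```

With these assumed values, the provider with roughly 6x the tok/sec still finishes later, because the wait for the first token dominates a short response. The longer the output, the more tok/sec matters again.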