Hacker News
by anybodyz 851 days ago
Together serves models optimized for inference speed.

They're not Groq, but Together (and Perplexity Labs) have the lowest latencies and the highest tokens per second of any commercial services available right now. Also the lowest prices, afaik.