Hacker News
by lemonish97 185 days ago
Not sure if it's a TPU constraint, but according to this report the Gemini models seem to have really poor inference latency and throughput: slow TTFT (time to first token) and low tps (tokens per second).