Hacker News new | ask | show | jobs
by starchild3001 311 days ago
Latency users experience while getting their answers is a big part of the LLM experience.

Well done model routing is a tremendous leap forward to minimize the latency & improve the user experience.

E.g. I love Gemini 2.5 Pro. But it's darn slow (sorry GDM!). I love the latency I'm getting from 4o. The solution? Just combine them under one prompt, with well done model routing.

Is GPT5 router "good enough"? We'll see.

I think OpenAI is a smart company. And Sama is a tremendous leader. They're moving in the right direction.