| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by starchild3001 357 days ago

Latency users experience while getting their answers is a big part of the LLM experience.

Well done model routing is a tremendous leap forward to minimize the latency & improve the user experience.

E.g. I love Gemini 2.5 Pro. But it's darn slow (sorry GDM!). I love the latency I'm getting from 4o. The solution? Just combine them under one prompt, with well done model routing.

Is GPT5 router "good enough"? We'll see.

I think OpenAI is a smart company. And Sama is a tremendous leader. They're moving in the right direction.