by msoad 996 days ago
Is a few milliseconds of latency really a problem for current LLMs? They are already so slow that users are used to waiting tens of seconds for a response anyway. I feel like until the actual latency of LLMs improves to sub-second, this is not a product that's worth the price.
2 comments

One of the offerings is language translation where latency might matter. Though I don't know how fast it is.

Cloudflare doesn't currently have a "not edge" worker, so anything they offer has to be "edge".

I haven't used ChatGPT or others, but Bard seems to answer within 1-2s in my experience. Your point remains, but are most LLMs really much slower than Bard?
GPT-4 is the slowest IMHO. I use Claude 2 for most of my non-coding needs; it's more creative and writes better IMHO. GPT-4 is better at tasks, technical work, and code.

Claude 2 is very fast too...

But they're offering more than just LLMs; there are image models too. Image generation sometimes takes 190 seconds or more on playgroundai.com, 40 seconds on leonardo.ai, and about the same on tensor.art.

I'm trying to get an AI Etsy store off the ground, and faster gen times would be greatly appreciated.