by seanmcdirmid
780 days ago
I think on-device models would be really useful. Imagine a conversational interface with much lower latency, so that conversations feel real. I wonder how much computing power we will need before we get there (e.g. running an LLM with lots of prompt data plus on-device speech recognition). Maybe 5-10 years?
With current models, the latency comes from processing, not from the network. Moving from a high-power remote server to a low-power local phone is likely to add more processing latency than it saves in network latency.
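
To make that trade-off concrete, here is a rough back-of-the-envelope sketch. It assumes reply latency is just one network round trip plus token generation time; the function name and every number in it are hypothetical and chosen for illustration, not measurements of any real server or phone.

```python
def response_latency_ms(network_rtt_ms, tokens, tokens_per_second):
    """Time to produce a full reply: one network round trip plus generation time."""
    generation_ms = tokens / tokens_per_second * 1000.0
    return network_rtt_ms + generation_ms


# Illustrative numbers only (assumptions, not benchmarks).
# Remote server: pays a network round trip but generates quickly.
remote = response_latency_ms(network_rtt_ms=100.0, tokens=50, tokens_per_second=50.0)

# Local phone: no network hop but generates far more slowly.
local = response_latency_ms(network_rtt_ms=0.0, tokens=50, tokens_per_second=5.0)

print(f"remote: {remote:.0f} ms, local: {local:.0f} ms")
```

Under these made-up figures, dropping the 100 ms round trip saves far less than the slower generation costs. The local case only wins once the phone's generation speed gets close to the server's.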