by vidarh
889 days ago
The 8s latency would be absolutely intolerable to me. When experimenting, even getting the speech recognition latency low enough not to be a nuisance is already a problem. I'd be inclined to put a bunch of simple grammar-based rules in front of the LLM to handle simple/obvious cases without passing them to the LLM at all, to at least reduce the number of cases where the latency is high...
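A rough sketch of what such a rule layer could look like, in Python. The patterns, the `match_rule` name and the `lights.turn_on`/`lights.turn_off` action names are made up for illustration and not taken from any particular assistant; anything that returns `None` would fall through to the LLM as before.

```python
import re

# Hypothetical rule table: each pattern captures an entity and an on/off state.
# Only trivially regular phrasings are covered; everything else goes to the LLM.
RULES = [
    # "turn my living room lights off"
    re.compile(r"^turn (?:my |the )?(?P<entity>.+?) lights? (?P<state>on|off)$"),
    # "turn off the kitchen light"
    re.compile(r"^turn (?P<state>on|off) (?:my |the )?(?P<entity>.+?) lights?$"),
]


def match_rule(utterance):
    """Return an action dict for an obvious command, or None if no rule applies."""
    text = utterance.lower().strip().rstrip(".!?")
    for pattern in RULES:
        m = pattern.match(text)
        if m:
            return {"action": f"lights.turn_{m['state']}", "entity": m["entity"]}
    return None  # caller should hand the utterance to the LLM


if __name__ == "__main__":
    print(match_rule("Turn my living room lights off"))
    print(match_rule("make it cozy in here"))
```

The point is only that a match costs microseconds, so the slow path is reserved for requests the rules can't parse.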
> user: turn my living room lights off
> llm: {action: "lights.turn_off", entity: "living room"}
Then search the available actions and entities using those parameters:
> user: available actions: [...], available entities: [...]. Which action and target?
> llm: {service: "light.turn_off", entity: "light.living_ceiling"}
I've never used a local LLM, so I don't know what the fixed startup latency is, but this would dramatically reduce the number of tokens required.
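To make the middle step concrete, here is a minimal sketch of the search between the two LLM calls, assuming a plain word-overlap match. The helper names (`tokens`, `shortlist`, `build_prompt`) and every identifier in the sample lists other than `light.turn_off` and `light.living_ceiling` are hypothetical; a real setup might use fuzzy matching or embeddings instead.

```python
import re


def tokens(name):
    """Split 'light.living_ceiling' or 'living room' into a set of lowercase words."""
    return set(re.split(r"[^a-z0-9]+", name.lower())) - {""}


def shortlist(query, candidates, limit=5):
    """Keep the candidates sharing at least one word with the query, best first."""
    q = tokens(query)
    scored = [(len(q & tokens(c)), c) for c in candidates]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: -s[0])
    return [c for _, c in ranked[:limit]]


def build_prompt(rough, actions, entities):
    """Build the second, much shorter prompt from the first-pass guess."""
    a = shortlist(rough["action"], actions)
    e = shortlist(rough["entity"], entities)
    return f"available actions: {a}, available entities: {e}. Which action and target?"


if __name__ == "__main__":
    rough = {"action": "lights.turn_off", "entity": "living room"}
    actions = ["light.turn_on", "light.turn_off", "switch.turn_off", "climate.set_temperature"]
    entities = ["light.living_ceiling", "light.kitchen", "switch.living_room_fan", "sensor.hall_temp"]
    print(build_prompt(rough, actions, entities))
```

Only the shortlisted names end up in the second prompt, which is where the token savings would come from.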