by connicpu 605 days ago
I've been thinking about trying the OpenAI integration for Home Assistant[1], because controlling things in my home is primarily what I use my assistant shortcut for. The normal assistant works well enough but can be frustrating if you don't remember the exact phrasing it expects for a given command.

[1]: https://www.home-assistant.io/integrations/openai_conversati...

2 comments

I have it set up with ollama. It’s… interesting. HA commands are provided to the model as tools, so it only works as well as the model can work out when and how to use them. From experimenting with that and with my own tool use code, I’ve found that models vary greatly in their ability to wield tools, and none that I’ve tried are exceptional.
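
For anyone wondering what "commands provided as tools" looks like in practice, here is a rough sketch of the general pattern. Everything in it is made up for illustration (the `turn_on`/`turn_off` tool names, the `entity_id` argument, the shape of the tool call JSON); it is not Home Assistant's or ollama's actual schema.

```python
import json

# Hypothetical tool descriptions in the JSON-schema style many chat APIs accept.
# These would be sent to the model alongside the conversation.
TOOLS = [
    {
        "name": name,
        "description": f"{verb} a device by entity id",
        "parameters": {
            "type": "object",
            "properties": {"entity_id": {"type": "string"}},
            "required": ["entity_id"],
        },
    }
    for name, verb in (("turn_on", "Turn on"), ("turn_off", "Turn off"))
]

# Stand-in for real devices (assumption: two fake entities).
DEVICE_STATE = {"light.kitchen": "off", "switch.fan": "on"}


def _set_state(entity_id, state):
    if entity_id not in DEVICE_STATE:
        raise KeyError(entity_id)
    DEVICE_STATE[entity_id] = state
    return f"{entity_id} is now {state}"


def turn_on(entity_id):
    return _set_state(entity_id, "on")


def turn_off(entity_id):
    return _set_state(entity_id, "off")


HANDLERS = {"turn_on": turn_on, "turn_off": turn_off}


def dispatch(tool_call_json):
    """Run one tool call emitted by the model and return a result string."""
    try:
        call = json.loads(tool_call_json)
    except json.JSONDecodeError:
        return "error: tool call was not valid JSON"
    if not isinstance(call, dict):
        return "error: tool call was not a JSON object"
    handler = HANDLERS.get(call.get("name"))
    if handler is None:
        return f"error: unknown tool {call.get('name')!r}"
    try:
        return handler(**call.get("arguments", {}))
    except (TypeError, KeyError) as exc:
        # Wrong or missing arguments, or an entity that does not exist.
        return f"error: bad arguments ({exc})"
```

Most of the code is error handling, which matches the experience: the hard part is not running the tool, it is the model producing a well-formed call with the right name and arguments in the first place.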

It’s neat that you can intermix general chatting with HA commands, but you’re probably going to find that the old Assist is more reliable for commands. What I do like is that you can use a template as your system prompt, so you can provide the state of a number of entities and then ask about them in natural language. That works well.
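
The system prompt trick is simple enough to sketch. In Home Assistant itself the prompt is a Jinja template; the plain Python below is only a stand-in to show the idea, and the entity ids and wording are invented for the example.

```python
# Hypothetical snapshot of entity states; a real setup would read these
# from Home Assistant rather than hard-coding them.
ENTITY_STATES = {
    "cover.garage_door": "closed",
    "sensor.living_room_temperature": "21.5",
    "light.kitchen": "off",
}


def build_system_prompt(states):
    """Render entity states into a system prompt the model can answer from."""
    lines = [
        "You are a voice assistant for a smart home.",
        "Current state of the house:",
    ]
    # Sort so the prompt is stable from one render to the next.
    for entity_id, state in sorted(states.items()):
        lines.append(f"- {entity_id}: {state}")
    lines.append("Answer questions about the house using only the states listed above.")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_system_prompt(ENTITY_STATES))
```

Since the states are right there in the prompt, a question like "is the garage closed?" needs no tool call at all, which is probably why this part works better than the commands.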

I have an Alexa/Echo voice announcement system set up and recently tied it into Assist, so I can do automations like: when the garage opens, I send the prompt “what is the state of the garage?” and announce the result. It makes things feel more human than the same canned announcements every time.
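
Roughly, the flow is the following. This is a sketch with hypothetical names throughout: `ask_assistant` and `announce` stand in for whatever actually talks to the model and to the speakers, and `cover.garage_door` is an invented entity id.

```python
def on_state_change(entity_id, new_state, ask_assistant, announce):
    """When the garage opens, ask the assistant about it and announce the reply.

    ask_assistant: callable taking a prompt string and returning the reply text.
    announce: callable taking the text to speak.
    Returns the announced text, or None if nothing was announced.
    """
    if entity_id == "cover.garage_door" and new_state == "open":
        reply = ask_assistant("What is the state of the garage?")
        announce(reply)
        return reply
    return None


if __name__ == "__main__":
    # Fake assistant and fake speaker, just to show the wiring.
    on_state_change(
        "cover.garage_door",
        "open",
        ask_assistant=lambda prompt: "The garage door is open.",
        announce=print,
    )
```

Because the announcement text comes from the model rather than a fixed string, the wording can vary from one time to the next.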

Do try it. I've been running it ever since it got integrated into the core, mostly to control A/C units around our flat, and it's the best voice assistant experience I've had to date.

I mean honestly, how is it possible that Amazon, Apple, Google and Microsoft[0] have all kept screwing this up for over a decade now? I literally spent 15 minutes hooking up GPT-4 to the Home Assistant integration, and I was able to semi-reliably[1] control actual devices[2] like air conditioners and smart lights, in a completely natural and ad-hoc way, by talking to my smartwatch on the go, or to a phone, whichever was more convenient at the moment.

It's a really magical experience, a step closer to Star Trek reality. And what makes it possible is not just LLMs being able to deal with natural language, but more importantly, the "bring your own API key" model, which lets you cut away all the bullshit that FAANG assistants are stuck in.

--

[0] - Ever since they dropped MS Speech API in Windows, and did the Cortana thing. Some 15 years have passed at this point, and I'd still rather work with the Speech API than touch any of the FAANGs' voice assistants - it worked, and worked off-line!

[1] - Works ~90% of the time; some 5% of the time the voice model (from Home Assistant Cloud) misunderstands me, and 5% of the time the LLM gets confused. It's still worth it, because I can actually talk to it like I would to a person, without thinking of style or grammar or magic keywords.

[2] - Which, given how deeply the Home Assistant companion app integrates with the phone, can easily be turned into the equivalent of an on-phone voice assistant that can do more than the one I got from Google. Critically, there are ways to couple the Home Assistant app with Tasker, so it's not hard to make it do arbitrary things on your phone. And if you don't like the low-ish code Tasker experience, you can trivially shell out from Tasker to Termux, at which point the sky is the limit. Point being, an enthusiastic non-developer with minimal tech aptitude can beat Google and Apple at the voice assistant game today.