Have you heard of rewind.ai? It sounds like it might be a solution for what you're looking for (I'm not affiliated with it, and I don't have it on my Mac, so I'm not sure how well it works in practice).
Yeah, though it's not local. They claim it is, but then use ChatGPT, which is odd.
Personally, I want to build a fairly dumb system though, i.e. one that can be useful with Llama 2 13B or whatever. Something that doesn't require a state-of-the-art model like GPT-4+.
If that means compromising on some features, that's fine, but at least then it can be truly and fully local.