Nice, local is the right call. What's the local AI model — a small NER model bundled in, or calling out to something? Curious about the size/footprint for a desktop app.
It use openai/privacy-filter which is smaller than 1GB in size. I haven't checked usage during inference. It rans at 1k toks/sec on my macbook. I will update the repo with benchmark results. Thanks for the comment