by mafia15 106 days ago
I got tired of my noisy inbox, so I built a classifier instead of paying for yet another AI email app. The pipeline: PII redaction first, then embeddings matched against past labeled emails via vector search, Bayesian sender reputation on top of that, a cross-encoder for reranking, and an LLM fallback only when confidence is low. The fine-tuned model handles ~80% of cases before they ever reach the LLM. Biggest surprise: how much sender reputation scoring ended up mattering. Who sends the email is often a stronger signal than what's in it. It's hooked into Gmail and Outlook. I've been using it daily for a few months and my triage time is basically zero now. Happy to talk about any part of the architecture if anyone's curious.
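
For anyone wondering what "Bayesian sender reputation" plus a low-confidence LLM fallback could look like in practice, here is a minimal sketch. It is not the actual implementation: the post does not say how the signals are combined, so the Beta(1, 1) prior, the linear blend, the 0.8 threshold, the labels "important"/"other", and every name below (`SenderReputation`, `classify`, `llm_fallback`, `sender_weight`) are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple


@dataclass
class SenderReputation:
    """Beta-Bernoulli estimate of P(important | sender). Hypothetical design."""

    # Assumed prior pseudo-counts: Beta(1, 1), i.e. no opinion about new senders.
    prior_important: float = 1.0
    prior_other: float = 1.0
    counts: Dict[str, Tuple[int, int]] = field(default_factory=dict)

    def update(self, sender: str, important: bool) -> None:
        """Record one labeled email from this sender."""
        pos, neg = self.counts.get(sender, (0, 0))
        self.counts[sender] = (pos + 1, neg) if important else (pos, neg + 1)

    def score(self, sender: str) -> float:
        """Posterior mean of the Beta distribution for this sender."""
        pos, neg = self.counts.get(sender, (0, 0))
        a = self.prior_important + pos
        b = self.prior_other + neg
        return a / (a + b)


def classify(
    sender: str,
    content_score: float,  # assumed: P(important) from the fast content model
    reputation: SenderReputation,
    llm_fallback: Callable[[], str],  # assumed: called only when unsure
    threshold: float = 0.8,  # assumed confidence cutoff
    sender_weight: float = 0.5,  # assumed blend weight
) -> Tuple[str, str]:
    """Return (label, route). The linear blend is an illustrative choice."""
    combined = (
        sender_weight * reputation.score(sender)
        + (1.0 - sender_weight) * content_score
    )
    # Confidence is the distance from the 0.5 decision boundary, rescaled.
    confidence = max(combined, 1.0 - combined)
    if confidence >= threshold:
        label = "important" if combined >= 0.5 else "other"
        return label, "fast_path"
    return llm_fallback(), "llm_fallback"


if __name__ == "__main__":
    rep = SenderReputation()
    for _ in range(8):
        rep.update("boss@example.com", important=True)
    print(classify("boss@example.com", 0.9, rep, lambda: "other"))
    print(classify("unknown@example.com", 0.6, rep, lambda: "other"))
```

The prior is what keeps a sender with one or two labeled emails from swinging to an extreme score: an unseen sender sits at 0.5 and only moves as evidence accumulates. In this sketch a known sender with a confident content score stays on the fast path, while an unknown sender with a lukewarm score drops through to the fallback.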