
by flybird 253 days ago

I built GoodSMS as a side project: an on-device LLM assistant that drafts fast, context-aware replies to your text messages.

Most “AI messaging” tools today are cloud services or full chat apps. We wanted something closer to a digital-twin input method: a thin layer that sits on top of your existing SMS app and quietly helps you respond faster.

What GoodSMS does

• Runs an LLM locally (no cloud round-trip for drafting)
• Reads the incoming SMS text, thread context, and your previous writing style
• Generates 3–5 possible replies instantly
• Lets you accept/edit/paste into your messaging app
• Supports short messages, long-form replies, quick actions, confirmations, and scheduling
• Privacy-first architecture: no message content leaves your device unless you explicitly opt into cloud inference
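To make the privacy rule concrete, here is a minimal sketch of an opt-in gate. It is an illustration only, not the shipping GoodSMS code: `Settings`, `choose_backend`, and the 500-character `LONG_MESSAGE_CHARS` threshold are hypothetical names and values. The long-message trigger is assumed from the optional cloud escalation mentioned under Technical notes.

```python
from dataclasses import dataclass

# Hypothetical threshold: the post does not say what counts as a "long" message.
LONG_MESSAGE_CHARS = 500


@dataclass(frozen=True)
class Settings:
    # Off by default, so no message content leaves the device.
    cloud_opt_in: bool = False


def choose_backend(message: str, settings: Settings) -> str:
    """Return "local" or "cloud" as the place to draft a reply to `message`."""
    # Without explicit opt-in, always stay on-device, regardless of length.
    if not settings.cloud_opt_in:
        return "local"
    # With opt-in, escalate only messages above the assumed length threshold.
    if len(message) > LONG_MESSAGE_CHARS:
        return "cloud"
    return "local"
```

The point of the shape is that the opt-in check comes first: a long message alone can never trigger a cloud call.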

Why we built it

Most people lose time triaging simple messages: “ok,” “sure,” “on my way,” “what’s the address again,” etc. Others tend to miss messages or delay replies because the friction is too high.

GoodSMS tries to behave like a personal executive assistant for SMS, especially useful for:
• Busy professionals
• Parents coordinating logistics
• Service providers handling many similar conversations
• Anyone who wants a faster, cleaner messaging flow

Technical notes

• Android-first implementation using a custom input-method wrapper
• On-device LLM inference (quantized 3B–8B models)
• Optional cloud-compute escalation for long messages
• Conversation-thread reconstruction
• Lightweight ranking layer for human-like prioritization
• Zero dependency on carrier APIs
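For readers wondering what thread reconstruction and a lightweight ranking layer might involve, here is a rough sketch under stated assumptions. It is written in Python for brevity rather than as Android code, and it is not the actual GoodSMS implementation. The `Sms` record, the last-10-digits number normalization, and ranking drafts by closeness to the user's typical reply length are all stand-ins chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Sms:
    address: str    # phone number as stored by the SMS app (hypothetical field)
    timestamp: int  # milliseconds since epoch
    outgoing: bool  # True if sent by the device owner
    body: str


def normalize(address: str) -> str:
    """Crude normalization (assumption): keep digits, compare on the last 10."""
    digits = "".join(ch for ch in address if ch.isdigit())
    return digits[-10:]


def reconstruct_threads(messages):
    """Group a flat message list by counterpart and order each thread by time."""
    threads = defaultdict(list)
    for msg in messages:
        threads[normalize(msg.address)].append(msg)
    for thread in threads.values():
        thread.sort(key=lambda m: m.timestamp)
    return dict(threads)


def rank_candidates(candidates, thread, keep=3):
    """Order drafts by closeness to the user's average sent-message length."""
    sent = [len(m.body) for m in thread if m.outgoing]
    if not sent:
        # No writing history in this thread: keep the generator's own order.
        return candidates[:keep]
    typical = sum(sent) / len(sent)
    return sorted(candidates, key=lambda c: abs(len(c) - typical))[:keep]


messages = [
    Sms("+1 (555) 010-2000", 2000, True, "On my way"),
    Sms("555-010-2000", 1000, False, "Where are you?"),
    Sms("5550103000", 1500, False, "Lunch?"),
]
threads = reconstruct_threads(messages)
drafts = ["Sure", "Be there in 5", "I will be there in about five minutes, sorry"]
top = rank_candidates(drafts, threads["5550102000"], keep=2)
```

A real ranking layer would presumably weigh much more than length (tone, recency, the contact), but even a cheap heuristic like this runs in microseconds and so adds nothing noticeable to drafting latency.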

Open questions / looking for feedback

I would really appreciate feedback from this community on:

• How to improve the on-device inference/latency tradeoff
• Whether there is value in adding a plug-in layer (e.g., automate routine replies, reminders, follow-ups)
• Ideas for a secure way to integrate with RCS and third-party messaging
• Whether a more advanced “agentic” mode would be useful or too risky

I just launched the first public version today. Happy to answer all technical questions, share architectural details, or discuss edge cases.