As a side project, I built GoodSMS, an on-device LLM SMS assistant that drafts fast, context-aware replies to your text messages. Most “AI messaging” tools today operate as cloud services or full chat apps. We wanted something closer to a digital-twin input method: a thin layer that sits on top of your existing SMS app and quietly helps you respond faster.

What GoodSMS does

• Runs an LLM locally (no cloud round-trip for drafting)
• Reads the incoming SMS text, thread context, and your previous writing style
• Generates 3–5 possible replies instantly
• Lets you accept/edit/paste into your messaging app
• Supports short messages, long-form replies, quick actions, confirmations, and scheduling
• Privacy-first architecture: no message content leaves your device unless you explicitly opt into cloud inference

Why we built it

Most people lose time triaging simple messages: “ok,” “sure,” “on my way,” “what’s the address again,” etc.
Others tend to miss messages or delay replies because the friction is too high. GoodSMS tries to behave like a personal executive assistant for SMS, especially useful for:
• Busy professionals
• Parents coordinating logistics
• Service providers handling many similar conversations
• Anyone who wants a faster, cleaner messaging flow

Technical notes

• Android-first implementation using a custom input-method wrapper
• On-device LLM inference (quantized 3B–8B models)
• Optional cloud-compute escalation for long messages
• Conversation-thread reconstruction
• Lightweight ranking layer for human-like prioritization
• Zero dependency on carrier APIs

Open Questions / Looking for Feedback

I would really appreciate feedback from this community on:
• How to improve the on-device inference/latency tradeoff
• Whether there is value in adding a plug-in layer (e.g., automate routine replies, reminders, follow-ups)
• Ideas for a secure way to integrate with RCS and third-party messaging
• Whether a more advanced “agentic” mode would be useful or too risky

I just launched the first public version today.
Happy to answer all technical questions, share architectural details, or discuss edge cases.
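
For anyone who wants something concrete to react to, the "optional cloud escalation" and "lightweight ranking layer" items in the technical notes could be sketched roughly like this. It is an illustration under stated assumptions, not the actual GoodSMS code: the names `choose_backend`, `style_overlap`, and `rank`, the 600-character cutoff, and the word-overlap style score are all hypothetical stand-ins. The one property taken from the description above is that nothing is routed to the cloud unless the user has opted in.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical cutoff: messages longer than this are candidates for cloud escalation.
MAX_LOCAL_CHARS = 600


@dataclass
class Candidate:
    text: str
    model_score: float  # assumed: a score from the local model, higher is better


def choose_backend(message: str, cloud_opt_in: bool) -> str:
    """Return "cloud" only for long messages and only when the user opted in."""
    if cloud_opt_in and len(message) > MAX_LOCAL_CHARS:
        return "cloud"
    return "local"


def style_overlap(candidate: str, past_replies: List[str]) -> float:
    """Fraction of the candidate's words that appear in the user's past replies."""
    words = candidate.lower().split()
    if not words:
        return 0.0
    vocab = {w for reply in past_replies for w in reply.lower().split()}
    return sum(w in vocab for w in words) / len(words)


def rank(candidates: List[Candidate], past_replies: List[str],
         k: int = 3, style_weight: float = 0.5) -> List[str]:
    """Order drafts by model score plus a weighted style-match bonus; keep the top k."""
    ordered = sorted(
        candidates,
        key=lambda c: c.model_score + style_weight * style_overlap(c.text, past_replies),
        reverse=True,
    )
    return [c.text for c in ordered[:k]]


if __name__ == "__main__":
    drafts = [
        Candidate("On my way", 0.5),
        Candidate("I shall arrive shortly", 0.6),
        Candidate("omw", 0.4),
    ]
    history = ["on my way now", "omw", "sure"]
    print(choose_backend("where are you?", cloud_opt_in=False))
    print(rank(drafts, history, k=2))
```

In this sketch the style bonus lets a draft that matches the user's usual wording outrank one the model scored slightly higher, which is one simple way to read "human-like prioritization"; a real ranking layer would likely use richer signals than word overlap.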