Hacker News
by herzigma 115 days ago
This is great work. Most voice AI optimizes for latency - you made the opposite bet (quality over speed, frontier models over lightweight ones) and that's probably the right call.

The audio pipeline alone is impressive: on-device VAD, parallel TTS chunking, retry-from-failure mid-pipeline. That's not a weekend project; that's production-grade thinking.

Here's the thing that excites me most, though - the *cognitive layer* is wide open. The experience harness is solid, but right now every session starts cold. Persistent user memory, context that makes your 50th conversation meaningfully smarter than your first, light orchestration that turns a single question into a structured multi-step inquiry - that's likely where this goes next, and it's a compelling frontier.

Voice access to frontier reasoning is massively underserved. You've built the right foundation for it.

1 comment

Thanks. Adding memory is an interesting idea. I'd probably want to do it a little differently from how the main apps do it, to avoid overlap. I like the idea of being able to tag conversations and then use that tag to organize memory. E.g., you might want one set of conversations about fitness, another about travel, another about career, etc., with no overlap between them.
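
A rough sketch of what tag-scoped memory could look like, just to make the idea concrete. Everything here is hypothetical: `TaggedMemory`, `remember`, and `recall` are illustrative names, not part of the project, and a real version would need persistence and some way to summarize or retrieve relevant notes rather than returning all of them.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class TaggedMemory:
    """Hypothetical in-memory store: notes are kept and recalled per tag."""

    _notes: dict = field(default_factory=lambda: defaultdict(list))

    def remember(self, tag: str, note: str) -> None:
        # Each note belongs to exactly one tag, so topics never mix.
        self._notes[tag].append(note)

    def recall(self, tag: str) -> list:
        # .get avoids creating an empty entry for a tag that was never used.
        return list(self._notes.get(tag, []))

    def tags(self) -> list:
        return sorted(self._notes)


memory = TaggedMemory()
memory.remember("fitness", "Training for a 10k in the spring")
memory.remember("travel", "Prefers window seats")
memory.remember("fitness", "Left knee gets sore on hills")

# Only fitness notes would be fed into a fitness-tagged conversation.
fitness_context = memory.recall("fitness")
```

Keying everything by tag keeps the isolation simple: a conversation tagged "travel" would only ever see what `recall("travel")` returns.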