Show HN: We put voice agent on our website, learned retrieval isn't bottleneck

	Show HN: We put voice agent on our website, learned retrieval isn't bottleneck (moss.dev)
	24 points by srimalireddi 11 days ago

4 comments

anantm 11 days ago

This is super interesting. If it is fast enough, I would like to try it on my product site.

srimalireddi 11 days ago

Yes, Founding Agent is powered by Moss's sub-10 ms retrieval under the hood. Typical retrieval systems take anywhere from 200 to 500 ms per turn, which kills the experience of a live conversation. With Moss, we are able to make Founding Agent converse like a human.

philosopherr 11 days ago

Looks promising. Curious how it scales as the amount of content grows.

srimalireddi 11 days ago

That's the right question, and it is what blocks many people from productionizing this kind of solution on their website. If we break down the anatomy of the voice agent, it looks like this:

STT -> Ambient Retrieval(Moss) -> LLM [+ Tool calls -> On-Demand Retrieval(Moss)] -> TTS

STT, TTS, and LLM output generation are fixed costs, independent of data scale, so retrieval is the only stage whose latency can grow with the amount of content. In practice, landing-page and public-facing website content ranges from hundreds of docs (for startups) to hundreds of thousands of docs (for enterprises).
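
To make the fixed-versus-variable split concrete, here is a rough sketch of a per-turn latency budget. The STT, LLM, and TTS numbers are made-up placeholders, not measurements, and the function names are hypothetical. Only the retrieval figures (about 10 ms versus 200-500 ms per call) come from this thread, and the two retrieval calls per turn assume one ambient and one on-demand lookup.

```python
# Hypothetical per-stage latencies in milliseconds.
# These are illustrative placeholders, NOT measured values.
FIXED_STAGES_MS = {"stt": 150.0, "llm": 400.0, "tts": 120.0}


def turn_latency_ms(retrieval_ms: float, retrieval_calls: int = 1) -> float:
    """Latency of one conversational turn: fixed stages plus all retrieval calls."""
    if retrieval_ms < 0 or retrieval_calls < 0:
        raise ValueError("latency and call count must be non-negative")
    return sum(FIXED_STAGES_MS.values()) + retrieval_ms * retrieval_calls


def retrieval_share(retrieval_ms: float, retrieval_calls: int = 1) -> float:
    """Fraction of the turn latency that is spent in retrieval."""
    total = turn_latency_ms(retrieval_ms, retrieval_calls)
    return (retrieval_ms * retrieval_calls) / total


if __name__ == "__main__":
    # Assume two retrieval calls per turn: one ambient, one on-demand.
    for label, ms in [("sub-10 ms", 10.0), ("200 ms", 200.0), ("500 ms", 500.0)]:
        total = turn_latency_ms(ms, retrieval_calls=2)
        share = retrieval_share(ms, retrieval_calls=2)
        print(f"{label:>10}: {total:7.1f} ms per turn, retrieval share {share:.0%}")
```

Under these placeholder numbers the fixed stages stay constant, and the only term that moves with the retrieval system is `retrieval_ms * retrieval_calls`.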

Moss's retrieval stack runs in under 10 ms. Our internal benchmarks:

- P99 of ~5.4 ms for 100K docs in a shared container

- P99 of ~4 ms for 1M docs in a dedicated VM
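
For readers who want to reproduce a P99 figure on their own setup, here is a minimal sketch using the nearest-rank method. How the benchmarks above were actually computed is not stated in this thread, so the method, the function name, and the sample values are all assumptions.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    # Multiply before dividing to keep the rank exact for whole-number inputs.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


if __name__ == "__main__":
    # Made-up latency samples in milliseconds, for illustration only.
    latencies_ms = [3.1, 2.9, 3.4, 3.0, 5.2, 3.3, 2.8, 3.6, 4.1, 3.2]
    print("P50:", percentile(latencies_ms, 50))
    print("P99:", percentile(latencies_ms, 99))
```

With a small sample set like this one, the P99 is simply the slowest request, which is why tail-latency numbers are only meaningful over many queries.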

Our R&D team is pushing this to 200M+ docs while keeping the sub-10 ms promise, and we do not see that as the ceiling for our scale.

kowshikchills 11 days ago

Absolute need! Does it work well with a very dense website?

srimalireddi 11 days ago

Yes, it does! It all boils down to retrieval speed and quality. Moss is built primarily for this purpose, and it now powers Founding Agent.

vaishak2future 11 days ago

Amazing

srimalireddi 11 days ago

Thank you!
