Hacker News
by mbil 360 days ago
From the title I expected this would be a talk-radio-style (NotebookLM-style) discussion of random wiki pages.

That sounds like a great idea for a sleep aid: have an AI narrate random wikipedia pages. Maybe it could even allow you to specify topics you have no interest in so it doesn't accidentally pick a topic that might grab your interest.
No need for an AI. Text-to-speech (TTS) is more than good enough and much easier on CPU/GPU and the environment.
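For what it's worth, the plain-TTS version of this idea can be sketched in a few lines. This is only a rough sketch under some assumptions: it pulls from Wikipedia's REST endpoint for random page summaries, it uses crude keyword matching against a user-supplied list of dull topics (my own stand-in for "topics you have no interest in"), and it assumes the third-party `pyttsx3` package is installed for offline speech. The function names are made up for illustration.

```python
import json
import urllib.request

# Wikipedia REST endpoint that redirects to the summary of a random article.
RANDOM_SUMMARY_URL = "https://en.wikipedia.org/api/rest_v1/page/random/summary"


def fetch_random_summary(url=RANDOM_SUMMARY_URL):
    """Fetch one random article summary as a dict (title, description, extract, ...)."""
    req = urllib.request.Request(url, headers={"User-Agent": "sleep-reader-sketch/0.1"})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def matches_dull_topic(summary, dull_topics):
    """Assumption: a case-insensitive keyword hit is a good enough topic test."""
    fields = (summary.get(key) or "" for key in ("title", "description", "extract"))
    text = " ".join(fields).lower()
    return any(topic.lower() in text for topic in dull_topics)


def pick_article(fetch, dull_topics, max_tries=10):
    """Call fetch() until an article matches a dull topic; None if none does."""
    for _ in range(max_tries):
        summary = fetch()
        if matches_dull_topic(summary, dull_topics):
            return summary
    return None


def narrate(text):
    """Read text aloud with an offline TTS engine (assumes pyttsx3 is installed)."""
    import pyttsx3  # imported lazily so the rest of the sketch runs without it

    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()


if __name__ == "__main__":
    article = pick_article(fetch_random_summary, ["tax", "municipality"], max_tries=20)
    if article is not None:
        narrate(f"{article['title']}. {article.get('extract', '')}")
```

Random pages rarely hit a narrow keyword list, so a real version would probably want a better topic filter (categories, for instance) rather than just raising `max_tries`.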
NotebookLM's audio mode doesn't just read out the given text, it creates a podcast format with 2 hosts where one will ask questions and the other will answer, and go back and forth in a discussion style.
Using an "AI" (LLM)-enhanced TTS adds tone and other markers that let the underlying TTS sound much more natural. You can then double down with an ML-tuned TTS to get a more natural voice.
What's an example of that? Anything I can run locally?
It's a paid product, but https://elevenlabs.io/ does it pretty well. There is some work on open source versions you can run locally, and they work reasonably well, but I haven't kept up with the FOSS field in several months, so I'm unsure which is currently best.
There are some really good open source TTS models out there now. Dia 1.6B or OpenAudio S1 are good options, and you can always check whichever models are trending on Hugging Face: https://huggingface.co/models?pipeline_tag=text-to-speech
Would you pay for it?
Yes
That would be a fascinating next iteration - combining these random audio clips with LLM-generated summaries or discussions of the Wikipedia articles they're sourced from.