by tasty_freeze 366 days ago
That sounds like a great idea for a sleep aid: have an AI narrate random wikipedia pages. Maybe it could even allow you to specify topics you have no interest in so it doesn't accidentally pick a topic that might grab your interest.
2 comments

No need for an AI. Text-to-speech (TTS) is more than good enough and much easier on CPU/GPU and the environment.
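For what it's worth, here is a rough sketch of the no-AI version: draw random Wikipedia summaries, keep only the ones that match a topic you have flagged as boring, and hand the text to a plain local TTS engine. The function names, the keyword matching, and the speaking rate are all my own assumptions for illustration. It uses Wikipedia's random-summary REST endpoint and the third-party pyttsx3 package for speech, which is just one possible engine.

```python
import json
import urllib.request

# Wikipedia REST endpoint that returns the summary of a random article.
RANDOM_SUMMARY_URL = "https://en.wikipedia.org/api/rest_v1/page/random/summary"


def matches_boring_topic(title, extract, boring_topics):
    """Return True if any of the listener's 'boring' keywords appears in the page."""
    text = f"{title} {extract}".lower()
    return any(topic.lower() in text for topic in boring_topics)


def fetch_random_summary():
    """Fetch one random article summary as a dict with 'title' and 'extract' keys."""
    request = urllib.request.Request(
        RANDOM_SUMMARY_URL, headers={"User-Agent": "sleep-narrator-sketch/0.1"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def pick_boring_page(boring_topics, fetch=fetch_random_summary, max_tries=50):
    """Keep drawing random pages until one matches a boring topic, or give up."""
    for _ in range(max_tries):
        page = fetch()
        title = page.get("title", "")
        extract = page.get("extract", "")
        if matches_boring_topic(title, extract, boring_topics):
            return title, extract
    return None


def narrate(text):
    """Speak text with the local TTS engine (requires `pip install pyttsx3`)."""
    import pyttsx3  # imported lazily so the rest of the sketch runs without it

    engine = pyttsx3.init()
    engine.setProperty("rate", 130)  # assumed value: slower pace for falling asleep
    engine.say(text)
    engine.runAndWait()
```

Something like `page = pick_boring_page(["municipality", "tax"])` followed by `narrate(page[1])` when `page` is not `None` would read one summary aloud. Plain substring matching is crude, so a real version would probably want Wikipedia categories instead of keywords.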
NotebookLM's audio mode doesn't just read out the given text; it creates a podcast format with two hosts, where one asks questions and the other answers, going back and forth in a discussion style.
Using an "AI" (LLM)-enhanced TTS adds tone and other markers that make the underlying TTS sound much more natural. You can then double down with an ML-tuned TTS to get an even more natural voice.
What's an example of that? Anything I can run locally?
A paid product, but https://elevenlabs.io/ does it pretty well. There is some work on open source versions you can run locally, and they work reasonably well, but I haven't kept up with the FOSS field in several months, so I'm unsure which is currently best.
There are some really good open source TTS models out there now. Dia 1.6B and OpenAudio S1 are good options, and you can always check which models are trending on Hugging Face: https://huggingface.co/models?pipeline_tag=text-to-speech
Would you pay for it?
Yes