Hacker News
by jkoff 519 days ago
I did a double-take at the description since for a second I thought we were building the same tool. This is really cool, and seems like it'd greatly expand the set of podcasts I can listen to.

What I'm working on is different but similarly aimed at breaking through the intermediate plateau. I'm generating comprehensible input in podcast form, targeting the vocabulary needed for a specific learning goal (e.g. "I want to be able to watch show X without subtitles") and systematically repeating those words at specific intervals to improve retention.
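
A minimal sketch of the interval-based repetition described above, assuming each target word is introduced in one episode and then reused after progressively longer gaps. The gap values, `schedule_words`, and `words_per_episode` are hypothetical names and numbers chosen for illustration; the comment does not say which intervals or data structures the prototype actually uses.

```python
from collections import defaultdict

# Hypothetical gaps (in episodes) between repeats of a word.
# The original comment does not state the real intervals.
REPEAT_GAPS = (1, 2, 4, 8)


def schedule_words(words, words_per_episode=2, gaps=REPEAT_GAPS):
    """Return a dict mapping episode number -> words to use in that episode."""
    plan = defaultdict(list)
    for i, word in enumerate(words):
        # Episode in which the word is first introduced.
        episode = i // words_per_episode
        plan[episode].append(word)
        # Each repeat lands `gap` episodes after the previous appearance.
        for gap in gaps:
            episode += gap
            plan[episode].append(word)
    return dict(sorted(plan.items()))


if __name__ == "__main__":
    for episode, vocab in schedule_words(["a", "b", "c"], gaps=(1, 2)).items():
        print(episode, vocab)
```

Each episode's word list could then be passed to whatever generates the script, so that new and previously seen vocabulary appear together.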

It works well as a prototype. I've listened to it for ~16 hours so far and it does seem to help me with vocabulary acquisition.

I'm still gauging whether I should polish and release it as a product, and would love some feedback and/or sign-ups:

https://letmeknow.jkoff.ca/infinite-ci?utm_source=hn

1 comment

Wow this is super cool but how do you ensure the content is useful and correct?
To be clear, I haven't shared this with anyone because I'm not yet sure that the content is useful and correct.

As far as where I'm at:

- I've listened to it in my target language for N hours. To my ear, it sounds correct and I've learned some new words that I then heard used consistently in native media.

- Next, I'd like to set it to teach me a language that I already know, so that I can more reliably and easily spot errors. This will require some changes, since my target language is currently hardcoded.

- Longer-term, validation based on languages I speak can't generalize 100% to other languages, nor can validation of version N make assertions about version N + 1. Correctness would benefit from native speakers periodically checking results, and usefulness would benefit from user feedback (even if only in the form of engagement or lack thereof).

Which LLM gave you the best pronunciation results?
I first generate a script to be handed to a TTS model. For this step, Claude 3.5 Sonnet works well. For voice synthesis, I've been using Google Cloud's Text-to-Speech API and it's been adequate.
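
The two-step flow described above (an LLM writes the script, then a TTS model voices it) could be sketched roughly as follows. This is only an illustration: `build_prompt`, `make_episode`, and the `write_script` / `synthesize` callables are hypothetical names, and the real LLM and Text-to-Speech calls are deliberately left behind plain function parameters rather than guessed at.

```python
from typing import Callable, List


def build_prompt(target_words: List[str], language: str) -> str:
    """Build the instruction handed to the script-writing model (hypothetical wording)."""
    word_list = ", ".join(target_words)
    return (
        f"Write a short podcast script in {language} for intermediate learners. "
        f"Use each of these words at least once: {word_list}."
    )


def make_episode(
    target_words: List[str],
    language: str,
    write_script: Callable[[str], str],  # assumed wrapper around an LLM call
    synthesize: Callable[[str], bytes],  # assumed wrapper around a TTS call
) -> bytes:
    """Generate a script, check it uses the target words, then synthesize audio."""
    script = write_script(build_prompt(target_words, language))
    # Simple substring check; it does not account for inflected word forms.
    missing = [w for w in target_words if w.lower() not in script.lower()]
    if missing:
        raise ValueError(f"script is missing target words: {missing}")
    return synthesize(script)
```

Keeping the two stages behind separate callables means either one can be swapped (a different LLM, a different voice backend) without touching the other.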