Hacker News
by klakierr 1036 days ago
An app for language learning by reading books in the target language. It takes a book (any EPUB) in the target language and inserts translations inline:

—Arthur —dijo con tono cortante, ["Arthur," he said sharply.] y su voz sonó como el chasquido de una ratonera—, [and his voice sounded like the click of a mousetrap.]
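A minimal sketch of that interleaving step, not the app's actual code. It assumes a naive rule (split after punctuation followed by whitespace) and a hypothetical `translate` callable that stands in for whichever translation backend is used:

```python
import re
from typing import Callable

# Naive, assumed segmentation: break on whitespace that follows
# sentence-ending or clause-ending punctuation.
_SEGMENT_BOUNDARY = re.compile(r"(?<=[.!?,;:])\s+")


def interleave(text: str, translate: Callable[[str], str]) -> str:
    """Return text with each segment followed by its translation in brackets.

    `translate` is a hypothetical stand-in for a real translation call.
    """
    segments = [s for s in _SEGMENT_BOUNDARY.split(text.strip()) if s]
    return " ".join(f"{seg} [{translate(seg)}]" for seg in segments)


if __name__ == "__main__":
    # Stub translator for demonstration only: upper-cases the segment.
    print(interleave("Hola, mundo. Fin.", lambda s: s.upper()))
```

A real version would need smarter segmentation (dialogue dashes, abbreviations) and would read the text out of the EPUB chapters first.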

All of the existing apps that I checked only display translation upon clicking a word (like in Kindle), but that 1) doesn't work well when you're only starting to learn; 2) takes too much from the reading experience, and becomes a chore. Also, most of them only allow a limited selection of books.

I've been using it myself for over a month now and enjoy reading with it a lot. I feel almost no friction using it; it's the usual pleasure of reading a book with the added exploration of a new language.

The problems:

Translating can be hard and/or expensive. Cheaper or free translations are often incorrect and struggle with idioms. DeepL would cost > 20€ for a long book. ChatGPT hallucinates and adds to the original. I'm using Google Translate for now and it's good enough for me, but I don't feel it's good enough to charge for. It often mixes up genders (as opposed to ChatGPT, which can deduce them from context, I guess) and occasionally mistranslates.
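For a rough sense of how per-character pricing adds up, here is a small sketch. The rate and the character count in the example are assumptions for illustration, not quoted prices; check the provider's current pricing before relying on them:

```python
def estimate_cost(char_count: int, price_per_million: float) -> float:
    """Estimate translation cost for a text billed per million characters.

    Both inputs are assumptions supplied by the caller; this does not
    query any provider's real pricing.
    """
    if char_count < 0 or price_per_million < 0:
        raise ValueError("char_count and price_per_million must be non-negative")
    return round(char_count / 1_000_000 * price_per_million, 2)


if __name__ == "__main__":
    # Hypothetical long book of 1.2M characters at an assumed 20 per million.
    print(estimate_cost(1_200_000, 20.0))
```

With those assumed numbers, a 1.2M-character book comes out to 24.0 in whatever currency the rate is quoted in.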

Would you want to use such an app even with the often erroneous automated translations?

6 comments

This sounds very similar to the books manually curated using Ilya Frank's Reading Method: http://english.franklang.ru/index.php/ilya-frank-s-reading-m...
That sounds cool. I had a similar idea a while ago, but as a browser extension. I only pulled keywords out of news articles with spaCy and then ran them through DeepL; I was trying to make automated flashcards to help you learn to read articles in your target language.

Have you tried Llama 2? Running that yourself might be cheaper. You could also maybe crowdsource translation fixes eventually.

This is very cool, looking forward to it! I've been doing the same thing with Spanish Wikipedia articles for a while, using a few lines of Bash + regex. I was using Apertium for it. https://apertium.org/ It's definitely worse than most ML-based solutions, but it is reliable, deterministic, and fast, and you can run it entirely offline. With Spanish translations, the main problem I was facing was lack of vocabulary, so I created https://github.com/phil294/apertium-eng-spa-wiktionary which about doubles the number of recognized words, albeit with wonky grammar.
€20 is not expensive for translating a book. Google Translate is of such low quality that it is not suitable for any real-world use like yours.

Can you let the user herself pay the translation cost when loading her favorite book into your software?

Yeah, I intend to offer different translation options and be transparent about pricing.
I'm interested in something like this. I would be even more interested in a curated list of books by language level. Maybe you can offer a couple of books per level and cache those translations, minimizing costs and giving users an easy way to get started reading in a new/intermediate language.
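The caching idea could look roughly like this: a minimal in-memory sketch, assuming a hypothetical `translate(segment, target_lang)` callable for the paid backend. A real app would persist the store (e.g. on disk or in a database) so curated books are translated only once:

```python
import hashlib
from typing import Callable, Dict


class TranslationCache:
    """Memoize translations so each (language, segment) pair is paid for once."""

    def __init__(self, translate: Callable[[str, str], str]) -> None:
        # `translate` is a hypothetical stand-in for a real translation call.
        self._translate = translate
        self._store: Dict[str, str] = {}
        self.misses = 0  # number of times the backend was actually called

    @staticmethod
    def _key(segment: str, target_lang: str) -> str:
        # Hash language + text; the NUL separator avoids ambiguous joins.
        raw = f"{target_lang}\x00{segment}".encode("utf-8")
        return hashlib.sha256(raw).hexdigest()

    def get(self, segment: str, target_lang: str) -> str:
        key = self._key(segment, target_lang)
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._translate(segment, target_lang)
        return self._store[key]


if __name__ == "__main__":
    cache = TranslationCache(lambda seg, lang: f"<{lang}:{seg}>")
    cache.get("Hola, mundo.", "en")
    cache.get("Hola, mundo.", "en")  # served from the cache
    print(cache.misses)
```

Keying on a hash of the text rather than on book and position means identical sentences shared across books are also translated only once.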
If it works well I would use it (and pay for it)