by StopTheWorld 998 days ago
I just listened to the beginning of one of these books.

The book text starts out "I." as in, roman numeral for chapter 1. The audio book read this as "eye" (or "I", but not 1).

Then the third sentence in the text says "1793-05". The audio was "1793 to 2005". Actually it was 1793 to 1805.

I listened to a few more sentences and got a sense of where the tech currently stands. It has advanced in the past few years but still has a way to go. As for the striking actors, they probably should be worried about what the big companies are up to.

3 comments

Eye sea a lot of otto-generated subtitles with homonym miss takes.
> Then the third sentence in the text says "1793-05". The audio was "1793 to 2005". Actually it was 1793 to 1805.

Bit of a tangent here but did you get 1805 from context or is YYYY-YY a common year range notation that I'm not aware of? Because I would never guess 1793-05 meant 1793-1805 either.

1898-02 or 1998-02 would be the years someone attended high school or college. The context of when it was written, whether 1905 or 2023, can be important. It seems to be a pretty common notation when a range spans centuries, particularly within +/- 10 years of the turn. I grew up around the turn of the century, and it seemed quite common then.
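For what it's worth, the convention described above (a two-digit end year that rolls into the next century when it would otherwise fall before the start year) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `expand_year_range` is a hypothetical helper, not part of any TTS system, and it assumes the string really is a year range rather than a YYYY-MM date, which is exactly the ambiguity that needs context to resolve.

```python
import re

# Hypothetical pattern: a four-digit year, a hyphen, and a two-digit year,
# not embedded in a longer run of digits or hyphens (so "1793-05-12" is skipped).
YEAR_RANGE = re.compile(r"(?<![\d-])(\d{4})-(\d{2})(?![\d-])")

def expand_year_range(text: str) -> str:
    """Rewrite abbreviated ranges like '1793-05' as '1793 to 1805'.

    Assumption: the match is a year range, not a YYYY-MM date.
    """
    def repl(match: re.Match) -> str:
        start = int(match.group(1))
        suffix = int(match.group(2))
        # First try the same century as the start year.
        end = (start // 100) * 100 + suffix
        # If that lands on or before the start, the range crosses a century.
        if end <= start:
            end += 100
        return f"{start} to {end}"

    return YEAR_RANGE.sub(repl, text)

print(expand_year_range("1793-05"))
print(expand_year_range("1998-02"))
```

A rule like this never produces "1793 to 2005", but it would still misread a genuine YYYY-MM date, so it is no substitute for looking at the surrounding text.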
Sure, it's awkward, but how about "1793 to oh-five" instead of inserting incorrect content?
It’s not, but it’s obvious from the context, which I think is the point. An LLM like GPT-4 could probably figure it out from context, but most text-to-speech models probably don’t have as much semantic training.
> Because I would never guess 1793-05 meant 1793-1805 either.

Read more books.

Wouldn't that be caused by an OCR issue? No matter how primitive I doubt a TTS model would mistake I for 1. The book was probably poorly scanned, OCRed, and then had automated audiogen.
> No matter how primitive I doubt a TTS model would mistake I for 1.

"I" is the correct text, but it's intended as a Roman numeral and should be read as "one". (And, a few chapters later, "IV" should be read as "four", not "intravenous".)
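One rough way to handle the chapter-heading case in a preprocessing pass is to treat a line that consists of nothing but a Roman numeral (with an optional trailing period) as a number, and leave every other line alone. The Python sketch below assumes that heuristic; `roman_to_int` and `normalize_heading` are hypothetical names, the numeral is not validated, and a line containing only the pronoun "I" would be converted too.

```python
import re

ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

# Assumption: a heading is a line made up only of Roman-numeral letters,
# optionally followed by a period.
HEADING = re.compile(r"^\s*([IVXLCDM]+)\.?\s*$")

def roman_to_int(numeral: str) -> int:
    """Convert a Roman numeral to an integer (no validity checking)."""
    total = 0
    # Pair each symbol with the one after it; the final symbol is paired with " ".
    for symbol, following in zip(numeral, numeral[1:] + " "):
        value = ROMAN_VALUES[symbol]
        # A smaller value before a larger one is subtracted (the "I" in "IV").
        if following in ROMAN_VALUES and ROMAN_VALUES[following] > value:
            total -= value
        else:
            total += value
    return total

def normalize_heading(line: str) -> str:
    """Replace a bare Roman-numeral heading with digits; leave other lines as-is."""
    match = HEADING.match(line)
    if match is None:
        return line
    return str(roman_to_int(match.group(1)))

print(normalize_heading("I."))
print(normalize_heading("IV"))
print(normalize_heading("I listened to a few more sentences."))
```

The digits are then left for the speech engine to read as "one" or "four". Anything smarter, such as telling a heading "I" from the pronoun, needs the kind of context discussed upthread.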

Ah thanks for clarifying