Hacker News
by steelframe 1880 days ago
I've used Calibre extensively for stripping DRM and converting formats for my various e-readers. Note that it does glitch sometimes in strange and assorted ways.

For example, when it converted the book Futu.re, it replaced all instances of "ft" with a blank space. While reading the book my brain has been interpolating over words to "fill in" the "ft" -- for example, when I read "a er" I know the word really is "after."

I don't understand all the ins-and-outs of e-book conversion, but I figure it might have something to do with the font processing. Aside from these occasional glitches, I am a very happy Calibre user.

2 comments

"ft" is a common ligature. When characters are often used together, fonts are often designed with those characters grouped together as their own character so they can be given certain design flair such as having the "branches" of the "f" and "t" as one continuous horizontal line. Aside from "ft," "ll" (ell ell) also often gets this treatment and others I can't recall off the top of my head. Text rendering engines will then substitute the composite character with the ligature if the ligature exists in the font being used for display. So what might have been happening is that the text engine thought that an "ft" ligature existed for the font it was using to show your book and was trying to show it, but in fact it didn't exist. But who knows at the end of the day.

More information about ligatures can be found on the Internet.
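
To make the "single character" idea concrete: separate from whatever glyph substitution a font does internally, Unicode itself reserves a handful of code points for Latin ligatures (U+FB00 through U+FB06). Below is a minimal Python sketch, assuming the book's text is available as a plain string, that lists any such code points it contains. The name `find_ligatures` is made up for illustration, and none of this reflects how Calibre works internally.

```python
import unicodedata

# Unicode's "Alphabetic Presentation Forms" block holds the Latin
# ligatures at U+FB00..U+FB06 (ff, fi, fl, ffi, ffl, long-s-t, st).
LIGATURE_RANGE = range(0xFB00, 0xFB07)


def find_ligatures(text):
    """Return (index, character, Unicode name) for each ligature code point."""
    return [
        (i, ch, unicodedata.name(ch))
        for i, ch in enumerate(text)
        if ord(ch) in LIGATURE_RANGE
    ]


if __name__ == "__main__":
    # "A fine floor" written with the fi and fl ligature code points.
    sample = "A \ufb01ne \ufb02oor"
    for index, ch, name in find_ligatures(sample):
        print(index, repr(ch), name)
```

An empty result would suggest the ligature lives only in the font's substitution tables rather than in the text itself.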

Again, Hacker News delivers. Thank you for identifying this as a ligature issue. Maybe I'll be able to figure this out with some deeper poking around Calibre's legendary UI.

It's likely that "ft" was a ligature in the original which didn't survive the conversion (for example, if the destination font doesn't have the character ſt). Sounds like a bug, though.
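
If the source text really does contain ligature code points, one possible workaround is to expand them back into ordinary letters before converting, so the destination font never needs a ligature glyph. This is an assumption about what might help, not a confirmed description of the bug or of anything Calibre does. A rough Python sketch, with the hypothetical helper name `expand_ligatures`:

```python
# Hypothetical table: each Latin ligature code point mapped to plain letters.
LIGATURE_EXPANSIONS = {
    0xFB00: "ff",
    0xFB01: "fi",
    0xFB02: "fl",
    0xFB03: "ffi",
    0xFB04: "ffl",
    0xFB05: "\u017ft",  # long s followed by t
    0xFB06: "st",
}


def expand_ligatures(text):
    """Replace ligature code points with their component letters."""
    # str.translate accepts a dict keyed by code point; other characters
    # pass through unchanged.
    return text.translate(LIGATURE_EXPANSIONS)


if __name__ == "__main__":
    print(expand_ligatures("\ufb01nd the \ufb02oor"))
```

Note that Unicode has no dedicated "ft" code point, so an "ft" ligature that exists only as a glyph inside the font would not be caught by a text-level pass like this one.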