Handwritten Text Recognition for Xournal++ Using Deep Learning | HN Mirror

	Handwritten Text Recognition for Xournal++ Using Deep Learning (github.com)
	37 points by millimacro 684 days ago

6 comments

Qwertious 683 days ago

This is great news, it's been needed for ages - handwriting is more than just funky OCR, it's OCR as applied to vector lines with a defined stroke order. So for instance, a lowercase e and c might render to the exact same pixels due to the 'loop' of the e overlapping itself, but if we know the stroke started in the middle of the line and then retreads itself, we can know for sure we're looking at an 'e'. That's simply not possible in e.g. Tesseract.

yorwba 683 days ago

This project is just funky OCR, i.e. "offline" handwriting recognition that operates on the pixels of the final image only. That means it works on scans, but can't take stroke order information into account.

What you're talking about would be "online" handwriting recognition, where timing information about each stroke is available.
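To make the distinction concrete, here is a minimal sketch (not taken from the plugin; the `(x, y, t)` stroke representation and the `rasterize` helper are assumptions for illustration). Two toy strokes cover the same pixels in opposite directions: the rasterized "offline" inputs are identical, while the "online" point sequences differ.

```python
from typing import List, Tuple

# Hypothetical representation: a stroke is a list of (x, y, t) samples,
# ordered by timestamp t. None of these names come from the plugin.
Point = Tuple[int, int, float]
Stroke = List[Point]


def rasterize(strokes: List[Stroke], width: int, height: int) -> List[List[int]]:
    """Offline view: mark every touched pixel, discarding order and timing."""
    grid = [[0] * width for _ in range(height)]
    for stroke in strokes:
        for x, y, _t in stroke:
            grid[y][x] = 1
    return grid


# Two toy strokes covering the same pixels, drawn in opposite directions.
left_to_right: Stroke = [(0, 1, 0.00), (1, 1, 0.01), (2, 1, 0.02), (3, 1, 0.03)]
right_to_left: Stroke = [(3, 1, 0.00), (2, 1, 0.01), (1, 1, 0.02), (0, 1, 0.03)]

image_a = rasterize([left_to_right], width=4, height=3)
image_b = rasterize([right_to_left], width=4, height=3)

same_pixels = image_a == image_b               # offline inputs are identical
same_strokes = left_to_right == right_to_left  # online inputs differ
print(same_pixels, same_strokes)  # prints: True False
```

An offline model only ever sees something like `image_a`; an online model sees the ordered samples, which is what lets it tell apart shapes that end up as the same pixels.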

millimacro 677 days ago

Yes, that's totally correct! The current version of the plugin supports only so-called "offline" HTR, which operates on images. This is ultimately determined by the underlying machine learning model.

I have developed another model however (based on a somewhat recent Google paper by Carbune et al. 2020), that operates on pen dynamics and thereby implements online HTR, see here:

https://github.com/PellelNitram/OnlineHTR

This model is open-source as well and will be part of the HTR system for Xournal++ in the future. Feel free to give it a try yourself locally.

One question that has been bothering me for a long time, and that has so far held back online HTR for me, is how to find text on a page in the temporal domain (i.e. in the online domain rather than the offline domain). If you have any ideas on that, please do let me know as I would greatly appreciate that! One possible way is a transformer model - but again that feels like a bit of overkill and introduces a context length limit.
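For illustration only, one simple baseline for that question is to group strokes greedily by pen pauses and spatial jumps. This is a hypothetical heuristic, not something taken from OnlineHTR or the Carbune et al. paper; the `Stroke` fields, the `group_strokes` name and all thresholds are assumptions that would need tuning on real data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Stroke:
    # Hypothetical summary of one pen stroke; field names are assumptions.
    t_start: float  # pen-down time in seconds
    t_end: float    # pen-up time in seconds
    x_min: float    # left edge of the bounding box
    x_max: float    # right edge of the bounding box
    y_mid: float    # vertical centre of the bounding box


def group_strokes(strokes: List[Stroke], max_pause: float = 1.5,
                  max_x_gap: float = 40.0,
                  max_y_jump: float = 30.0) -> List[List[Stroke]]:
    """Greedily split a time-ordered stroke list into candidate text runs.

    A new group starts when the pen pause, the horizontal gap, or the
    vertical jump relative to the previous stroke exceeds its threshold.
    The default thresholds are arbitrary placeholders.
    """
    groups: List[List[Stroke]] = []
    for stroke in sorted(strokes, key=lambda s: s.t_start):
        if groups:
            prev = groups[-1][-1]
            pause = stroke.t_start - prev.t_end
            x_gap = stroke.x_min - prev.x_max
            y_jump = abs(stroke.y_mid - prev.y_mid)
            if pause <= max_pause and x_gap <= max_x_gap and y_jump <= max_y_jump:
                groups[-1].append(stroke)
                continue
        groups.append([stroke])
    return groups


# Toy input: two strokes written close together, then one much later elsewhere.
strokes = [
    Stroke(0.0, 0.3, 10, 20, 100),
    Stroke(0.4, 0.7, 22, 30, 100),
    Stroke(5.0, 5.2, 200, 210, 300),
]
print([len(g) for g in group_strokes(strokes)])  # prints: [2, 1]
```

A rule like this has no context length, but it will split or merge text wrongly whenever someone pauses mid-word or goes back later to add to an earlier line, which is presumably where a learned model would earn its keep.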

Qwertious 683 days ago

Well, that explains why I could never find a decent stroke-order-aware HWR system that wasn't a service. Sigh. What idiot invented this terminology?

millimacro 677 days ago

Yes, you're right, stroke-order-aware HWR systems are hard to find. One reason for that is the lack of good datasets for training machine learning models!

As such, my stroke-order-aware attempt over at https://github.com/PellelNitram/OnlineHTR/ uses a dataset from 2000 with around 12,000 samples. In contrast, the internal Google dataset is reported to feature around 16,000,000 samples :-D.

millimacro 677 days ago

This is a great observation!

Currently, the machine learning model only supports offline HTR (i.e. using images) but online HTR (i.e. using pen time series data) is in the making, see here:

https://github.com/PellelNitram/OnlineHTR/

eulgro 684 days ago

I just learned about Xournal++. I had been using Xournal, which apparently stopped being developed in 2016. I just tried Xournal++ and it's much more complete.

bzmrgonz 683 days ago

How has your experience been? Maybe you should give them a testimonial here, it's the least they deserve since it's served you well (presumably). Also, please consider giving us your perspective as a long-term user, I think you bring a unique perspective to the table!!! HTR is an oasis in this global desert of manual digitization. I did a relatively extensive search for projects a year ago, and I don't remember seeing Xournal at all, or maybe the old development date threw me off.

millimacro 677 days ago

"HTR is an oasis in this global desert of manual digitization" - hach, this is such a great phrasing! <3

Xournal++ is a great project that features a bunch of really great developers who dedicate a lot of time to it.

I am not much involved in the Xournal++ development itself, but I try to use my machine learning skills to build an HTR system for Xournal++ in the form of a plugin.

Cheers! :-)

millimacro 677 days ago

I was wondering, how do you capture handwritten notes using Xournal and Xournal++? Using some sort of tablet maybe?

Great to hear that you found out about Xournal++! It's really the power of open-source that you own your own handwritten notes as you are not locked in.

kkfx 683 days ago

A very nice project but... How many people really want to scan handwritten text to text? Results will be messy anyway, and today handwritten text is typically no more than a few pages, so it is far quicker to retype or even dictate it than to correct OCR output.

BTW personally I use Xournal++ to add text/images to PDFs, typically where I have some crappy low-importance PDF form, not a real one, and I do not want to invest time in a nice LaTeX + cart.el setup (an Emacs artist-mode wrapper to get the coordinates of any form by clicking on it with the mouse [1]). I still have to deal with some scanned documents, but ones originally printed from a computer, not handwritten.

Handwritten text recognition might be very welcome for scanning and indexing old public archives, which is damn complex since there are countless styles of cursive, but it's still very much needed to merge the old paper world into the digital one so as not to lose history.

[1] https://github.com/Nidish96/cart.el

millimacro 677 days ago

I totally agree with you!

The HTR feature in Xournal++ does not require you to scan anything though. You just write handwritten notes in Xournal++ as always and upon saving with the plugin, the resulting PDF is searchable (the handwritten texts at least). So there's no scanning or retyping involved :-).

nullc 683 days ago

Even bad OCR is better than nothing in terms of giving you some ability to search.

millimacro 677 days ago

Yes, that's what I think as well!

minimalist 683 days ago

How cool! I have over a decade of notes taken in xournal and other digital tablets and was considering taking a short sabbatical to type them all out. Might not need to after all! I will definitely try this.

millimacro 677 days ago

Oh that's so exciting to hear! Please do let me know if you need help when trying to use the plugin! I am happy to help you until the plugin works for you.

bzmrgonz 683 days ago

1-to-10, how ready for prime time is it?? (10=production ready)

millimacro 677 days ago

I break my answer down into three prime-time metrics below, which puts the plugin at about 6.4/10.

These are the decomposed prime time metrics:

1. The plugin itself is at 10/10; e.g. I use it in production myself.

2. The prediction quality is at, say, 6/10. I am actively working on this.

3. The installation process is 3/10. That's because I have only tested it using my machine setup, for which it actually works fine. To improve here, I'd love to get a bit of user feedback to improve the installation process.
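As a quick check of the arithmetic, here is a sketch that assumes the overall figure is an unweighted mean of the three scores (the comment does not say how they are combined; the dictionary keys are just labels for illustration):

```python
# Scores as stated in the three metrics above.
scores = {"plugin": 10, "prediction quality": 6, "installation": 3}

# Assumption: equal weights for all three metrics.
mean = sum(scores.values()) / len(scores)
print(round(mean, 2))  # prints: 6.33
```

Under that assumption the result is about 6.3, close to the stated 6.4; the small difference would come from rounding or from a weighting that is not spelled out.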

Cheers!

millimacro 677 days ago

Cheers everyone for your lovely engagement!! <3