by daenz 2078 days ago
Very cool! I've done something similar for improving an OCR system on crinkled paper[0]. Blender is a powerful and totally underutilized tool for this kind of work.

0. https://www.arwmoffat.com/work/synthetic-training-data

4 comments

I've thought about doing this myself! Did it end up improving the OCR system for real world images?
The startup ran out of money before we could find out :) It was sort of a skunkworks project.
Wow, this is awesome! And it's very nicely presented on the website. I'm wondering how you mapped from the UV to the 3D model. I would like to add that feature to the addon.
It's been a while since I've looked at the code, but take a look at the code around this https://github.com/amoffat/metabrite-receipt-tests/blob/mast... for mapping from UV space to image space.

TLDR: using a KD-tree, I find the face containing the UV coordinate. Then I transform the UV coordinate to barycentric coordinates within that containing face, then put that barycentric coordinate through the local -> world -> view -> perspective transform matrices
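
For anyone wanting to try the idea, here is a rough sketch of those steps in plain Python. It is not the code from the linked repo. The names `barycentric` and `uv_to_screen` and the `faces` layout are made up for illustration, the KD-tree is swapped for a brute-force loop over triangles to keep it short, and the local -> world -> view -> perspective chain is assumed to be pre-multiplied into one 4x4 row-major matrix `mvp`. It also assumes no triangle is degenerate in UV space.

```python
def barycentric(p, a, b, c):
    """Barycentric weights of 2D point p with respect to triangle (a, b, c)."""
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    w0 = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    w1 = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    return w0, w1, 1.0 - w0 - w1


def uv_to_screen(uv, faces, mvp, eps=1e-9):
    """Map a UV coordinate to normalized screen coordinates.

    faces: list of (uv_tri, xyz_tri) pairs, each a tuple of three points
           (hypothetical layout; a KD-tree would replace this linear scan).
    mvp:   4x4 row-major matrix, assumed to be perspective * view * world.
    Returns (x, y) after the perspective divide, or None if no face contains uv.
    """
    for uv_tri, xyz_tri in faces:
        w = barycentric(uv, *uv_tri)
        if min(w) < -eps:
            continue  # uv lies outside this triangle
        # Interpolate the local-space position with the same weights.
        local = [sum(wi * v[k] for wi, v in zip(w, xyz_tri)) for k in range(3)]
        local.append(1.0)  # homogeneous coordinate
        clip = [sum(m * comp for m, comp in zip(row, local)) for row in mvp]
        return clip[0] / clip[3], clip[1] / clip[3]
    return None
```

The key point is that the barycentric weights computed in UV space are reused unchanged on the 3D vertices, so the UV point and the surface point refer to the same spot on the face.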

A common approach in rendering engines for mapping screen-space coordinates to objects is to render a second image with lighting and shadows disabled, where each color uniquely maps to an object ID. You can then uniquely identify 24 bits' worth of objects without needing to maintain a KD-tree.
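
A small sketch of the ID-color trick, assuming an 8-bit-per-channel RGB render target and an ID buffer already read back as rows of `(r, g, b)` tuples. The function names here are hypothetical, not from any particular engine.

```python
def id_to_rgb(object_id):
    """Pack an object ID (0 .. 2**24 - 1) into an 8-bit-per-channel RGB color."""
    if not 0 <= object_id < (1 << 24):
        raise ValueError("object_id must fit in 24 bits")
    return (object_id >> 16) & 0xFF, (object_id >> 8) & 0xFF, object_id & 0xFF


def rgb_to_id(rgb):
    """Recover the object ID from a color produced by id_to_rgb."""
    r, g, b = rgb
    return (r << 16) | (g << 8) | b


def pick(id_buffer, x, y):
    """Look up which object covers pixel (x, y) in the unlit ID render."""
    return rgb_to_id(id_buffer[y][x])
```

This only works if the ID pass is rendered without anti-aliasing, filtering, or color management, since any blending between neighboring pixels would produce colors that decode to the wrong ID.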
What the heck. This is beyond awesome. I totally want to try it out.
Does using synthetic training data introduce any problems? How do you ensure your synthetic data matches real data?