by yxiongdropbox 3600 days ago
Yep, we converted RGB to LUV space before extracting edges, which helps a lot with contrast and keeps essential edge information that could have been lost in a conversion to grayscale.
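
To make the grayscale point concrete, here is a minimal pure-Python sketch. It is not the scanner's actual code. It assumes sRGB input, a white point taken from the sRGB matrix itself, and Rec. 601 luma weights for the grayscale conversion. The function names are made up for illustration.

```python
import math

def srgb_to_linear(c):
    """Undo the sRGB gamma curve for one 0-255 channel value."""
    c = c / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

def rgb_to_xyz(r, g, b):
    """sRGB (0-255) to CIE XYZ using the standard sRGB matrix."""
    r, g, b = (srgb_to_linear(v) for v in (r, g, b))
    x = 0.4124 * r + 0.3576 * g + 0.1805 * b
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b
    z = 0.0193 * r + 0.1192 * g + 0.9505 * b
    return x, y, z

# Assumption: reference white is whatever the matrix gives for (255, 255, 255).
WHITE = rgb_to_xyz(255, 255, 255)

def _uv_prime(x, y, z):
    """Chromaticity coordinates u', v' (zero for pure black)."""
    d = x + 15 * y + 3 * z
    return (4 * x / d, 9 * y / d) if d > 0 else (0.0, 0.0)

def rgb_to_luv(r, g, b):
    """sRGB (0-255) to CIE L*u*v*."""
    x, y, z = rgb_to_xyz(r, g, b)
    yr = y / WHITE[1]
    if yr > (6 / 29) ** 3:
        lightness = 116 * yr ** (1 / 3) - 16
    else:
        lightness = (29 / 3) ** 3 * yr
    up, vp = _uv_prime(x, y, z)
    un, vn = _uv_prime(*WHITE)
    return lightness, 13 * lightness * (up - un), 13 * lightness * (vp - vn)

def gray(r, g, b):
    """Rec. 601 luma, a common grayscale conversion (assumed here)."""
    return 0.299 * r + 0.587 * g + 0.114 * b

def luv_distance(c1, c2):
    """Euclidean distance between two RGB colors measured in LUV."""
    return math.dist(rgb_to_luv(*c1), rgb_to_luv(*c2))

# A red page edge against a green background with nearly identical gray values.
red, green = (255, 0, 0), (0, 130, 0)
print(abs(gray(*red) - gray(*green)))  # tiny: the edge almost vanishes in grayscale
print(luv_distance(red, green))        # large: the edge is obvious in LUV
```

An edge detector run on the grayscale image would see almost no step between these two colors, while one run on the LUV channels would see a big one.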

Agreed that 3D deformation is a difficult open problem, and we haven't gotten into that yet. Currently we assume the document is a flat rectangle, which maps to a quadrilateral in image space. A homography is then applied to rectify it, and it seems to work quite well even when the paper is slightly curved or folded.
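
For anyone curious what the rectification step looks like, here is a rough sketch of estimating a homography from the four detected corners and mapping points through it. It is an illustration under assumptions, not the scanner's implementation: it fixes the bottom-right matrix entry to 1, solves the resulting 8x8 system with plain Gaussian elimination, and the corner coordinates and output size are made-up example values.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def find_homography(src, dst):
    """3x3 homography mapping four src points onto four dst points.

    Assumes the bottom-right entry can be normalized to 1.
    """
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]

def apply_homography(hm, pt):
    """Map one (x, y) point through the homography (with perspective divide)."""
    x, y = pt
    w = hm[2][0] * x + hm[2][1] * y + hm[2][2]
    u = (hm[0][0] * x + hm[0][1] * y + hm[0][2]) / w
    v = (hm[1][0] * x + hm[1][1] * y + hm[1][2]) / w
    return u, v

# Hypothetical detected document corners (image pixels), clockwise from top-left.
quad = [(120, 80), (520, 110), (560, 620), (90, 580)]
# Target rectangle, e.g. an 850 x 1100 output page.
rect = [(0, 0), (850, 0), (850, 1100), (0, 1100)]

H = find_homography(quad, rect)
for corner in quad:
    print(corner, "->", apply_homography(H, corner))
```

To produce the rectified image you would typically invert the mapping and, for each output pixel, sample the source image at the corresponding location. Because a homography only models a flat plane, any curl or fold in the paper shows up as residual distortion.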


Excellent. It's a little funny how, when you start on problems like these, you end up becoming an expert in fields you never thought you'd have to play in, like color spaces and color perception theory.

Great work, and I look forward to seeing future posts on the solutions you've been able to come up with!

Yeah, I got a serious education doing this for mail items. And I had it easier as I was able to control the background and lighting and camera and everything.

Well, I couldn't control the autofocus very well. Going from a $500 DSLR to a $1200 DSLR made HUGE gains, since the more expensive one had far, far more autofocus points.

I was really interested in the text output of the OCR that I later did (which was a treat in itself since mail has so many different fonts, even on the same item!). I learned a lot about a lot of things too.