by jbondeson
3606 days ago
On the contrast topic: adaptive thresholding can be very helpful (I believe Bradley Local Thresholding was one I had particular success with). However, most of these algorithms work in a grayscale domain, which means they depend on which color->grayscale transformation is used[1]. I spent a long time researching full-color algorithms but never got to a truly successful end result with them. And even if you get a good image with huge contrast, you will still end up with the actual light/dark transition looking like an edge.

On 3D deformation, you're officially in academic research land. Nearly all algorithms require you to have a solid guess as to what the aspect ratio of the target object is. Other algorithms use heuristics based upon what you expect to find on a page. One particularly fun algorithm used the baseline of text (I believe for that paper it was Arabic) and fit a high-order curve to it, which was then reversed. Unfortunately I haven't seen a truly generic approach that doesn't require an implementation-specific input.

[1] Frankly my feeling is that RGB to grayscale is a mistake and is holding back many of these algorithms.
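For readers who have not met it, here is a minimal sketch of the local thresholding idea, written from the general description of the Bradley approach (compare each pixel with the mean of its neighborhood, computed through an integral image). It is not the commenter's implementation. The function names, the default window size, the 15% margin `t`, and the Rec. 601 grayscale weights are all assumptions chosen for illustration. The `weights` parameter is exposed to show how the result depends on the color->grayscale step.

```python
import numpy as np

def to_gray(rgb, weights=(0.299, 0.587, 0.114)):
    """Weighted sum of the R, G, B channels. The weights are an assumption
    (Rec. 601 luma by default); changing them changes the thresholded result."""
    rgb = np.asarray(rgb, dtype=np.float64)
    return rgb[..., :3] @ np.asarray(weights, dtype=np.float64)

def bradley_threshold(gray, window=None, t=0.15):
    """Return a boolean mask that is True where a pixel is more than a
    fraction t darker than the mean of its surrounding window."""
    gray = np.asarray(gray, dtype=np.float64)
    h, w = gray.shape
    if window is None:
        window = max(3, max(h, w) // 8)  # assumed default, not from the thread
    r = window // 2

    # Integral image padded with a zero row and column, so that any
    # rectangle sum needs only four lookups.
    ii = np.zeros((h + 1, w + 1))
    ii[1:, 1:] = gray.cumsum(axis=0).cumsum(axis=1)

    # Window bounds per row and column, clipped at the image border.
    ys, xs = np.arange(h), np.arange(w)
    y0, y1 = np.clip(ys - r, 0, h), np.clip(ys + r + 1, 0, h)
    x0, x1 = np.clip(xs - r, 0, w), np.clip(xs + r + 1, 0, w)

    sums = (ii[y1[:, None], x1[None, :]] - ii[y0[:, None], x1[None, :]]
            - ii[y1[:, None], x0[None, :]] + ii[y0[:, None], x0[None, :]])
    counts = (y1 - y0)[:, None] * (x1 - x0)[None, :]

    # pixel < (1 - t) * local mean, written without the division.
    return gray * counts < sums * (1.0 - t)
```

Because the comparison is against a local mean, a smooth lighting gradient across the page does not trip the threshold, while a pixel much darker than its neighbors does.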
|
Agreed that 3D deformation is a difficult open problem, and we haven't gotten into that yet. Currently we assume the document is a flat rectangle, which maps to a quadrilateral in image space. A homography is then applied to rectify it, and this seems to work quite well even if the paper is slightly curved or folded.
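As a rough illustration of the flat-rectangle assumption, the sketch below estimates a homography from the four corners of a detected quadrilateral and a target rectangle using the standard direct linear transform. It is a sketch under stated assumptions, not the code described in this reply: the function names are hypothetical, the corner coordinates in the usage lines are made up, and the output rectangle size (and therefore the aspect ratio) has to be supplied by the caller.

```python
import numpy as np

def homography_from_quad(src, dst):
    """Estimate the 3x3 homography mapping four src points to four dst
    points with the direct linear transform (DLT)."""
    src = np.asarray(src, dtype=np.float64)
    dst = np.asarray(dst, dtype=np.float64)
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence gives two linear equations in the 9 entries of H.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    a = np.array(rows)
    # The solution is the right singular vector for the smallest singular value.
    _, _, vt = np.linalg.svd(a)
    h = vt[-1].reshape(3, 3)
    # Normalize so the bottom-right entry is 1 (assumes it is not zero).
    return h / h[2, 2]

def apply_homography(h, pts):
    """Map 2D points through h, including the perspective divide."""
    pts = np.asarray(pts, dtype=np.float64)
    homog = np.hstack([pts, np.ones((len(pts), 1))]) @ h.T
    return homog[:, :2] / homog[:, 2:3]

# Hypothetical corners of a page as seen in a photo, in the order
# top-left, top-right, bottom-right, bottom-left.
quad = [(120, 80), (530, 110), (560, 700), (90, 660)]
# Target rectangle; its 600x800 size is an assumed aspect ratio.
rect = [(0, 0), (600, 0), (600, 800), (0, 800)]

H = homography_from_quad(quad, rect)
corners = apply_homography(H, quad)
```

In practice the same matrix would be passed to an image warping routine to resample the pixels. A curved or folded page breaks the planar assumption, so the result is only an approximation there.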