Given how popular the "simple" conversion of regular PDF forms/tables continues to be -- even among the technically sophisticated HN audience [0] -- if Amazon can deliver on OCR-to-data, that feels like a huge achievement. Not as sexy (or creepy) as Rekognition, perhaps, but almost certainly more useful day-to-day to the many, many professionals who work with documents and legacy data entry systems.
Even if AWS goes the cynical route of making Textract an upsell to MTurk -- e.g. the Textract output is not reliable enough on its own, but is structured for easy piping to an MTurk job -- that's got to be useful for the many folks who send entire pages to MTurk when they just need a couple of boxes proofread.
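To make the "only proofread a couple of boxes" idea concrete, here's a rough sketch of what I mean. Everything in it is my own assumption: the `ExtractedField` shape, the 0-100 confidence score, the threshold of 90, and the task dict are all hypothetical, not Textract's actual output format or the MTurk API.

```python
from dataclasses import dataclass


@dataclass
class ExtractedField:
    """One OCR'd box from a form (hypothetical shape, not a real API type)."""
    page: int
    label: str
    text: str
    confidence: float  # assumed to be a 0-100 score


def split_for_review(fields, threshold=90.0):
    """Split fields into (accepted, needs_review) by confidence threshold."""
    accepted = [f for f in fields if f.confidence >= threshold]
    needs_review = [f for f in fields if f.confidence < threshold]
    return accepted, needs_review


def to_review_task(field):
    """Build a small proofreading task for one box instead of a whole page."""
    return {
        "page": field.page,
        "label": field.label,
        "ocr_guess": field.text,
        "question": f"Does the box labeled '{field.label}' read '{field.text}'?",
    }


if __name__ == "__main__":
    fields = [
        ExtractedField(page=1, label="Station", text="WXYZ-TV", confidence=99.1),
        ExtractedField(page=1, label="Total", text="$1,2O0.00", confidence=61.4),
    ]
    accepted, needs_review = split_for_review(fields)
    tasks = [to_review_task(f) for f in needs_review]
    print(len(accepted), "accepted;", len(tasks), "sent for human review")
```

The point is just that a human only sees the doubtful boxes, so the crowdsourcing bill scales with the OCR's uncertainty rather than with the page count.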
As an example of a more scripted/structured job, ProPublica built out a crowdsourcing framework in Rails to extract data from FCC filings. But even that was quite difficult, because every state/TV station has its own kind of form: https://projects.propublica.org/free-the-files/
Google Cloud Vision and Microsoft Cognitive Services compete with Amazon Rekognition, but AFAIK there's no offering from another FAANG that competes with AWS Textract.
It looks like it's competing with ABBYY (FlexiCapture) and Kofax.
I do maintain some level of skepticism, though. It is OCR :D