A few days ago, IBM announced a new OCR system[1]. Have you by chance compared it to Microsoft's offering? I'm currently looking for the best-in-class OCR solution for scanned PDF documents.
Call me biased, but I've learned over time that anything that comes out of the Watson team looks good only in PR statements but sucks in production, especially at tasks like OCR. YMMV.