Hosted Microsoft OCR library: Free OCR API web service | HN Mirror


	Hosted Microsoft OCR library: Free OCR API web service (blog.a9t9.com)
	72 points by kargo 3895 days ago

12 comments

adders 3895 days ago

Never used an OCR library before so gave it a challenge with the latest xkcd cartoon.

    curl https://ocr.a9t9.com/api/Parse/Image --data "apikey=helloworld&url=http://imgs.xkcd.com/comics/bells_theorem.png"

    {
      "ParsedResults": [
        {
          "FileParseExitCode": 1,
          "ParsedText": "T-as IS CALLED I THEOREM. IT LAS FIRST\u2014 t: 1 wostcop n, FASTER- FIRM-LIGHT com)MCBT10N IS E6SlBLE! EL's rta\u201eNDERSTRtDlNGS ELLS THEX\u00dcI I-IPPPEN VIOLATE LOCALITY",
          "ErrorMessage": "",
          "ErrorDetails": ""
        }
      ],
      "OCRExitCode": 1,
      "IsErroredOnProcessing": false,
      "ErrorMessage": null,
      "ErrorDetails": null
    }

ftcHn 3895 days ago

Doesn't look like it likes xkcd hand lettering or the font that approximates it. I had better luck with this...

    curl https://ocr.a9t9.com/api/Parse/Image --data "apikey=helloworld&url=http://www.uky.edu/Providers/ScannedText/page1s.jpeg"

    {"ParsedResults":[{"FileParseExitCode":1,"ParsedText":"In 1830 there were but twenty-three \r\nmiles of railroad in operation in the \r\nUnited States, and in that year Ken- \r\ntucky took the initial step in the work \r\nwest of fhe Alleghanies. An Act to \r\nincorporate the Lexington & Ohio \r\nRailway Company was approved by \r\nGov. Metcalf, Jarinary 27, 1830.. It \r\nprovided for the construction and re• \r\n","ErrorMessage":"","ErrorDetails":""}],"OCRExitCode":1,"IsErroredOnProcessing":false,"ErrorMessage":null,"ErrorDetails":null}
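
For anyone who would rather script this than use curl, here is a minimal Python sketch of the same call. The endpoint, the `apikey` and `url` form fields, and the `ParsedResults` / `ParsedText` keys are taken from the commands and responses in this thread; the function names are made up for illustration, and the service may not be reachable at this address anymore.

```python
import json
import urllib.parse
import urllib.request

# Endpoint and form fields copied from the curl commands in this thread.
API_URL = "https://ocr.a9t9.com/api/Parse/Image"


def build_request(image_url, api_key="helloworld"):
    """Build a form-encoded POST equivalent to the `curl --data` call."""
    body = urllib.parse.urlencode({"apikey": api_key, "url": image_url})
    return urllib.request.Request(API_URL, data=body.encode("ascii"), method="POST")


def extract_text(response):
    """Join the ParsedText of every entry in ParsedResults."""
    results = response.get("ParsedResults") or []
    return "\n".join(r.get("ParsedText", "") for r in results)


def ocr_image(image_url, api_key="helloworld", timeout=30):
    """Send the request and return the recognized text (needs network access)."""
    request = build_request(image_url, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return extract_text(json.load(resp))
```

Calling `ocr_image("http://www.uky.edu/Providers/ScannedText/page1s.jpeg")` would be the equivalent of the second curl command, assuming the service still answers.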

dump100 3895 days ago

The same image on free-ocr.com returned this :(

BELL'5 SECONDTI-IEOREI’I: WWINGSGWTW HPPPBI‘BOFHSFHRFTHEYVIOLMELDCNJTY.

bdcravens 3895 days ago

Using the web site for a quick test, it gets me better results than Tesseract. However, it missed some words that free-ocr.com gets every single time. (free-ocr.com seems to have some voodoo magic)

nly 3895 days ago

Isn't this a violation of MS's EULA?

nathantotten 3895 days ago

"One user may install and use copies of the software to design, develop, test and demonstrate your programs. You may not use the software on a server in a production environment."

License: http://www.microsoft.com/web/webpi/eula/windows_runtime_ocr_...

a9t9 3895 days ago

Ouch. I was not aware of this, so thanks for the info! I guess the reason for this surprisingly restrictive license is the version 1 / first-release character of the software (namespace Windows"Preview".Media.Ocr).

The good news: in Win 10 the separate library is gone and the OCR feature is a regular part of Windows (Windows.Media.Ocr namespace). Along with this, the separate OCR runtime license is gone. I could not find any hint that the new OcrEngine class (or Windows Store apps in general!) has similar "no server use" restrictions, so I will move the OCR app to a Win 10 platform asap.

And while I cannot speak for Microsoft, I have good reasons to assume that the OCR API service is doing Microsoft a favor by advertising the great Win 10 OCR features. My web service allows for quick prototyping and testing on any platform. But ultimately no web API can be as responsive as a native OCR solution, which is only available on the Windows platform.

I would not be surprised if the OCR engine shows up in Windows Server 2016, directly usable from ASP.NET.

a9t9 3891 days ago

Update: I confirmed that Microsoft's OCR.dll is indeed part of Windows Server 2016. More info: http://blog.a9t9.com/2015/10/microsoft-ocr-on-windows-server...

snuxoll 3895 days ago

And right below that:

"ADDITIONAL LICENSING REQUIREMENTS AND/OR USE RIGHTS."

Which defines how you can use it for purposes other than development and testing.

Of course, the following clause is just as damning:

"iii. Distribution Restrictions. You may not"

"distribute Distributable Code to run on a platform other than the Windows Store or Windows Phone;"

a9t9 3895 days ago

This clause does not apply here: I assume it is intended to prevent "hacked" OCR libraries that, e.g., work with Win32 apps. But as with any hosted service, I do not distribute any code.

pdkl95 3895 days ago

Were those terms presented as part of the offer before money (or other consideration) changed hands?

If not, who cares what an EULA says, it's not a contract.

sgt 3895 days ago

I tried this API with a document I had lying around which contains a lot of text in different tables. The text is pretty clear, but this API was not able to parse it to the point where it's usable.

To give an example, a part of the text read "the limit" and was parsed as "he imit". This despite it being extremely clear / easy to read for a human.

Update: Took another picture and uploaded a JPG instead of the original PDF. It worked fine this time.

amelius 3895 days ago

Can anybody recommend any good open source OCR libraries that run under *nix?

jarmitage 3895 days ago

Tesseract:

https://github.com/tesseract-ocr/tesseract

http://neilshroff.com/tesseract-ocr/doc/tesseracticdar2007.p...

https://ryanfb.github.io/etc/2014/11/13/command_line_ocr_on_...
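
As a rough illustration of what using Tesseract from a script can look like, the sketch below shells out to the `tesseract` command-line tool and reads the recognized text from standard output. It assumes the `tesseract` binary and the English language data are installed; the helper names are made up for this example.

```python
import shutil
import subprocess


def tesseract_command(image_path, lang="eng"):
    """Build the argv for `tesseract <image> stdout -l <lang>`."""
    # Passing "stdout" as the output base makes tesseract print the text
    # instead of writing it to a .txt file.
    return ["tesseract", image_path, "stdout", "-l", lang]


def ocr_file(image_path, lang="eng"):
    """Run tesseract on a local image file and return the recognized text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract binary not found on PATH")
    done = subprocess.run(
        tesseract_command(image_path, lang),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return done.stdout
```

With a scan saved locally, `print(ocr_file("page1s.jpeg"))` would print whatever text Tesseract recognizes.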

w-ll 3895 days ago

I recently started playing with tesseract.

Here's a dockerfile that will install it in a minimal alpine image.

https://github.com/wartron/docker-tesseract

nly 3895 days ago

Not a library but gocr has always been useful for me

flashman 3895 days ago

I got this error; are you a victim of your own popularity?

    {"ParsedResults":[{"FileParseExitCode":-20,"ParsedText":"","ErrorMessage":"Timed out waiting for image parsing result or error generation by OCR","ErrorDetails":"System.TimeoutException: Timed out waiting for image parsing result or error generation by OCR\r\n   bei OCRInteractionLibrary.OCRInteractor.GetResultForImage(String tempPath, String imageName, FileInfo imageFileInfo, Boolean isOverlayRequired, AccesorType accesorType) in d:\\1tmp\\OCRReaderSolution914\\OCRReaderSolution\\OCRInteractionLibrary\\OCRInteractor.cs:Zeile 259."}],"OCRExitCode":3,"IsErroredOnProcessing":false,"ErrorMessage":null,"ErrorDetails":null}
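
Note that this response reports `"IsErroredOnProcessing": false` even though the parse timed out, so a client probably should not rely on that flag alone. Below is a small Python sketch of a stricter check. It assumes that an exit code of 1 means success, which is only inferred from the successful responses earlier in the thread, not from any documentation; `find_errors` is a hypothetical helper.

```python
def find_errors(response):
    """Return a list of error strings; empty if every result looks fine."""
    errors = []
    if response.get("IsErroredOnProcessing"):
        errors.append(str(response.get("ErrorMessage")))
    for index, result in enumerate(response.get("ParsedResults") or []):
        code = result.get("FileParseExitCode")
        # Assumption: 1 is the success code, as seen in the working samples.
        if code != 1:
            message = result.get("ErrorMessage") or "no message"
            errors.append(f"result {index}: exit code {code}: {message}")
    return errors
```

Applied to the response above, this returns one entry mentioning exit code -20 and the timeout message; applied to the earlier successful responses, it returns an empty list.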

a9t9 3895 days ago

Yep. -> Fixed.

skc 3895 days ago

Is it general knowledge that Microsoft's OCR libs are better than Tesseract?

andrewgjohnson 3895 days ago

Does this imply you think Microsoft>Tesseract or are you asking what the consensus is?

skc 3895 days ago

I'm asking what the consensus is. In fact this is the first I've heard of the Microsoft offering.

kennydude 3895 days ago

I'm really not a fan of the random highlighting of text, it just feels incredibly childish and slightly crazy.

Omnipresent 3895 days ago

I wonder if the Microsoft OCR uses Stroke Width Transform - http://digital.cs.usu.edu/~vkulyukin/vkweb/teaching/cs7900/P...

jonathanberger 3895 days ago

Does anyone know who or what organization is behind A9T9? It seems the author/source is somewhat obscured. It would be important to know before using for more than a hobby project.

a9t9 3895 days ago

a9t9 here :-)

You can find some information about me at http://blog.a9t9.com/p/about-this-blog.html

(a9t9) is a place where I experiment with side projects, and I want to keep this separate from my day job. Therefore I am somewhat stingy with the personal information on the blog. That said, you are welcome to email me for details.

Omnipresent 3895 days ago

Is there some pre-processing done on the image prior to doing OCR on it, and is this using Tesseract?

detaro 3895 days ago

> is this using Tesseract

No, it isn't.

jorgecurio 3895 days ago

How can you create an API around this Microsoft OCR library so I can just call localhost:3424/api/Parse/Image?
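
One possible shape for such a local wrapper, sketched in Python with only the standard library. This is not how the hosted service is built (the thread does not say), and it does not call the Microsoft OCR engine: `fake_ocr` is a placeholder where a real backend would be plugged in. The path and port come from the question, the response keys mirror the JSON shown earlier in the thread, and everything else is an assumption.

```python
import json
import urllib.parse
from http.server import BaseHTTPRequestHandler, HTTPServer


def fake_ocr(image_url):
    """Placeholder backend (hypothetical): swap in a real OCR call here."""
    return f"(no OCR engine wired up for {image_url})"


def handle_parse(body, ocr=fake_ocr):
    """Turn a form-encoded request body into a response shaped like the hosted API's."""
    fields = urllib.parse.parse_qs(body)
    url = fields.get("url", [""])[0]
    if not url:
        return {"ParsedResults": [], "IsErroredOnProcessing": True,
                "ErrorMessage": "missing url parameter"}
    return {"ParsedResults": [{"ParsedText": ocr(url)}],
            "IsErroredOnProcessing": False, "ErrorMessage": None}


class ParseHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/api/Parse/Image":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length).decode("utf-8")
        payload = json.dumps(handle_parse(body)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=3424):
    """Blocks forever; port 3424 is the one mentioned in the question."""
    HTTPServer(("127.0.0.1", port), ParseHandler).serve_forever()
```

After calling `serve()`, the curl commands from earlier in the thread should work against `http://localhost:3424/api/Parse/Image`, returning the placeholder text until a real OCR backend replaces `fake_ocr`.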