Hacker News
by jrochkind1 4210 days ago
That was the original idea behind reCAPTCHA (which originated outside Google and was acquired by Google in 2009), but my understanding is that they long ago ran out of actual text that needed human OCR'ing, and/or found other reasons that approach was no longer helpful.

The "help OCR while also spam protecting" thing isn't currently mentioned on Google's recaptcha product page.

3 comments

It is:

> Creation of Value

> Stop a bot. Save a book.

> reCAPTCHA digitizes books by turning words that cannot be read by computers into CAPTCHAs for people to solve. Word by word, a book is digitized and preserved online for people to find and read.

https://www.google.com/recaptcha/intro/index.html#creation-o...

Good catch.

I wonder where I heard/got the impression that it wasn't really being used for this much anymore. Maybe from when most of the recaptchas most of us saw switched from scanned books to Google Street View photo crops. And I was also surprised by the implication that Google's algorithms really needed human help for visual recognition of strings made up almost exclusively of the digits 0-9. I would have thought that would be a pretty well solved problem.

Anyway, somehow I got the idea that recaptcha wasn't actually providing much OCR help anymore, but maybe I just made that up.

For the past few years the recaptchas I've seen were illegible text next to easy-to-read text. I think it's obvious that they've run out of the low-hanging fruit and now just have the worst of the worst as placeholders. The move to house numbers suggests that they're kinda running out of badly OCR'd text.

This move isn't too surprising. OCR-based captchas have always been a hack, and the "best" captchas are like having the best collection of duct tape and WD-40. At a certain point you need to stop doing half-assed repairs and remodel.

They also used it to decode street address numbers for Street View.