by unnouinceput 2229 days ago
Why was self-generated and self-hosted captcha easy to beat?

I found that generating math questions in a captcha style (curved, with other noise drawn over them) and requiring the answer to be typed into a box is unbeatable. The bad actor would need very good OCR and, after that, a good math parser to answer. Easy for a human, very hard for automation. And the script that did that was about 50 lines long.
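To make the idea concrete, here is a rough sketch of the question side only, not the actual script described above. The names `make_challenge` and `check_answer` are made up for illustration, and the sketch assumes the expected answer is kept in a server-side session and that the question text is drawn into a distorted image by a separate imaging step (not shown).

```python
import operator
import random

# Operators the challenge may use; kept small so answers stay easy for humans.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def make_challenge(rng=random):
    """Return (question_text, expected_answer) for one math challenge.

    The caller is assumed to store expected_answer server-side (for example
    in the session) and to render question_text as a noisy image.
    """
    a, b = rng.randint(1, 9), rng.randint(1, 9)
    symbol = rng.choice(sorted(OPS))
    return f"{a} {symbol} {b} = ?", OPS[symbol](a, b)


def check_answer(user_input, expected_answer):
    """Compare what the user typed against the stored answer."""
    try:
        return int(user_input.strip()) == expected_answer
    except ValueError:
        # Anything that is not an integer counts as a wrong answer.
        return False
```

Whether this holds up against automation depends entirely on the rendering step, since the arithmetic itself is trivial once the characters have been read.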

1 comments

"Easy for human" is very subjective. Users very regularly have a hard time with all forms of image captcha for a whole bunch of different reasons: visual acuity, color deficiency, learning disability, unclear instructions, visually similar characters, etc. If you allow users to refresh the image until they see an easy one, they might be able to overcome it themselves, but some percentage of those users will get frustrated and leave. Not to mention that allowing regeneration of images also makes it easier for bots to cycle until they find one they're confident in. Surely if there were a 50-line, self-hostable CAPTCHA generation script that was dead simple for humans and difficult for bots to beat, it would be in wide use.
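The refresh trade-off can be sketched as a per-session cap on regenerations. This is only an illustration under assumptions: `RefreshBudget`, `max_refreshes`, and the in-memory dictionary are hypothetical, and a real deployment would need shared storage and expiry.

```python
class RefreshBudget:
    """Track how many times each session has asked for a new challenge."""

    def __init__(self, max_refreshes=3):
        self.max_refreshes = max_refreshes
        self._used = {}  # session_id -> number of refreshes granted so far

    def allow_refresh(self, session_id):
        """Return True and count the refresh, or False once the cap is hit."""
        used = self._used.get(session_id, 0)
        if used >= self.max_refreshes:
            return False
        self._used[session_id] = used + 1
        return True
```

A low cap makes it harder for a bot to cycle to an easy image, but it also cuts off the humans who needed the extra attempts, which is the friction problem again.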

reCAPTCHA changed to its current model to try to significantly reduce friction in the "hopefully normal" case (down to just a check box if all goes well) because every ounce of friction you add to critical inflection points in your product translates to meaningful lost opportunity.

Even if this weren't a problem, and it were trivial to create something that's easy for humans and hard for computers, it's just not worth most companies' time. Would they rather spend a few days properly implementing and testing a captcha solution, plus an unknown amount of time on future bug fixes and support, or set up reCAPTCHA in 30 minutes and move on to things that produce value for their customers?

I see that as an absolute win. If you're having problems understanding simple math questions then I won't want you as my user in the first place. Morons out.

As for visually impaired users, I agree this one is harder to crack. Usually you do it by audio, which in itself is more than 50 lines of code, but here is my personal approach. Absolutely nothing is stopping you from having, for visually impaired users, a separate step like the one described in the OP, where you have e-mail activation. You see, visually impaired users have infinitely more patience than normal "visual" ones. They are used to the web not being friendly, so they won't mind jumping through extra hoops if they want your service. So add a checkbox saying "I am visually impaired and I want registration by e-mail" or something equivalent and you're good to go.