by chad1n 571 days ago
I've built 3 iterations of captcha solvers for that crappy website based on https://github.com/drunohazarb/4chan-captcha-solver/issues/1 . The only thing I've learned along the way is that it's mostly pointless outside of a "learning" exercise, since they'll change the captcha (in terms of letter count or background entropy). Initially, it was 4 characters with a pretty obvious background, then it went to 5, then it was either 4 or 5. The current iteration is also either 4 or 5, but with a lot of entropy surrounding the characters.

This project was really my first decent introduction to computer vision and machine learning (and the same goes for those who helped me in various ways; none of them wanted to be credited here, apart from the guy who collected some of the data for me).

It was definitely a successful learning exercise, and it's made me more confident tackling some other problems I've had in mind for a while.

To help you out if you're interested:

- A Gaussian blur smeared along one axis, and a second one smeared along the other axis, can really help with segmenting chars and finding lines of text in OCR

- You can unshear chars using the Radon or Hough transform to estimate the shear angle
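
A minimal sketch of the first tip, under one reading of it: collapse the image onto the x-axis (the extreme case of smearing vertically), Gaussian-smooth that profile along x, and cut wherever the smoothed ink drops below a threshold. Doing the same with rows instead of columns would look for lines of text. The function names and the `sigma` and `threshold` values are illustrative assumptions, not taken from the linked repo, and in practice you'd probably reach for NumPy/SciPy or OpenCV rather than plain lists.

```python
import math


def gaussian_kernel(sigma):
    """Return a normalized 1-D Gaussian kernel covering about 3 sigma each side."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(profile, sigma):
    """Convolve a 1-D profile with a Gaussian; samples outside the profile count as 0."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    out = []
    for i in range(len(profile)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - radius
            if 0 <= j < len(profile):
                acc += w * profile[j]
        out.append(acc)
    return out


def segment_columns(image, sigma=1.0, threshold=0.5):
    """Split a binary image (rows of 0/1, 1 = ink) into (start, end) column spans.

    `end` is exclusive. `sigma` and `threshold` are guesses that would need
    tuning per captcha style.
    """
    width = len(image[0])
    # Column projection: how much ink sits in each column.
    profile = [sum(row[x] for row in image) for x in range(width)]
    smoothed = smooth(profile, sigma)
    spans, start = [], None
    for x, value in enumerate(smoothed):
        if value > threshold and start is None:
            start = x
        elif value <= threshold and start is not None:
            spans.append((start, x))
            start = None
    if start is not None:
        spans.append((start, width))
    return spans
```

Calling `segment_columns(image)` on a binarized captcha gives candidate character spans. The smoothing is what keeps thin gaps inside a glyph from splitting it, at the cost of merging characters that sit very close together.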

Went through MNIST a few weeks ago and I agree it's interesting!

I am always interested! Thank you for the tips, I'll definitely research these.
Shearing is a linear operation that should be trivial for a NN to learn. Have you found that unshearing is actually useful? Was it to feed the image to an existing OCR program?
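
To make the shear point concrete: a horizontal shear moves each pixel sideways in proportion to its row, x' = x + s*y, so undoing it comes down to estimating s. Below is a rough sketch that assumes a brute-force search over candidate shear factors, scoring each by how sharply the column projection peaks. It is a simplified stand-in for the Radon/Hough idea mentioned above, not the method from the linked repo, and the function names and candidate grid are hypothetical.

```python
import math


def shifted_column(x, y, shear):
    """Column of pixel (x, y) after undoing a shear of `shear`, rounded half up."""
    return math.floor(x - shear * y + 0.5)


def projection_energy(points, shear):
    """Sum of squared column counts after undoing `shear`.

    Upright strokes pile their ink into few columns, which makes this large.
    """
    counts = {}
    for x, y in points:
        col = shifted_column(x, y, shear)
        counts[col] = counts.get(col, 0) + 1
    return sum(c * c for c in counts.values())


def estimate_shear(points, candidates):
    """Pick the candidate shear whose unsheared projection is most peaked."""
    return max(candidates, key=lambda s: projection_energy(points, s))


def unshear(points, shear):
    """Return ink pixel coordinates with the estimated shear removed."""
    return [(shifted_column(x, y, shear), y) for x, y in points]


# Two vertical strokes at x=10 and x=20, slanted with a shear factor of 0.5.
points = [(x0 + y // 2, y) for x0 in (10, 20) for y in range(20)]
candidates = [i / 4 for i in range(-4, 5)]  # -1.0 to 1.0 in steps of 0.25
best = estimate_shear(points, candidates)
straightened = unshear(points, best)
```

Whether this preprocessing actually helps a neural network, which is the question being asked here, is a separate matter that the sketch doesn't settle.
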
How did this project help you to learn computer vision? I'd also like to write a basic captcha solver as an intro, but superficially this project just looks like a dump of generated code.
What do you mean by "generated code"? All of the code in the linked GitHub repo was written by me, with the assistance of a couple friends who helped here and there, but didn't request to be credited.

I learned a lot because I had to do a ton of research and experimentation (fancy word for trial-and-error) to write the code and have it work as I expected.

I think there's been a misunderstanding. I didn't realize you were the author of the linked article, and read the following exchange to mean you'd found the code at https://github.com/drunohazarb/4chan-captcha-solver to be a helpful introduction:

> > I've built 3 iterations of captcha solvers for that crappy website based on https://github.com/drunohazarb/4chan-captcha-solver/issues/1

> This project was really my first decent introduction to computer vision and machine learning

I see now that your code is linked from the article, and looks really informative - thanks for sharing!

The article mentions that they changed the number of characters in the captcha after the author trained the model, and the model could still solve it.
Changing the number of characters barely registers as a change. They merely need to use a variety of fonts (according to the post, there are currently only 15 possible glyphs in total, which is tiny) and it would vastly increase the difficulty of generating the training set, and probably hurt model accuracy by a lot. Not to mention more complex backgrounds. What’s seen here is an ancient and relatively simple form of captcha.