Hacker News
by Usu 997 days ago
I'd be interested in knowing how good it is at solving visual captchas, do we foresee a huge rise in automated bypasses?

At the moment, solving CAPTCHAs is cheaper using humans than using the GPT-4 API.
If true, this is wild.

I suppose a human could spend 10 seconds per Captcha, so they could do 360 per hour. Add some overhead for not operating at peak performance every minute of every hour and call it 250. Let's say you can hire someone for $2 an hour; that works out to a bit under a penny per Captcha.

I don't think OpenAI has published pricing for GPT-4 Vision yet, but if we assume it's on par with GPT-4 and uses only 1000 of the 8000 possible tokens to process an image, that's 3 cents per Captcha.

It doesn't seem completely unreasonable that, at scale, humans may actually be cheaper than LLMs at this point. My mind is a little blown.
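
For anyone who wants to poke at the numbers, here is a rough sketch of the arithmetic above. Every figure is a guess from this comment, not a measurement: the $2 is assumed to be an hourly wage, and GPT-4 Vision is assumed to be priced like GPT-4 8K input ($0.03 per 1,000 tokens), since no Vision pricing had been published.

```python
# Back-of-the-envelope comparison; all inputs are assumptions from the comment.
SECONDS_PER_CAPTCHA = 10
PEAK_PER_HOUR = 3600 // SECONDS_PER_CAPTCHA      # 360 solves/hour at full speed
REALISTIC_PER_HOUR = 250                         # after allowing for overhead
HOURLY_WAGE_USD = 2.00                           # assumed to mean $2 per hour

human_cost = HOURLY_WAGE_USD / REALISTIC_PER_HOUR    # about $0.008 per solve

# Assumption: Vision is billed like GPT-4 8K input tokens.
GPT4_USD_PER_1K_TOKENS = 0.03
TOKENS_PER_IMAGE = 1000                          # guess: 1000 of the 8000 tokens

gpt4_cost = GPT4_USD_PER_1K_TOKENS * TOKENS_PER_IMAGE / 1000   # about $0.03

print(f"human: {human_cost * 100:.1f} cents per Captcha")
print(f"GPT-4: {gpt4_cost * 100:.1f} cents per Captcha")
print(f"GPT-4 is roughly {gpt4_cost / human_cost:.2f}x the human cost")
```

Under these assumptions the human comes out a bit under a penny per solve, so the gap is a little under 4x. Changing the wage or the token count moves the result a lot, which is why it's only a sketch.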

You'd be surprised, or perhaps horrified, by how cheap (self-proclaimed) human-based captcha solving services are.

If you just search for "captcha solving service", the first few results that come up offer 1000 solves of text-based captchas for <= $1 USD (puzzle / JS browser challenge captchas cost much more).

Whether these are actually human-based or just impressive OCR services, it seems like they are still much more cost-effective than GPT-4 for now.
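
To put a rough number on "much more cost-effective": a quick sketch, taking $1 per 1000 solves as the upper bound quoted above and borrowing the 3 cents per Captcha GPT-4 estimate from the parent comment (itself an assumption, not published pricing).

```python
# Upper-bound price quoted for text-based captchas; actual prices may be lower.
SERVICE_USD_PER_1000_SOLVES = 1.00
service_cost = SERVICE_USD_PER_1000_SOLVES / 1000    # $0.001 per solve

# Assumed GPT-4 cost per Captcha, taken from the estimate earlier in the thread.
GPT4_EST_USD_PER_SOLVE = 0.03

ratio = GPT4_EST_USD_PER_SOLVE / service_cost
print(f"service: {service_cost * 100:.1f} cents per solve")
print(f"GPT-4 estimate is about {ratio:.0f}x the service price")
```

That comes out to about 30x for text-based captchas only; the puzzle and JS challenge types are priced higher, so the gap there would be smaller.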

I imagine they are a mix.
These usually work by presenting an existing captcha to another human who doesn’t even know they’re solving the captcha. For example, sites hosting pirated content serve fake captchas as a way to make money.
We have just added a section on this! TL;DR: GPT-4V isn't great at this task at the moment :)
Back when they leaked it via a Discord bot, I found it worked better when you ask it to first describe each box.

Without doing that: https://cdn.discordapp.com/attachments/964175221089259591/11...

With it: https://cdn.discordapp.com/attachments/964175221089259591/11...

(though it's only one example so it could be coincidence)

Is it possible they hobbled it a bit? I know CAPTCHA solving was one of the reasons they delayed the roll-out of this feature.
Given that it fails by hallucinating the structure of the challenge instead of refusing to solve a CAPTCHA, I doubt they've intentionally reduced the capability. Although the example in your sibling comment implies it should have enough information to do it.