There was a slashdot thread or two on CAPTCHA's a few weeks ago, but
no one really offered anything very helpful about what is going on out
there.
Lots of quibbling over how certain MSFT entities practice it in a
substandard way, but for the most part that's just /. being /. However,
when I looked at the example CAPTCHA images, they were trivially
straightforward letters for OCR'ing, relatively lined up and well separated.
Displaying in different colors, including pastels, really screws OCR
up, but it's not necessary. The key is to overlap the characters
somewhat, with the characters tilted and rotated.
I agree with the suggestion to just generate these images with a
random number of characters (from three to five, for example), placed
so that at least two of the characters overlap, and to store a set of
them on the IFS with the answers in a file keyed by the file name, as
suggested (by Nathan, I think).
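For what it's worth, here is a rough Python sketch of the planning half of that idea: pick three to five characters, place each one so it overlaps its neighbour, give it a random tilt, and keep the answers keyed by file name. Everything in it is my own assumption for illustration (the function names, the pixel sizes, the tilt range, the `captcha_0000.png` naming, the CSV layout), and it does not draw the images; a separate rendering step with whatever graphics library is handy would consume the glyph positions.

```python
import csv
import random
import string

# All of these values are illustrative assumptions, not from the thread.
GLYPH_WIDTH = 30   # assumed width of one rendered character, in pixels
MIN_OVERLAP = 4    # smallest horizontal overlap between neighbours
MAX_OVERLAP = 12   # largest horizontal overlap between neighbours
MAX_TILT = 35      # maximum rotation in degrees, either direction


def plan_captcha(rng):
    """Pick 3 to 5 letters and place each so it overlaps its left neighbour."""
    count = rng.randint(3, 5)
    text = "".join(rng.choice(string.ascii_uppercase) for _ in range(count))
    glyphs = []
    x = 0
    for ch in text:
        angle = rng.randint(-MAX_TILT, MAX_TILT)
        glyphs.append({"char": ch, "x": x, "angle": angle})
        # Advance by less than a full glyph width so the next one overlaps.
        x += GLYPH_WIDTH - rng.randint(MIN_OVERLAP, MAX_OVERLAP)
    return text, glyphs


def overlapping_pairs(glyphs):
    """Count adjacent glyph pairs whose horizontal extents intersect."""
    return sum(
        1 for a, b in zip(glyphs, glyphs[1:]) if b["x"] < a["x"] + GLYPH_WIDTH
    )


def build_answer_index(n, seed=None):
    """Plan n CAPTCHAs and return {file name: {"answer", "glyphs"}}."""
    rng = random.Random(seed)
    index = {}
    for i in range(n):
        text, glyphs = plan_captcha(rng)
        index[f"captcha_{i:04d}.png"] = {"answer": text, "glyphs": glyphs}
    return index


def write_answers(index, path):
    """Write one 'file name,answer' row per planned image."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        for name, entry in index.items():
            writer.writerow([name, entry["answer"]])
```

Checking a reply is then just a lookup of the served file name in that answers file. The set can be regenerated whenever it starts to look stale.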
My vague understanding from lots of /. references is an implication
that CAPTCHA's are forwarded to very, very low paid people assisting URL
spammers (not necessarily worded that way elsewhere, my description) to
reply to the CAPTCHA's. Given that most spamming attempts come from bot
networks of random owned PC's, and that responses are fairly quick, it
is inconceivable to me that OCR software algorithms have been downloaded
to owned bot PC's or that the CAPTCHA images are forwarded and OCR'd
elsewhere.
In any event, most CAPTCHA's are not OCR'able anyway when, as I
suggest here, the characters overlap and/or are very difficult to
separate from the background.
Nathan's suggestion is really quite simple and the way to go.
rd
Nathan Andelin wrote:
Quoting from the Wikipedia article on CAPTCHA:
"Breaking a CAPTCHA generally requires some effort specific to that
particular CAPTCHA implementation, and an abuser may decide that the
benefit granted by automated bypass is negated by the effort required
to engage in abuse of that system in the first place."
With that quote in mind, a hacker might be more willing to spend the time to break a CAPTCHA algorithm offered via a popular web service, thinking that it would automatically compromise all the sites that relied on that particular web service. If that matters.
Nathan.