Re: CAPTCHA image validation in web form -- WEB400

I spent way too much time last night reading about this, but I have a vested interest: phpBB CAPTCHAs are clearly broken, and I get successful automated registrations from bots around the clock. The bots are PCs owned by people who don't have a clue that their PC is compromised and that everything they do is monitored by thieves. But to each their own.

I block registrations with email domain blocking (web mail and non-US domains used by offending parties), which is OK for my little site of no interest outside the US, but obviously not a solution for most other sites.
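A minimal sketch of that kind of domain blocking, in Python purely for illustration (phpBB itself is PHP). The blocklist entries and the function name is_blocked_email are made up for the example; a real list would come from your own registration logs.

```python
# Hypothetical blocklist: one web-mail domain and one whole top-level domain.
# Both names are reserved placeholders, not real offenders.
BLOCKED_DOMAINS = {"webmail.example", "invalid"}


def is_blocked_email(address: str) -> bool:
    """Return True if the address's domain, or any parent domain, is blocked."""
    _, at_sign, domain = address.strip().lower().rpartition("@")
    if not at_sign or not domain:
        return True  # malformed address: reject it outright
    labels = domain.split(".")
    # For "mail.webmail.example" check that name, then "webmail.example",
    # then "example", so blocking a parent domain covers its subdomains.
    return any(".".join(labels[i:]) in BLOCKED_DOMAINS for i in range(len(labels)))
```

Checking parent domains as well as the exact domain is what lets a single entry cover a whole country-code or web-mail domain.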

I block other types of attacks, such as dictionary attacks and various URL content attacks, by the IP address ranges of the ISPs used by offending parties. Again, these are non-US ranges, so it hurts no one but the attackers, but it is not a solution for very many others.
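Blocking by address range can be sketched with Python's standard ipaddress module. The ranges below are documentation-only networks standing in for real ISP ranges, and is_blocked_ip is a hypothetical helper name.

```python
import ipaddress

# Placeholder ranges (reserved for documentation), not real ISP allocations.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked_ip(remote_addr: str) -> bool:
    """Return True if remote_addr falls inside any blocked network."""
    try:
        ip = ipaddress.ip_address(remote_addr)
    except ValueError:
        return True  # unparseable address: treat as hostile
    return any(ip in network for network in BLOCKED_NETWORKS)
```

In practice this check is usually done in the web server or firewall configuration rather than in application code; the sketch only shows the matching logic.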

Anyway, looking into all this: the phpBB CAPTCHAs are broken by bots, but I saw that the images are not static files. When you hover over the CAPTCHA, no URL (or whatever) is displayed at the bottom of the browser window. I looked at the code, and CSS is activating some PHP code that generates a PNG and incorporates it in the display.

So the short of it is that there is no linkage between solution and resource identifier in the case of phpBB, and it is one of the highest-priority targets because of the number of sites running it. (Again, what's sort of funny here is that a custom site could simply put up "are you a person?" with "no" as the first answer and screen out 99.9% of automated use simply because of obscurity. :)

That was stated over and over out there in different ways. If you're high-profile software, they will go to great lengths to come up with automated bypasses. If not, it will mostly be people hitting you, bot-busting solutions won't apply, and what bots do hit you will be deterred by the simplest challenge question.
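As a rough sketch of how simple such a challenge can be (Python for illustration only; the field name are_you_a_person and the function passes_challenge are invented for this example):

```python
# Hypothetical challenge: "no" is listed first, so it is the default choice.
CHALLENGE = {
    "question": "Are you a person?",
    "choices": ["no", "yes"],
    "expected": "yes",
}


def passes_challenge(form: dict) -> bool:
    """Accept the submission only if the non-default answer was chosen."""
    # A missing field counts as the default (first) choice, i.e. "no".
    answer = form.get("are_you_a_person", CHALLENGE["choices"][0])
    return answer.strip().lower() == CHALLENGE["expected"]
```

A generic bot that posts the form without touching the unfamiliar field, or leaves it at its default, fails the check. This relies entirely on obscurity, so it only holds up while nobody bothers to target the site specifically.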

The phpBB CAPTCHAs break because the characters are so simple: no rotation and no overlapping. It sounded like even relatively low-tech density algorithms were good enough to identify letters in the consistent display pattern phpBB used. That was version 2. They have come out with version 3, which has a somewhat better CAPTCHA, but I think I read that it's already broken too.

The customized CAPTCHAs that use graphics libraries to rotate and overlap characters basically have to be solved by people. Certainly that's what we'll be able to do with Apache code using Java graphics.

rd

Nathan Andelin wrote:

Joe Pluta wrote:
create a new symbolic link in the IFS to the image for each
request, and then send that symbolic link name. You'd run
a regular job to clear out old links.
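One way this suggestion might look, sketched in Python rather than on the IFS itself. The function names, the random-token naming scheme, and the age-based cleanup are assumptions made for the example, not details from the original proposal.

```python
import os
import secrets
import time


def publish_link(image_path: str, link_dir: str) -> str:
    """Create a one-off symbolic link to image_path and return the link name."""
    extension = os.path.splitext(image_path)[1]
    name = secrets.token_urlsafe(16) + extension  # unguessable per-request name
    os.symlink(os.path.abspath(image_path), os.path.join(link_dir, name))
    return name


def clear_old_links(link_dir: str, max_age_seconds: float) -> int:
    """Remove symbolic links older than max_age_seconds; return the count removed."""
    cutoff = time.time() - max_age_seconds
    removed = 0
    for entry in os.scandir(link_dir):
        # Look at the link's own timestamp, not that of the image it points to.
        if entry.is_symlink() and entry.stat(follow_symlinks=False).st_mtime < cutoff:
            os.unlink(entry.path)
            removed += 1
    return removed
```

Each request would call publish_link and send the returned name to the browser, while clear_old_links would run from the scheduled cleanup job.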

That's a pretty good idea, but every file has a "signature", so to speak. A clever bot author wouldn't need to rely on a file name. And remember that a learning bot would be downloading and duplicating your entire image library and description cross-reference table, and could be assigning its own name to the image. It would just need to match up the signature. A signature might consist of a combination of factors such as total # of bytes, plus various byte comparisons interspersed throughout the stream.
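To make the point concrete, here is a rough Python sketch of such a signature: total byte count plus a handful of bytes sampled at evenly spaced offsets. The sample count and the function name signature are arbitrary choices for the example.

```python
def signature(data: bytes, samples: int = 8) -> tuple:
    """Return (total length, a few bytes sampled at evenly spaced offsets)."""
    size = len(data)
    if size == 0:
        return (0, b"")
    step = max(1, size // samples)
    sampled = bytes(data[i] for i in range(0, size, step))[:samples]
    return (size, sampled)


# Stand-ins for two downloaded images and the descriptions a bot has learned.
image_a = bytes(range(256)) * 4
image_b = bytes(reversed(range(256))) * 4
known = {signature(image_a): "cat", signature(image_b): "dog"}

# The same bytes fetched later under a brand-new link name still match.
refetched = bytes(image_a)
answer = known.get(signature(refetched))
```

The file name never enters the computation, which is why renaming or relinking the image does not help as long as the bytes served stay identical.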

It's not that I have a better idea. I don't, but I wish I did. The TicketMaster case kind of jarred me into the reality that a bot author may have a strong financial interest in overcoming a CAPTCHA.

Nathan.