EasyCaptcha() For Multi-Image CAPTCHA Creation

Posted February 11, 2008 at 7:43 AM by Ben Nadel

Tags: ColdFusion

I was thinking of functions to add to the imageUtils.cfc ColdFusion image manipulation component when I started to think about CAPTCHA. CAPTCHA is a cool thing, but one of the problems with it is that in order to fool bots, sometimes you have to make it unreadable even to humans. I wondered if the fact that it is a single image has anything to do with bots' and spiders' ability to beat the CAPTCHA. To play around with this idea, I have created a method called EasyCaptcha(). This method takes the string you want to use in a CAPTCHA image, creates a separate image for each letter, and writes each one to the browser's response buffer. This way, I figured it might be very easy for humans to read, but more difficult for spiders and bots to figure out which images they need to run character recognition (OCR) on.

In addition to the text you are going to use, EasyCaptcha() also optionally takes the font size, the canvas background color, and the text color:

    <cffunction
        name="EasyCaptcha"
        access="public"
        returntype="array"
        output="true"
        hint="Outputs a CAPTCHA problem using a series of images rather than one image.">

        <!--- Define arguments. --->
        <cfargument
            name="Text"
            type="string"
            required="true"
            hint="The text that will be output in the CAPTCHA."
            />

        <cfargument
            name="FontSize"
            type="string"
            required="false"
            default="20"
            hint="The font size to use for the CAPTCHA."
            />

        <cfargument
            name="BackgroundColor"
            type="string"
            required="false"
            default="##FAFAFA"
            hint="The canvas color."
            />

        <cfargument
            name="Color"
            type="string"
            required="false"
            default="##333333"
            hint="The drawing (Text) color."
            />

        <!--- Define the local scope. --->
        <cfset var LOCAL = {} />

        <!--- Set the font properties. --->
        <cfset LOCAL.FontProperties = {
            Font = "Courier New",
            Size = ToString( ARGUMENTS.FontSize ),
            Style = "normal"
            } />


        <!---
            Create the array in which we are going to store the
            individual images. Each image will represent a single
            character in the CAPTCHA.
        --->
        <cfset LOCAL.Images = [] />


        <!--- Loop over the characters in the given text. --->
        <cfloop
            index="LOCAL.CharacterIndex"
            from="1"
            to="#Len( ARGUMENTS.Text )#"
            step="1">

            <!--- Get character. --->
            <cfset LOCAL.Character = Mid(
                ARGUMENTS.Text,
                LOCAL.CharacterIndex,
                1
                ) />

            <!--- Get character dimensions. --->
            <cfset LOCAL.Dimensions = THIS.GetTextDimensions(
                LOCAL.Character,
                LOCAL.FontProperties
                ) />

            <!---
                Create a new image, padded 3 pixels on each side
                and half again as tall as the character.
            --->
            <cfset LOCAL.Image = ImageNew(
                "",
                (LOCAL.Dimensions.Width + 6),
                Ceiling( LOCAL.Dimensions.Height * 1.5 ),
                "rgb",
                ARGUMENTS.BackgroundColor
                ) />

            <!--- Set the drawing color. --->
            <cfset ImageSetDrawingColor(
                LOCAL.Image,
                ARGUMENTS.Color
                ) />

            <!--- Draw character on canvas. --->
            <cfset ImageDrawText(
                LOCAL.Image,
                LOCAL.Character,
                3,
                Ceiling( LOCAL.Dimensions.Height * 1.1 ),
                LOCAL.FontProperties
                ) />

            <!--- Add the image to the return array. --->
            <cfset ArrayAppend(
                LOCAL.Images,
                LOCAL.Image
                ) />

        </cfloop>


        <!---
            Create a local buffer to which to save the images. We
            are doing this so that we can strip out the white space
            between the individual images to make a cleaner output.
        --->
        <cfsavecontent variable="LOCAL.Buffer">

            <!--- Loop over images array. --->
            <cfloop
                index="LOCAL.Image"
                array="#LOCAL.Images#">


                <!--- Write character to response buffer. --->
                <cfimage
                    action="writetobrowser"
                    source="#LOCAL.Image#"
                    format="gif"
                    />

            </cfloop>

        </cfsavecontent>

        <!--- Strip out all white space that precedes a tag. --->
        <cfset LOCAL.Buffer = Trim(
            REReplace(
                LOCAL.Buffer,
                "\s+(?=<)",
                "",
                "all"
                )
            ) />

        <!--- Write the buffer out to the response. --->
        <cfset WriteOutput( LOCAL.Buffer ) />


        <!--- Return image array. --->
        <cfreturn LOCAL.Images />
    </cffunction>
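
The function above calls THIS.GetTextDimensions(), which is not listed in this post. For readers who want to run the code, here is a rough sketch of what such a helper might look like. It is an assumption, not the actual imageUtils.cfc implementation: it measures the text with java.awt.FontMetrics and returns a struct with Width and Height keys, which is the shape EasyCaptcha() expects. The style name mapping and the 1x1 scratch canvas are illustrative choices.

```cfm
<cffunction
    name="GetTextDimensions"
    access="public"
    returntype="struct"
    output="false"
    hint="Sketch only: measures text using java.awt.FontMetrics.">

    <cfargument name="Text" type="string" required="true" />
    <cfargument name="FontProperties" type="struct" required="true" />

    <cfset var LOCAL = {} />

    <!--- Assumed mapping of style names to java.awt.Font style codes. --->
    <cfset LOCAL.StyleCodes = {
        normal = 0,
        bold = 1,
        italic = 2,
        bolditalic = 3
        } />

    <!--- Fall back to plain (0) for any unrecognized style. --->
    <cfset LOCAL.Style = 0 />
    <cfif StructKeyExists( LOCAL.StyleCodes, ARGUMENTS.FontProperties.Style )>
        <cfset LOCAL.Style = LOCAL.StyleCodes[ ARGUMENTS.FontProperties.Style ] />
    </cfif>

    <!--- Build the Java font from the given properties. --->
    <cfset LOCAL.Font = CreateObject( "java", "java.awt.Font" ).Init(
        JavaCast( "string", ARGUMENTS.FontProperties.Font ),
        JavaCast( "int", LOCAL.Style ),
        JavaCast( "int", ARGUMENTS.FontProperties.Size )
        ) />

    <!--- Use a 1x1 scratch canvas just to get a graphics context. --->
    <cfset LOCAL.Canvas = ImageGetBufferedImage( ImageNew( "", 1, 1, "rgb" ) ) />
    <cfset LOCAL.Graphics = LOCAL.Canvas.createGraphics() />
    <cfset LOCAL.Metrics = LOCAL.Graphics.getFontMetrics( LOCAL.Font ) />

    <!--- Measure the text, then release the graphics context. --->
    <cfset LOCAL.Width = LOCAL.Metrics.stringWidth( JavaCast( "string", ARGUMENTS.Text ) ) />
    <cfset LOCAL.Height = LOCAL.Metrics.getHeight() />
    <cfset LOCAL.Graphics.dispose() />

    <cfset LOCAL.Dimensions = {
        Width = LOCAL.Width,
        Height = LOCAL.Height
        } />

    <cfreturn LOCAL.Dimensions />
</cffunction>
```

Because the height reported here is the font's full line height, the 1.5 and 1.1 multipliers in EasyCaptcha() may need some tuning if the real helper measures height differently.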

Note that the EasyCaptcha() ColdFusion user defined function relies on another UDF, GetTextDimensions(), to figure out how big to make the individual letter images. The images are written to the browser with no white space between them; I figured this would allow for the most flexibility in styling. To demonstrate, I have created a little test page that outputs EasyCaptcha() using two different styles:

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html>
    <head>
        <title>EasyCaptcha() Demo</title>

        <style type="text/css">

            p#captcha-one {}

            p#captcha-two img {
                background-color: #E0E0E0 ;
                border: 2px solid #000000 ;
                margin-right: 10px ;
                padding: 2px 2px 2px 2px ;
                }

        </style>
    </head>
    <body>

        <p id="captcha-one">
            <cfset EasyCaptcha(
                "MadSexy",
                18,
                "##660000",
                "##FFFFFF"
                ) />
        </p>

        <p id="captcha-two">
            <cfset EasyCaptcha(
                "MadSexy",
                18,
                "##660000",
                "##FFFFFF"
                ) />
        </p>

    </body>
    </html>

Running the above code, we get the following output:


 
 
 

 
[Image: EasyCaptcha() Output - Creating CAPTCHA Using Multiple Images]
 
 
 

As you can see, we can present the CAPTCHA to the user as if it were a single image, or we can style it so that it looks like individual images. I am not sure if one style or the other will be more effective at defeating bots. Heck, I am not sure if this will be effective at defeating bots at all, but I thought it would be a cool experiment. My hope is that this can create a CAPTCHA that is really easy for web users to understand but much more difficult for bots to decipher.




Reader Comments

Feb 11, 2008 at 7:38 AM
211 Comments

While it looks cool, I don't think it's a real "CAPTCHA" per se. There has to be some kind of background noise or retarded letter distortion that makes you guess at the word for at least 30 minutes.


Feb 11, 2008 at 8:29 AM
11,238 Comments

Ha ha, 30 minutes :) I have seen CAPTCHA that I couldn't solve even with repeated bouts of incorrect submissions.


Feb 11, 2008 at 8:52 AM
170 Comments

@Ben:

The reason CAPTCHA uses all the background noise and tilted letters is to try to prevent OCR programs from reading the images. Of course, supposedly spammers have now built code that can bypass some of the more common CAPTCHA images.

Anyway, splitting the images is interesting, but I wonder if splitting the images up in the *middle* of each letter might be more effective.

Granted, a computer could still possibly stitch the image back together, but you might be able to obfuscate that enough. Besides, if it's not a widely adopted CAPTCHA implementation, the odds of a spammer writing code to circumvent it are very slim.


Feb 11, 2008 at 9:00 AM
11,238 Comments

@Dan,

I considered the splitting up of images mid-character at first. I was actually thinking of writing the image, cutting it up into something like 10x10 pixel images, and then putting a pixel of space between each image. But then I decided to just start out simple and see what people thought.

But I agree, as long as it's not a popular CAPTCHA method, the chances that someone will take the time to write the OCR algorithm, especially for us bloggers, are slim.


Feb 11, 2008 at 9:12 AM
211 Comments

You underestimate the sheer power of being able to spam blogs with links to porn sites, etc. This is an SEO war and bloggers are unfortunately aiding and abetting.

I read in the news once, I believe on MSNBC.com, that spammers were passing around this little porn widget. A little stripper would dance. Then she would stop and a little message would appear: "If you'd like her to continue, type the phrase in the box." What was it? It was a CAPTCHA. Little did the users know that they were taking part in a social engineering scheme, helping spammers crack CAPTCHA by building an image library.

Article:
http://www.msnbc.msn.com/id/21566341/


Feb 11, 2008 at 11:06 AM
21 Comments

@Todd,

I hadn't heard about that scam! You have to admit that's rather clever. If your software can't beat the Turing test, then you just trick some humans into helping you. Could be a great plotline on The Sarah Connor Chronicles!

In all seriousness, any individual CAPTCHA technique is only going to be effective for a limited time, until spammers figure out how to circumvent it. Which requires us to continually invent new techniques. It's an arms race, and new ideas like Ben's are exactly what we need to keep the other side at bay.

Rock on, Ben!


Feb 11, 2008 at 5:33 PM
11,238 Comments

@Todd,

That article is bananas!

@Dave,

It's not too hard to come up with something that will beat a bot... the problem is that so often it ALSO beats humans :) I have definitely come up against several CAPTCHA-style problems that I could not seem to get. The trick is to make something that is easy for the human brain and very hard for the computer one.


Feb 13, 2008 at 4:17 PM
28 Comments

Absolutely, the trick is to make something that is easy for the human brain and very hard for the computer one. Sadly, one of the great purposes of software development and microchip development is to make a computer brain that works exactly like a human's does. There are lots of little advances that we all think are cool and useful-- such as OCR software for scanning documents, stitching pictures together to make virtual 3-D tours, or even object recognition for baggage scanners at airports-- which are in turn useful to those who want to make bots appear to be human. I'm not sure where the balance will come out, personally.

Now if only someone could require all bots to be built with Asimov's 3 Laws of Robotics...


Feb 29, 2008 at 12:12 PM
1 Comments

Hahah that's a captcha? Bitmap comparison defeats it, OCR defeats it. There is nothing hard about it.


Mar 6, 2008 at 8:26 AM
211 Comments

btw, I came across an interesting blog post:
http://www.codinghorror.com/blog/archives/001067.html


Mar 6, 2008 at 8:33 AM
11,238 Comments

@Not Me,

I was hoping that it was the fact that it was multiple images that would make it hard, not the actual characters.


Aug 1, 2008 at 9:32 AM
1 Comments

I've been contemplating different CAPTCHA schemes for a while as well, including one similar to yours. Nice work on the project. I think it might be tougher if a) you split the letters, b) you picked different fonts for each letter, c) fuzzed the image somewhat and d) scrambled them but with numeric cues to let a human put them in the right order. I'm working on a project to attempt all of these and test their utility.


Aug 1, 2008 at 1:30 PM
11,238 Comments

@Chris,

Sounds like a cool project. Please post your results here (or a link to them) when you are done.


