Ben Nadel
On User Experience (UX) Design, JavaScript, ColdFusion, Node.js, Life, and Love.
I am the chief technical officer at InVision App, Inc - a prototyping and collaboration platform for designers, built by designers. I also rock out in JavaScript and ColdFusion 24x7.

Preventing Spam Bot Form Submissions With ColdFusion (Revisited)

By Ben Nadel on
Tags: ColdFusion

The other night, I was staring at my neighbor's lower-back tattoo when a new (to me) ColdFusion anti-spam technique popped into my head. I am sure that this is not new or unique; I just haven't explored it before. Until now, all of my ColdFusion-based Anti-Spam methodologies have resorted to CSS and TimeStamp-based tomfoolery. Well, I don't know if it was the graphical nature of my neighbor's sacral inkage, but suddenly I had the idea of using Images.

Images on a web page do not get loaded with the initial page request. Instead, as the HTML is rendering, the client makes subsequent requests to the server to load linked items like Images, JavaScript files, and Style Sheets. Can we use this multi-request paradigm to our advantage? I think so, at least with anyone who has a graphical browser.

The idea here is that Spam Bots probably never render the form pages they spam; they most likely just grab the HTML and then use that to programmatically submit the form. Because they never render the HTML page itself, they never make subsequent requests to the server to load images, stylesheets, and the like.

That's where this new plan comes into play. On the form page, we have an image tag that pings a ColdFusion page, which causes some ID-based flag to be set on the server. This ID is then also submitted with the form. When the form submission gets processed, you check to see if both the flag and the form-submitted ID exist (and match). If they do, then it proves the HTML page was rendered and that the submitter was most likely not a bot.

To demonstrate, let's first look at the ColdFusion template that causes the server side flag to be set:

    <!--- Kill extra output. --->
    <cfsilent>

        <!--- Param the URL id. --->
        <cfparam
            name="URL.id"
            type="string"
            default=""
            />


        <!--- Try to decrypt it and create a text file. --->
        <cftry>

            <!--- Decrypt the value. --->
            <cfset URL.id = Decrypt(
                URL.id,
                "that-is-tasty!",
                "CFMX_COMPAT",
                "HEX"
                ) />

            <!---
                Create the text file that will mark the form
                submission as valid. Just store it as an empty
                text file since all we are going to be doing
                is checking for its existence.
            --->
            <cffile
                action="write"
                file="#ExpandPath( './spam/#URL.id#.txt' )#"
                output=""
                />


            <!--- Catch any errors. --->
            <cfcatch>

                <!--- Something went wrong. --->

            </cfcatch>
        </cftry>


        <!--- Return an empty image. --->
        <cfheader
            name="content-length"
            value="0"
            />

        <cfcontent
            type="image/gif"
            reset="true"
            />

    </cfsilent>

As you can see, there is practically nothing going on here. When the request comes in, we decrypt the form ID in the URL scope. We then create an empty text file named after this form ID. This could just as easily have been an APPLICATION-scoped variable or something, but I figured this would be easier on the server's memory.
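For comparison, the APPLICATION-scoped alternative mentioned above might look something like the following. This is only a rough sketch: the function names `setFormFlag` and `hasFormFlag` and the `APPLICATION.formFlags` struct are hypothetical, and it assumes the templates run inside a named ColdFusion application so that the APPLICATION scope is available.

```cfml
<!--- Hypothetical helper: record that the image ping arrived for this ID. --->
<cffunction name="setFormFlag" returntype="void" output="false">
    <cfargument name="id" type="string" required="true" />

    <cflock scope="APPLICATION" type="exclusive" timeout="5">
        <!--- Create the flag container on first use. --->
        <cfif NOT StructKeyExists( APPLICATION, "formFlags" )>
            <cfset APPLICATION.formFlags = StructNew() />
        </cfif>

        <!--- Store the ping time so that stale flags could be purged later. --->
        <cfset APPLICATION.formFlags[ ARGUMENTS.id ] = Now() />
    </cflock>
</cffunction>

<!--- Hypothetical helper: check whether a flag exists for this ID. --->
<cffunction name="hasFormFlag" returntype="boolean" output="false">
    <cfargument name="id" type="string" required="true" />

    <cflock scope="APPLICATION" type="readonly" timeout="5">
        <cfreturn (
            StructKeyExists( APPLICATION, "formFlags" ) AND
            StructKeyExists( APPLICATION.formFlags, ARGUMENTS.id )
            ) />
    </cflock>
</cffunction>
```

In this version, ks_stats.cfm would call `setFormFlag( URL.id )` in place of the CFFILE write, and the form page would call `hasFormFlag( FORM.form_id )` in place of the file check. Flags would accumulate in memory until something removes them, which is the trade-off against the text-file approach.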

Now that we understand how the server-side, ID-based flag is being set, let's take a look at the Form page:

    <!--- Kill extra output. --->
    <cfsilent>

        <!--- Param form comments. --->
        <cfparam
            name="FORM.comments"
            type="string"
            default=""
            />

        <!---
            Param the form ID. This is the value that we
            will use to check proper form submission (to
            protect against SPAM form submissions).
        --->
        <cfparam
            name="FORM.form_id"
            type="string"
            default=""
            />

        <!--- Param the form submission. --->
        <cftry>
            <cfparam
                name="FORM.submitted"
                type="numeric"
                default="0"
                />

            <cfcatch>
                <cfset FORM.submitted = 0 />
            </cfcatch>
        </cftry>


        <!--- Check to see if the form has been submitted. --->
        <cfif FORM.submitted>

            <!---
                Check to see if the FORM is valid by checking to
                see if the ks_stats.cfm file spawned a file with
                the given ID.
            --->
            <cfif FileExists(
                ExpandPath( "./spam/#FORM.form_id#.txt" )
                )>

                <!---
                    The file exists. This confirms that the FORM
                    page was actually loaded and spawned a second
                    IMG request that then spawned this text file.
                    This is probably NOT a spam bot.
                --->
                <cflocation
                    url="confirm.cfm"
                    addtoken="false"
                    />

            </cfif>

        </cfif>


        <!---
            If we have made it this far, then we are going
            to be showing the FORM again. Select a new form
            ID for this display.
        --->
        <cfset FORM.form_id = CreateUUID() />

        <!---
            Now that we have our form ID, let's encrypt it
            so that we don't have duplicate values in the body
            (that might be a detectable pattern for a BOT).
        --->
        <cfset FORM.encrypted_form_id = Encrypt(
            FORM.form_id,
            "that-is-tasty!",
            "CFMX_COMPAT",
            "HEX"
            ) />

    </cfsilent>

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html>
    <head>
        <title>ColdFusion Anti Form Spam Idea</title>
    </head>
    <body>

        <cfoutput>

            <form action="#CGI.script_name#" method="post">

                <!--- This will flag form submission. --->
                <input
                    type="hidden"
                    name="submitted"
                    value="1"
                    />

                <!--- This is the form ID. --->
                <input
                    type="hidden"
                    name="form_id"
                    value="#FORM.form_id#"
                    />


                <label for="comments">
                    Comments:
                </label>

                <textarea
                    id="comments"
                    name="comments"
                    cols="50"
                    rows="10"
                    >#FORM.comments#</textarea>


                <input type="submit" value="Submit Comments" />

            </form>


            <!---
                This is the image that we will use to make sure
                the HTML of the current form page actually renders.
                I am calling it "ks_stats" just to make it less
                obvious to prying eyes.
            --->
            <img
                src="ks_stats.cfm?id=#FORM.encrypted_form_id#"
                height="1"
                width="1"
                style="display: none ;"
                />

        </cfoutput>

    </body>
    </html>

If you look at the bottom of the page, you will see that I have an invisible IMG tag that pings our ks_stats.cfm file (the first file shown above) using an encrypted version of the form ID. I have called it ks_stats.cfm just to disguise it. I have also encrypted the ID so that it would be a harder pattern to pick up. This ping triggers the previously discussed server-side flag to be set. Once the form gets submitted, we then just check to see if the text file (our server-side flag) exists. If it does, then we are deciding that the submitter is NOT a bot. If it doesn't exist, then we are saying that the user IS a bot.
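One thing worth noting about the check as written: the submitted form ID goes straight into a file path, and the flag file is never removed, so one valid ID could in principle be submitted over and over. A slightly stricter check might look like the sketch below. It assumes form IDs always come from CreateUUID(); the function name `consumeFormFlag` and its `flagDirectory` argument are hypothetical and not part of the original demo.

```cfml
<!--- Hypothetical helper: validate the ID, then use up its flag file. --->
<cffunction name="consumeFormFlag" returntype="boolean" output="false">
    <cfargument name="formId" type="string" required="true" />
    <cfargument name="flagDirectory" type="string" required="true" />

    <cfset var flagPath = "" />

    <!--- Reject anything that is not in ColdFusion's UUID format. --->
    <cfif NOT IsValid( "uuid", ARGUMENTS.formId )>
        <cfreturn false />
    </cfif>

    <cfset flagPath = ARGUMENTS.flagDirectory & "/" & ARGUMENTS.formId & ".txt" />

    <!--- No flag file means the image ping never arrived. --->
    <cfif NOT FileExists( flagPath )>
        <cfreturn false />
    </cfif>

    <!--- Delete the flag so the same ID cannot be submitted twice. --->
    <cftry>
        <cffile action="delete" file="#flagPath#" />

        <cfcatch>
            <!--- The file vanished; another request likely used it first. --->
            <cfreturn false />
        </cfcatch>
    </cftry>

    <cfreturn true />
</cffunction>
```

The form page could then test `consumeFormFlag( FORM.form_id, ExpandPath( "./spam" ) )` instead of calling FileExists() directly. Flag files for forms that were rendered but never submitted would still need some kind of periodic cleanup.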

Of course, this is not fool-proof. If a user has their images turned off or they have a text-based browser, then they might be legitimate and yet still classed as a BOT. But then again, all ColdFusion anti-spam techniques that are not 100% content-based are going to have similar trade-offs. I am not saying that this is the best way to perform anti-spam functions. Heck, I am not sure this is even a GOOD way. All I am saying is that it occurred to me. And, chances are, if people are going to have restricted browsers, they are probably going to have JavaScript turned off before they start blocking images (that is just my theory).




Reader Comments

Not a bad tactic there. Although it might become quite a bit harder to figure out if you use it for a real graphic in the site layout. I just say that because if I were to reverse engineer your form submission process and I saw that image tag in the source, that would be the first place I'd explore when my process failed.

That aside, why not use a session variable on the receiving page? I'd imagine you could get some odd concurrency issues if you always wrote to the same file, and if you were to use different files for each post, you'd have to write a GC process as well.

Reply to this Comment

@Dustin,

Yeah, certainly you could do a SESSION variable as well. I try to keep this stuff as low-level as possible so that it is more flexible for anyone who would want to use it. Plus, I figure if someone DOESN'T:

- Have / accept cookies
- Have Javascript enabled

... then this tactic would still work. But also, not sure if this is even a good thing to do.

Reply to this Comment

Both. You're doing a fileWrite()/fileExists(). I don't think you're meaning for that to be part of the end design of the program, but still I think there's a performance impact there.

Reply to this Comment

You're forgetting accessibility - what if I have images turned off? what if I'm using a Braille reader (which may not bother with loading images)? :)

Reply to this Comment

How about just performing a serverside check using CFFormProtect, free and downloadable from Riaforge?

Since implementing it I haven't had a single Spambot getting through ;-)

Reply to this Comment

@Todd,

Yeah, the file writing / checking would have some overhead, but this could just as easily have been done with an APPLICATION scope and StructKeyExists(), which would be just this side of instantaneous. The idea was more about the multi-request nature of a graphical site - the actual implementation could be changed.

@Tom,

I am not forgetting about the blind... I am just calling them spammers :) I talked about the pros/cons above. I know this is not a perfect solution.

@Sebastiaan,

Yes, I have heard nothing but GOOD things about CFFormProtect. But I look at the code for it and it's sooo long. I know that it's probably the best way to go, but then I wouldn't have as much fun hacking my own stuff together :)

@Ciqala,

I like the pictures, but remember, the main idea here is to get the end user to think less, not more. Having to click pictures requires that 1) they understand what the different animals are called and 2) have to stop and think about it. I want less thinking.

Reply to this Comment

The nice thing about CFFormProtect is that it registers whether people have used a mouse to click in the form field or used a keyboard to type something. Furthermore, it introduces a hidden form field that purposefully has to be left blank to pass a server-side test. As most bots just grab the form fields and programmatically fill in ALL fields, including the hidden *empty* fields, they fail the test.

I myself for a long time wanted to implement something like a CAPTCHA to prevent spam-bots flooding me. But I felt it wasn't user-friendly (usability) nor accessible. A serverside check is the way to go, and when CFFormProtect was suggested to me by a friend - including a call to CFAkismet if you have an API-key, just flag the value in the config file - I implemented it instantly. 30 minutes later the first mails poured in reporting to me what the SPAM-bot had tried to post (with a full CF-dump of the submitted form info and then some - excellent as a back-up if someone's comment is mistakenly marked as spam).

So instead of trying to fix the front-end, implement the fix in the back-end ;-)

Reply to this Comment

@Matt,

I think Todd is confusing "wrong" with "Brilliant!".

Sweet! I finally did it! "Correct! You must be human." Took me a few tries ;)

Reply to this Comment

Hey Ben,

What about creating a UUID with every page load that contains a form. Store the UUID in a database and put the UUID in a hidden form field. When the form is submitted check and delete the UUID from the database or ignore the form submission. You could even insert a number of tries based on the page load or even a timeout through a timestamp.

This would only allow spammers to come to your site and manually submit as the html would have to load.

Reply to this Comment

I find that robust server-side form validation screens out most automated form submits. The bots usually trip on at least one item: valid email address, valid phone, valid zip, maxlength, required field, field type (e.g., numeric, date), etc. So the bot gets an error message from the server-side form validation, and the form is never submitted.

A required blank hidden field sounds like a good, easy, and unobtrusive thing to add. Combined with a required field, I wonder if bots would be smart enough to fill in the required field and not fill in the required blank field.
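For anyone who has not seen the blank hidden field trick, a minimal sketch might look like the following. This is not CFFormProtect's actual code; the field name `website_url` and the function `isHoneypotTripped` are hypothetical names chosen for illustration.

```cfml
<!--- In the form: a field that humans never see and should leave empty. --->
<div style="display: none ;">
    <label for="website_url">Leave this field empty:</label>
    <input type="text" id="website_url" name="website_url" value="" />
</div>

<!--- Hypothetical helper: any value in the hidden field suggests a bot. --->
<cffunction name="isHoneypotTripped" returntype="boolean" output="false">
    <cfargument name="formData" type="struct" required="true" />

    <cfreturn (
        StructKeyExists( ARGUMENTS.formData, "website_url" ) AND
        Len( Trim( ARGUMENTS.formData.website_url ) ) GT 0
        ) />
</cffunction>
```

The processing page could call `isHoneypotTripped( FORM )` before doing anything else and quietly discard the submission when it returns true.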

Reply to this Comment

@Adam,

Agreed. Since there are so many different browser capabilities out there, server side is really the only cross-browser compliant way to do validation.

Reply to this Comment

Why not just do something like this?
<cfset userAgent = "#CGI.HTTP_USER_AGENT#">
<cfif #find("Mozilla", userAgent)#>
<!--- Not a Bot --->
<cfelse>
<cflocation url="index.cfm?FuseAction=Main&m=0">
<cfabort>
</cfif>

Most bots I have seen don't have the USER_AGENT filled or have junk in it.

Reply to this Comment

Ben,

Understood, but from my sites and using this has stopped about 99% of the bots from hitting my forums.

Mike

Reply to this Comment

@Mike,

I will look into this. My only concern is that I also know there are some users whose firewalls will strip out all the CGI information from a request. So much of the trouble with FORM SPAM is not necessarily blocking the bots - it's trying NOT to block real life users.

Reply to this Comment

I have a very simple technique that works. It requires no JavaScript or complicated server weirdness, is fully accessible, and has low overhead. It does however require that the user fill out one field with a number, so it's good for anti-bot but not for humans. In three years, I have yet to receive *any* spam on any of the many sites that I have installed this on. Sites that were being bombarded with hundreds of spams daily suddenly became quiet, and good emails get through. It takes moments to install and has negligible overhead.

Here is the barebones version.

<!--- Generate a random number as a param (I use 4 digits) at the top of your form --->
<cfparam name="session.chk_rand" default="#NumberFormat(RandRange(0, 9999),'0000')#">

<!--- Place this field in your form, right next to the submit button --->
#session.chk_rand# enter this number here -></string><input name="spmchck" type="text" size="4" maxlength="4"/>

<!--- At the top of your form processing page (the send and/or insert function) --->
<cfparam name="form.spmchck" default="">
<cfif form.spmchck NEQ session.chk_rand>
<cfoutput>Some human readable validation message "Sorry, you need to fill in the following fields..."</cfoutput>
<cfset x = StructDelete(Session, "chk_rand")>
<cfabort>
</cfif>

All you are doing is this:

Generating a random number
Set it to a session variable
Display the number next to a field in the form, get the user to copy it over.
Check that the form.value and that the session.value are equal to each other.

If the two numbers equal each other, then it passes, if not, the session value with the random number is deleted (and thus the next attempt gives you a new random number), then the template is aborted. One could, if one were so inclined add logging and perhaps even a honey pot link or email. One could conceivably extend this to use alpha characters, or even behind the scenes arithmetic and so on.

A bot won't know it, and a human has only a very simple task to perform. And it changes every time the form is accessed, so even if they do it manually, it's labour intensive. Those with javascript enabled will get a lovely little message, and never have to see the server side validation. You can be more sophisticated about it, but this is basically the same notion of captcha.

As simple as it gets.

Reply to this Comment

Something odd happened to the code above. Reposted here with no bold tags

<!--- Generate a random number as a param (I use 4 digits) at the top of your form --->
<cfparam name="session.chk_rand" default="#NumberFormat(RandRange(0, 9999),'0000')#">

<!--- Place this field in your form, right next to the submit button --->
#session.chk_rand# enter this number here -><input name="spmchck" type="text" size="4" maxlength="4"/>

<!--- At the top of your form processing page (the send and/or insert function) --->
<cfparam name="form.spmchck" default="">
<cfif form.spmchck NEQ session.chk_rand>
<cfoutput>Some human readable validation message "Sorry, you need to fill in the following fields..."</cfoutput>
<cfset x = StructDelete(Session, "chk_rand")>
<cfabort>
</cfif>

Reply to this Comment

@Frank,

When I used your code, I get an error that the Element ck_rand is undefined in form. Any idea why I get this message? Thanks

Reply to this Comment
