The Anti-Spam Technique Used On My ColdFusion Job Board

By Ben Nadel

Published 2011-03-18 in ColdFusion — Comments (23)

Over the past couple of months, a few people have asked me how the anti-spam technique works on my ColdFusion job board. It's actually quite simple - I create a comma-delimited list of numbers and then I replace one of those numbers with an input field. When submitting the form, the user has to complete the set, filling in the missing number.

This isn't a particularly strong anti-spam approach; it doesn't check any of the content for spam-related information. If a user comes along and spams the site manually, this won't help you. It's simply an easy way to help fight automated form submission by spiders and bots.

Every time the page loads, we select a random number as our anti-spam value. Then, we create a key against which we can compare the user-submitted answer. To keep things as simple as possible, I am creating the key by adding the anti-spam value to a multiple of 113. This way, rather than using encryption and decryption, we can simply use the modulus operator to recover the original anti-spam value: since the value is always smaller than 113, it is exactly the remainder left over when the key is divided by 113.
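To make the arithmetic concrete, here is a minimal sketch using fixed numbers in place of the random ones. The values 7 and 12 are purely illustrative and are not taken from the job board itself.

```cfml
<cfscript>
	// Illustrative values only: the real page picks both with randRange().
	antiSpamNumber = 7;
	multiplier = 12;

	// The key that would be written into the hidden form field.
	antiSpamKey = ( ( 113 * multiplier ) + antiSpamNumber );

	// The anti-spam number is smaller than 113, so the remainder after
	// dividing the key by 113 is the anti-spam number again.
	recoveredNumber = ( antiSpamKey % 113 );

	writeOutput( "Key: #antiSpamKey#, recovered: #recoveredNumber#" );
</cfscript>
```

With these numbers the key is 1363 (1356 + 7), and 1363 % 113 gives back 7, which is what the user is expected to type into the list.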

  
	<!--- Param out form values. --->
	<cfparam name="form.submitted" type="boolean" default="false" />
	<cfparam name="form.name" type="string" default="" />

	<!--- Spam inputs. --->
	<cfparam name="form.key" type="numeric" default="0" />
	<cfparam name="form.value" type="string" default="" />


	<!--- Check to see if the form has been submitted. --->
	<cfif form.submitted>

		<!---
			Check to see if the anti-spam value is correct. The key
			that was submitted back was a multiple of 113 plus our
			anti-spam value. As such, we can use the MODULUS operator
			to match the key against the submitted value.
		--->
		<cfif ((form.key % 113) neq form.value)>

			<!--- The key did not match. This is a bot. --->
			<p>
				Get your filthy paws off me you damned dirty bot!
			</p>

			<!--- For this demo, just quit the page request. --->
			<cfabort />

		</cfif>

	</cfif>


	<!--- ----------------------------------------------------- --->
	<!--- ----------------------------------------------------- --->


	<!---
		Select a new anti-spam value. We are going to present this
		number as an input box within a list of values between
		1 and 10. We don't want it to be on the outliers, so only
		select from an inner range of values.
	--->
	<cfset antiSpamNumber = randRange( 2, 9 ) />

	<!---
		When we pass the form through, we need to pass a key that will
		help us determine if the submitted anti-spam value is valid. To
		keep things as simple as possible, we'll just add our anti-spam
		value to a multiple of 113. This way, we can use the modulus
		operator to figure out the original value.
	--->
	<cfset antiSpamKey = (
		(113 * randRange( 1, 20 )) +
		antiSpamNumber
		) />

	<!---
		Now that we have our anti-spam number, let's create the list of
		values + input that we will display on the form. To do this, we
		are simply going to replace our anti-spam number with an INPUT
		field within the list of values.
	--->
	<cfset antiSpamList = replace(
		"1, 2, 3, 4, 5, 6, 7, 8, 9, 10",
		antiSpamNumber,
		"<input type='text' name='value' size='2' />",
		"one"
		) />


	<!--- ----------------------------------------------------- --->
	<!--- ----------------------------------------------------- --->


	<!--- Reset the output buffer and set the mime type. --->
	<cfcontent type="text/html" />

	<cfoutput>

		<!DOCTYPE html>
		<html>
		<head>
			<title>Simple List-Based Anti-Spam Technique</title>
		</head>
		<body>

			<h1>
				Simple List-Based Anti-Spam Technique
			</h1>

			<form action="#cgi.script_name#" method="post">

				<!--- Form submission flag. --->
				<input type="hidden" name="submitted" value="true" />

				<!--- Our anti-spam key. --->
				<input type="hidden" name="key" value="#antiSpamKey#" />

				<p>
					Please enter your name:<br />
					<input type="text" name="name" />
				</p>

				<!--- Our anti-spam input. --->
				<p>
					Please complete this list:<br />

					<!---
						This is our list of values including the one
						input value that we need.
					--->
					#antiSpamList#
				</p>

				<p>
					<input type="submit" value="Submit" />
				</p>

			</form>

		</body>
		</html>

	</cfoutput>
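One gap worth noting in the check above: a bot that posts key=0 and value=0 would pass, since 0 % 113 is 0. The following is a rough sketch of a slightly stricter check, not the code running on the job board. The function name isAntiSpamAnswerValid and its argument names are hypothetical, and the ranges it enforces (a key of at least 115, a value from 2 to 9) are assumptions derived from the randRange() calls in the demo.

```cfml
<cfscript>
	/**
	* Hypothetical helper (not part of the original demo). Returns true only
	* when both inputs are plain digits, fall inside the ranges the form can
	* actually generate, and the value matches the number encoded in the key.
	*/
	function isAntiSpamAnswerValid(
		required string submittedKey,
		required string submittedValue
		) {

		// Only accept short runs of digits before doing any math.
		if (
			! reFind( "^[0-9]{1,6}$", submittedKey ) ||
			! reFind( "^[0-9]$", submittedValue )
			) {
			return( false );
		}

		// The demo form never produces a key below (113 * 1) + 2, and the
		// hidden number is always between 2 and 9.
		if (
			( submittedKey < 115 ) ||
			( submittedValue < 2 ) ||
			( submittedValue > 9 )
			) {
			return( false );
		}

		// Same modulus comparison as the demo.
		return( ( submittedKey % 113 ) == submittedValue );

	}
</cfscript>
```

In the demo page, the inner condition could then read <cfif not isAntiSpamAnswerValid( form.key, form.value )>. This is still a bot deterrent only; it does nothing about manually submitted spam.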


Like I said before, this approach doesn't take form content into account. If a user fills out your form and submits content about where to find cheap computer RAM, this is not gonna help you. This kind of anti-spam is here only to help against automated, bot-driven form submissions. And, so far, it's been working out quite nicely.


Short link: https://bennadel.com/2149

Reader Comments

Paul Carney Mar 18, 2011 at 9:59 AM

7 Comments

As a math person, I like the approach! As you say, this won't stop everything, but definitely will help keep most of the riffraff out! :)

Jose Galdamez Mar 18, 2011 at 10:25 AM

50 Comments

If you want to keep out the manual spammers go for a CAPTCHA like this.

http://farm3.static.flickr.com/2174/2268237733_cda4a1dbb3.jpg

:-)

Jon Mar 18, 2011 at 10:28 AM

3 Comments

I like your new blog approach - video and code. Before you know it, you will be on Adobe TV.

Ben Nadel Mar 18, 2011 at 10:48 AM

16,066 Comments

@Paul,

Thanks - seems like a really light-weight way of doing this.

@Jose,

Ha ha ha, clearly the answer is... oh wait, gotta run to a meeting.

@Jon,

Thanks! I think the video + code is a really nice combination. Especially because the video allows you to tell the story around the code; plus, you get to start zooming in - so first you get the story, then you get the walk through, then you start to peer down into the code. You never have to mentally parse anything until you have the previous concept, so to speak.

Ross Mar 18, 2011 at 11:08 AM

7 Comments

I use strict JS field verification and CFFormProtect/Akismet - and I've obfuscated the links to the forms with JS too.

The result is that spambots can't bypass the verification, as they won't see the existence of a form without JS.

I still get the occasional spammed form from a human in China / Africa / wherever labour's cheap but it always gets caught by CFFormProtect so I get a spam report but the form recipient is none the wiser.

Job done.

Ben Nadel Mar 18, 2011 at 11:10 AM

16,066 Comments

@Ross,

Human spammers are so annoying. I have to admit, I've occasionally deleted valid comments because I couldn't determine if they were spam or not. Spam really has hurt the web.

Ross Mar 18, 2011 at 11:32 AM

7 Comments

"...I've occasionally deleted valid comments because I couldn't determine if they were spam or not."

I doubt it helps when natural English-speakers struggle with the Queen's. Not too much trouble on a site like this, I guess, but I immediately move on to the next post on sites where someone has tried using 'txt spk' or ALL CAPS.

Ben Nadel Mar 18, 2011 at 11:39 AM

16,066 Comments

@Ross,

I try not to concentrate on the language they use too much; I know we all come from different places. Typically, I get suspicious when the comment seems slightly off-topic AND the user is linking to a non-personal site (like a domain name registry or an IT-service company). At that point, I'll typically look at the other comments that they have left; and, if a pattern emerges, I delete the comment (and maybe some of the earlier ones as well).

It's really mentally stressful :)

Ross Mar 18, 2011 at 12:15 PM

7 Comments

I can imagine!

JF Mar 19, 2011 at 1:13 PM

9 Comments

We use what is called "honeypot captcha".

Since most spambots don't parse and evaluate CSS (they aren't that smart yet), we leave a hidden input field on the form. On form submission, we make sure it's empty. If it's filled, you can be pretty sure it's a bot.

All you need is a warning message in case a human has CSS disabled indicating that they should leave the field blank if they see it.

Worked so far and it's completely non intrusive to just about every user.

Randall Mar 21, 2011 at 4:41 PM

167 Comments

Great comments from all.

@Ben, I think you should weed out all spammers who use the words: labour, theatre, colour, flavour, honour, neighbour, rumour, labour, humour, favourite, honourable, behaviourism & saviour.

After all, FireFox sees them as misspelled (apparently "FireFox" is also a misspelling) so therefore they must be from a spammer, right Ross? ;-) j/k!

Randall 'Round here y'all speak 'Merican!"

Ben Nadel Mar 21, 2011 at 5:00 PM

16,066 Comments

@JF,

Agreed - I definitely make use of the honey pot approach. Actually, this blog comment form uses that approach.

@Randall,

Ha ha ha :)

WebManWalking Mar 23, 2011 at 12:42 PM

290 Comments

@Ben,

It's good that you're guarding against spam. Did you know that yours is the 15,683rd site in the world?

http://www.serverinsiders.com/domain/bennadel-com.html

Ben Nadel Mar 24, 2011 at 9:37 PM

16,066 Comments

@WebManWalking,

Ha ha ha, that's bananas. I don't even know how they can figure that stuff out (or if it's even meaningful).

WebManWalking Mar 24, 2011 at 10:17 PM

290 Comments

@Ben,

Here's a "bookmarklet" you can use to check out lots of sites with serverinsiders:

javascript:if((p=location.hostname.split(".")).length>=2)void(window.open("http://www.serverinsiders.com/domain/"+p[p.length-2]+"-"+p[p.length-1]+".html"));

Instructions:
(1) Edit > Copy the string above, the whole thing, from javascript: to semicolon inclusive.
(2) Open bookmarks, find where you want to put it, say New Bookmark.
(3) For the name, say "Server Stats", or some such.
(4) For the URL, Edit > Paste the string above.
(5) Save, close bookmarks.

Then browse to any site. If you'd like to see its serverinsiders stats, just select the Server Stats bookmark. The browser will open a new window with the URL formed to look at the stats for the site you were looking at.

Of course, you may get a 404 Not Found if they haven't accumulated stats for the site. And you can't go 3 levels deep (for example, there isn't any "myblog-blogspot-com.html"). Serverinsiders goes only 2 levels deep, including the top level domain name.

WebManWalking Mar 25, 2011 at 1:03 PM

290 Comments

@Ben,

P.S.: google-com.html says that Google's the number 1 site on the Internet. Not a surprise, but it seems somehow odd to know something like that. It reminded me of your "How would they know?" response.

It appears to be based on traffic analysis and hits.

Yahoo's number 4. Microsoft's number 20. bing's number 22. Apple's number 54. CNN's number 61. c|net's number 66. whitehouse.gov's number 3363. The more I try to think of sites that might be number 2, the further off I get.

Ben Nadel Mar 25, 2011 at 2:50 PM

16,066 Comments

@WebManWalking,

Good bookmarklet - Fandango is 849 :) Holy cow, AOL is 42... really? really? AOL? maybe in 1995 ;)

JF Mar 25, 2011 at 3:21 PM

9 Comments

@WebManWalking Facebook.com is number two.

WebManWalking Mar 25, 2011 at 8:43 PM

290 Comments

@JF: Makes sense. Thanks!

WebManWalking Mar 25, 2011 at 8:46 PM

290 Comments

@Ben: You're forgetting about the newbie, non-tech crowd. AOL still works for them.

What surprises the heck out of me is that I overheard the familiar "You've Got Mail!" sound at work today.

WebManWalking Mar 25, 2011 at 9:11 PM

290 Comments

@JF,

Hey I just noticed the Similar Traffic Domains on the left side! #4, groups.yahoo.com, proves that they DO go 3 deep, just not for everyone. (Even maps.google.com didn't rate having their own page. Oh well.)

So @JF and @Ben, here's another bookmarklet:

"Server Stats 3 Deep":

javascript:if((p=location.hostname.split(".")).length>=3)void(window.open("http://www.serverinsiders.com/domain/"+p[p.length-3]+"-"+p[p.length-2]+"-"+p[p.length-1]+".html"));

Pretty trivial modification, but tested.

Ben Nadel Mar 26, 2011 at 6:17 PM

16,066 Comments

@JF,

Ah Facebook - that makes sense. I can't believe I didn't think of that.

@WebManWalking,

Still blows my mind. I do know people who still use their AOL email. Makes me think of the Oatmeal:

http://theoatmeal.com/comics/email_address

... that guy is just brilliant :D

lichenplanus Aug 9, 2012 at 6:00 AM

1 Comments

You have a very inspiring way of exploring and sharing your thoughts. It is very uncommon nowadays, lots of sites and blogs having copy pasted or rewritten info. But here, no doubt, info is original and very well structured. Keep it up. !!

