Spiders Are Tricking My Session Management

Posted August 18, 2006 at 8:51 AM by Ben Nadel

Tags: ColdFusion

To cut down on the number of variables stored on the server, I try to turn off session management for spiders so that no session variables are created for them. I do this based on user agents and blacklisted IP addresses. Recently, however, I have been getting a slew of hits from what I assume are spiders that send ordinary browser user agents:

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)

Since I can't filter on that user agent, I thought I would blacklist the IP addresses, but it seems that the spider is sending a randomized remote address for each page request. The following IP addresses all came from some sort of crawler within two minutes:

24.34.69.164 (3 hits)
62.252.224.18
65.35.198.39 (2 hits)
65.188.255.128
66.180.121.138
67.181.55.113
67.81.170.135
68.252.44.67 (2 hits)
68.52.163.179 (2 hits)
68.80.35.248
69.112.175.221
69.206.239.189
70.123.212.52
71.72.88.167
72.227.146.87 (2 hits)
74.134.101.18
74.136.92.229
80.183.99.222
80.218.123.202
80.38.98.243 (3 hits)
80.99.78.112
82.156.133.68
84.29.108.242
89.156.52.247
89.98.20.160

I know that it was a crawler because every request had the same HTTP referer: my home page. Not all of the requested pages are linked from the home page, which means the referer was being set manually. This is so irritating! Now, I have dozens upon dozens of sessions being created on the server, each of which will last 20 minutes without ever being used twice. That is poor memory management.
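A minimal sketch of that referer check, written in Python rather than ColdFusion. The paths, the set of pages linked from the home page, the sample hits, and the function name are all assumptions made up for illustration; they are not taken from this site's logs.

```python
# Hypothetical site layout: the home page and the paths it actually links to.
HOME_PAGE = "/"
LINKED_FROM_HOME = {"/blog", "/about", "/contact"}


def forged_referer_hits(hits):
    """Return hits that claim the home page as referer but request a
    page the home page does not link to."""
    return [
        hit
        for hit in hits
        if hit["referer"] == HOME_PAGE and hit["path"] not in LINKED_FROM_HOME
    ]


# Made-up sample hits using documentation-only IP addresses.
hits = [
    {"ip": "192.0.2.1", "referer": "/", "path": "/blog"},
    {"ip": "192.0.2.2", "referer": "/", "path": "/archive/old-post"},
    {"ip": "192.0.2.3", "referer": "/blog", "path": "/archive/old-post"},
]

suspicious = forged_referer_hits(hits)
```

Here only the second hit is flagged: it names the home page as its referer, yet asks for a page that the home page never links to. A check like this can only raise suspicion; it cannot prove the request came from a spider.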

Why is the spider doing this? I suppose this is to stop people from serving up different content to spiders, but that is not my purpose. Having no session management does not serve different content. It just turns off certain server-side tracking. Uggg.
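For reference, the kind of user-agent and IP filter described at the top of this post can be sketched roughly as follows. This is Python, not the ColdFusion code running on this site, and the marker list, the blacklisted addresses, and the function name are assumptions chosen for illustration.

```python
# Hypothetical user-agent fragments and blacklisted addresses.
SPIDER_UA_MARKERS = ("googlebot", "slurp", "msnbot", "crawler", "spider")
BLACKLISTED_IPS = {"192.0.2.10", "192.0.2.11"}


def should_enable_session(user_agent: str, remote_addr: str) -> bool:
    """Return False for requests that look like spiders, True otherwise."""
    ua = user_agent.lower()
    # Known spiders usually identify themselves in the user agent.
    if any(marker in ua for marker in SPIDER_UA_MARKERS):
        return False
    # Fall back to addresses that were blacklisted by hand.
    if remote_addr in BLACKLISTED_IPS:
        return False
    return True
```

Calling `should_enable_session("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)", "24.34.69.164")` returns True, which is exactly the problem: a browser-like user agent combined with an address that changes on every request slips past both checks, so a session still gets created.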



Reader Comments

Oct 24, 2007 at 2:40 PM
6 Comments

Can you tell if any of those "sneaky" spiders ever look for robots.txt?

I've always wanted to mess around with mod_rewrite or something to funnel robots.txt through CF so I can better pin down requests coming from spiders.

Of course, that ASSUMES they even bother looking for robots.txt. If they change their IP address with every request, that would make it difficult too.


Oct 24, 2007 at 3:31 PM
11,314 Comments

I have no idea. I assume they don't even bother looking at it??


Apr 8, 2008 at 9:28 PM
1 Comments

80.38.98.243, that's my web


Jun 23, 2009 at 6:37 AM
34 Comments

I usually don't bother to set a session until the user is authenticated or has actually done something worthwhile to track. I do start sessions if an error occurs; once an error does occur, I refer to the current and referring URL and store all variables being posted or submitted, so that I can better understand WTF the user is doing to cause the error. ( It could be my fault... but I prefer to blame the user )

As for the user agent being the same that is a little bit odd. It could be websites that are taking screenshots of your page --> ref http://www.browsershots.org or something similar?

It could also very well be an attempt to take down your site by overloading it.

I know I'm making this longer than it should be, but a user on a local network (e.g. school, office, etc.) could be using software that changes the IP each time they go to a new page... I have used a similar program when I was in school... ( changing my grades xD )

But anyway, to ease your session management a little, you could always create a cookie for the user with a #createUUID()# or whatever and also store it in a database with a few other fields (e.g. IP, UUID, expires), and then check to see if it exists and is still valid. If it exists, you can continue to use the same session rather than make a new one. If it doesn't exist, you can always check to see if the user has cookies enabled via JavaScript and pass that to ColdFusion [ if I'm not mistaken, a user has to have cookies enabled for a session to work properly unless you're passing them through the URL ]
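A minimal sketch of the token check described in this comment, written in Python. An in-memory dict stands in for the database table, and the 20-minute lifetime, the function names, and the record layout are assumptions for illustration only.

```python
import uuid
from datetime import datetime, timedelta

# In-memory stand-in for the database table, keyed by token (assumption).
visitor_tokens = {}

# Assumed lifetime, mirroring the 20-minute session timeout from the post.
TOKEN_LIFETIME = timedelta(minutes=20)


def issue_token(ip, now):
    """Create a new token, record it with the IP and expiry, and return it."""
    token = str(uuid.uuid4())
    # The IP is stored for reference only; it is not compared on later hits.
    visitor_tokens[token] = {"ip": ip, "expires": now + TOKEN_LIFETIME}
    return token


def is_valid_token(token, now):
    """Return True if the token is on record and has not yet expired."""
    record = visitor_tokens.get(token)
    return record is not None and now < record["expires"]
```

On each request, the cookie value would be passed to `is_valid_token`; only when that returns True would the existing session be reused. A client that never sends the cookie back, as a spider typically would not, never presents a valid token.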

Well I have said enough so far... I think I will leave it at that. If I have left anything out just say something...

Hope I helped @ least 2%


Jun 23, 2009 at 8:23 AM
11,314 Comments

@Jody,

I definitely like the idea of storing multiple values to compare against on subsequent page hits. I would, however, probably just store that in the COOKIE and the SESSION to compare against; that way I don't have to hit the database each time.
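A rough sketch of that cookie-versus-session comparison, in Python rather than ColdFusion. The key names and the function name are hypothetical, and plain dicts stand in for the COOKIE and SESSION scopes.

```python
def cookie_matches_session(cookie, session, keys=("visitor_id", "user_agent")):
    """Return True only if every tracked key is present in both scopes
    and holds the same value in each."""
    for key in keys:
        cookie_value = cookie.get(key)
        session_value = session.get(key)
        # A missing value on either side counts as a mismatch.
        if cookie_value is None or session_value is None:
            return False
        if cookie_value != session_value:
            return False
    return True
```

Because both sets of values already travel with the request and the session, the comparison needs no database query.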

