Ben Nadel at BFusion / BFLEX 2009 (Bloomington, Indiana) with: Zach Stepek ( @zstepek )

Updated Session Management And Web Spiders & Bots

As the list of spiders that hit my site grows, I am trying to keep session management under control. I don't want to offer spiders sessions, since I don't want unused session variables taking up RAM on the server. But at the same time, I don't want to add a lot of processing overhead to each page just to check whether a user is really a user or really a spider/bot. After all, the server does have 4 gigs of RAM. Processing speed and the user experience are more important to me than RAM usage.

But, in the interest of optimization, I have combined several of my user agent checks into one Regular Expression (RegEx) search for the string "bot" on a word boundary: "bot\b". As you can see below, this takes care of 18 user agent types. I doubt that this will give me any false positives on standard browsers, but if it does, the only difference is that those visitors will not get sessions.
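To make the word-boundary behavior concrete, here is a minimal sketch that runs the same "bot\b" pattern against a few lowercased user agent strings. The sample strings and the variable names (arrAgents, blnIsBot) are made up for illustration and are not taken from my logs. The first two agents should match because "bot" is followed by a non-word character; the last two should not, since the Firefox string has no "bot" at all and "robotics" continues with a letter.

```cfscript
<cfscript>
	// Hypothetical sample user agents, already lowercased like
	// strTempUserAgent in the main listing.
	arrAgents = ArrayNew( 1 );
	ArrayAppend( arrAgents, "mozilla/5.0 (compatible; googlebot/2.1; +http://www.google.com/bot.html)" );
	ArrayAppend( arrAgents, "msnbot/1.0 (+http://search.msn.com/msnbot.htm)" );
	ArrayAppend( arrAgents, "mozilla/5.0 (windows; u; windows nt 5.1; en-us) gecko/20060728 firefox/1.5.0.6" );
	ArrayAppend( arrAgents, "robotics-browser/1.0" );

	for (i = 1 ; i LTE ArrayLen( arrAgents ) ; i = i + 1){
		// REFind() returns the position of the first match, or zero
		// when the pattern is not found.
		blnIsBot = (REFind( "bot\b", arrAgents[ i ] ) GT 0);
		WriteOutput( blnIsBot & ": " & arrAgents[ i ] & "<br />" );
	}
</cfscript>
```

Note that the pattern is not anchored to the agent name itself: the "bot.html" in the first string's URL would be enough to trigger a match on its own.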

In previous posts I talked about how short-circuit evaluation was faster than large regular expressions. I have not gone back on this. While this is a regular expression, it is not a variable-length regular expression. It is merely a qualified standard string search (qualified by the word boundary) and is therefore very fast.
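If you want to check the speed claim on your own server rather than take my word for it, a rough timing harness along these lines could help. This is only a sketch: the iteration count, the sample agent, and the three Find() calls standing in for the full set of 18 are all assumptions, and the numbers will vary by machine and ColdFusion version. The sample agent is a regular browser, so every check runs to completion without short-circuiting.

```cfscript
<cfscript>
	// Hypothetical browser user agent that matches none of the checks.
	strAgent = "mozilla/5.0 (windows; u; windows nt 5.1; en-us) gecko/20060728 firefox/1.5.0.6";
	intIterations = 100000;

	// Time the single word-boundary regular expression.
	intStart = GetTickCount();
	for (i = 1 ; i LTE intIterations ; i = i + 1){
		blnMatch = (REFind( "bot\b", strAgent ) GT 0);
	}
	intRegExTime = (GetTickCount() - intStart);

	// Time a few of the individual Find() calls that it replaces.
	intStart = GetTickCount();
	for (i = 1 ; i LTE intIterations ; i = i + 1){
		blnMatch = (
			Find( "googlebot", strAgent ) OR
			Find( "msnbot", strAgent ) OR
			Find( "gigabot", strAgent )
			);
	}
	intFindTime = (GetTickCount() - intStart);

	WriteOutput( "REFind: " & intRegExTime & " ms<br />" );
	WriteOutput( "Find x3: " & intFindTime & " ms<br />" );
</cfscript>
```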

// Create a lowercase version of the user agent so we can run without
// NoCase checks.
strTempUserAgent = LCase( CGI.http_user_agent );

// Check user agent.
if (
	(NOT Len(strTempUserAgent)) OR

	// We are gonna try to optimize even a little bit more. A good number
	// of the spider names end in "bot". If we check for names that have
	// "bot" ending on a word boundary, we can eliminate several of the other
	// spider checks. The bot\b search here takes care of the spiders
	// that are now commented out below. As you can see, it takes the
	// place of 18 different spider Find() calls.
	REFind( "bot\b", strTempUserAgent ) OR

	Find( "slurp", strTempUserAgent ) OR
	// Find( "googlebot", strTempUserAgent ) OR
	// Find( "becomebot", strTempUserAgent ) OR
	// Find( "msnbot", strTempUserAgent ) OR
	Find( "mediapartners-google", strTempUserAgent ) OR
	Find( "zyborg", strTempUserAgent ) OR
	// Find( "rufusbot", strTempUserAgent ) OR
	Find( "emonitor", strTempUserAgent ) OR
	// Find( "researchbot", strTempUserAgent ) OR
	// Find( "ip2mapbot", strTempUserAgent ) OR
	// Find( "gigabot", strTempUserAgent ) OR
	Find( "jeeves", strTempUserAgent ) OR
	// Find( "exabot", strTempUserAgent ) OR
	Find( "sbider", strTempUserAgent ) OR
	Find( "findlinks", strTempUserAgent ) OR
	Find( "yahooseeker", strTempUserAgent ) OR
	Find( "mmcrawler", strTempUserAgent ) OR
	// Find( "mj12bot", strTempUserAgent ) OR
	// Find( "outfoxbot", strTempUserAgent ) OR
	Find( "jbrowser", strTempUserAgent ) OR
	// Find( "ziggsbot", strTempUserAgent ) OR
	Find( "java", strTempUserAgent ) OR
	Find( "pmafind", strTempUserAgent ) OR
	Find( "blogbeat", strTempUserAgent ) OR
	// Find( "turnitinbot", strTempUserAgent ) OR
	Find( "converacrawler", strTempUserAgent ) OR
	Find( "ocelli", strTempUserAgent ) OR
	Find( "labhoo", strTempUserAgent ) OR
	Find( "validator", strTempUserAgent ) OR
	Find( "sproose", strTempUserAgent ) OR
	// Find( "obot", strTempUserAgent ) OR
	// Find( "myfamilybot", strTempUserAgent ) OR
	// Find( "girafabot", strTempUserAgent ) OR
	// Find( "aipbot", strTempUserAgent ) OR
	Find( "ia_archiver", strTempUserAgent ) OR
	// Find( "snapbot", strTempUserAgent ) OR
	Find( "larbin", strTempUserAgent ) OR
	Find( "psycheclone", strTempUserAgent )
	// Find( "IRLbot", strTempUserAgent )
	){

	// This application definition is for robots that do NOT need sessions.
	THIS.Name = "KinkySolutions v.1 {dev}";
	THIS.SessionManagement = false;
	THIS.SetClientCookies = false;
	THIS.ClientManagement = false;
	THIS.SetDomainCookies = false;

	// Set the flag for session use.
	REQUEST.HasSessionScope = false;

} else {

	// This application is for the standard user.
	THIS.Name = "KinkySolutions v.1 {dev}";
	THIS.SessionManagement = true;
	THIS.SetClientCookies = true;
	THIS.SessionTimeout = CreateTimeSpan(0, 0, 20, 0);
	THIS.LoginStorage = "SESSION";

	// Set the flag for session use.
	REQUEST.HasSessionScope = true;

}
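
The REQUEST.HasSessionScope flag set above is there so that the rest of the request can tell whether the SESSION scope exists before touching it. A minimal sketch of how downstream code might use it, assuming a made-up SESSION.HitCount counter that is not part of my actual application:

```cfscript
<cfscript>
	// Only touch the SESSION scope when sessions were enabled for this
	// request. Spiders and bots skip this block entirely.
	if (REQUEST.HasSessionScope){

		// HitCount is a hypothetical variable used for illustration.
		if (NOT StructKeyExists( SESSION, "HitCount" )){
			SESSION.HitCount = 0;
		}

		SESSION.HitCount = (SESSION.HitCount + 1);

	}
</cfscript>
```

Without a guard like this, any page that references SESSION would throw an error for requests that came in under the session-less application definition.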

Want to use code from this post? Check out the license.

I believe in love. I believe in compassion. I believe in human rights. I believe that we can afford to give more of these gifts to the world around us because it costs us nothing to be decent and kind and understanding. And, I want you to know that when you land on this site, you are accepted for who you are, no matter how you identify, what truths you live, or whatever kind of goofy shit makes you feel alive! Rock on with your bad self!
Ben Nadel