Understanding The IIS Mod-Rewrite Server Variables

Posted September 3, 2009 at 8:26 PM

Tags: ColdFusion, Search Engine Optimization

When writing RewriteCond (rewrite condition) and RewriteRule (rewrite rule) directives in IIS Mod-Rewrite's URL rewriting configuration files, we have access to several server variables. We can get the value of these server variables using the following syntax:

%{VARIABLE_NAME}

So, for example, to get the name of the requested file name, we would use:

%{REQUEST_FILENAME}

Since none of the rewrite documentation (IIS Mod-Rewrite's or otherwise) has shed too much light on what kind of values these variables hold, I figured I would run some tests to clear up my own personal confusion. I created a RewriteCond (rewrite condition) that would run against a server variable. Then, I enabled IIS Mod-Rewrite logging and ran the RewriteCond against each of the server variables that looked like it might hold useful information.

For each of the following tests, I requested this url:

/iis_mod_rewrite2/foo/

Without bothering to show you the RewriteCond (rewrite condition), here is what I found:

REQUEST_METHOD

GET

SCRIPT_FILENAME

c:/inetpub/wwwroot/iis_mod_rewrite2/foo/

PATH_INFO

[empty string]

THE_REQUEST

GET /iis_mod_rewrite2/foo/ HTTP/1.1

REQUEST_URI

/iis_mod_rewrite2/foo/

REQUEST_FILENAME

c:/inetpub/wwwroot/iis_mod_rewrite2/foo/

DOCUMENT_ROOT

[empty string]

The above were all tested with RewriteCond directives; and, while this next one is not a "server variable", I wanted to demonstrate the point that when you use a RewriteRule (rewrite rule) rather than a RewriteCond (rewrite condition), the implicit request value that you'll be testing regular expression patterns against would be:

foo/

This is all very interesting stuff. A few of the variables look like they could all be used for some good regular expression pattern matching (the basis of most all URL rewriting); but, when running pattern matching, we should think deeply about which one we choose. Obviously, when writing RewriteRule directives, we'll just use the implicit request value (foo/); but, when we write pattern comparisons in RewriteCond directives, we'll want to select the smallest possible string to test.

Regular expressions, while extremely sexy, are also costly operations. To cut down on as much of that cost possible, the target string should be of minimal length. Looking at the server variable values above, it makes sense then to always use the REQUEST_URI when performing URL-based pattern tests. This value is about half the length of any other like-value which means that our regular expressions should run twice as fast.

While this might seem like premature optimization, remember that the URL rewrite engine is being executed for every single page request made to the server (under a given configuration file). That's a lot of processing! As such, we're going to want to add optimizations where ever possible.

Post Comment  |  Ask Ben  |  Permalink  |  Other Searches  |  Print Page




Learning ColdFusion 9 - ColdFusion 9 tutorials, samples, examples, demos

Reader Comments

Sep 4, 2009 at 8:17 AM // reply »
79 Comments

@Ben,

It seems that you are advising premature optimization. A couple quick regexes per page request is extremely fast - thousands or hundreds of thousands of times faster than waiting on filesystem access or database access.

A regular expression will *not* run in half the time simply because the input string is half as long. Moreover, even if it did, what difference does an extra microsecond make when each database query takes milliseconds?

Of course, there are indeed slow regexes. But that is an algorithmic flaw, and using the correct algorithm instead of a flawed algorithm (a bad regex) is always a good idea. The answer is to use the correct regex for the situation on the correct input text for the situation, not to try to squeeze a few more microseconds of performance before you have any indication that that is where your performance problems actually lie.

Nevertheless, you should almost always be matching on the protocol, the domain, the port, and the path component of the URL as expressed in REQUEST_URI (whether taken whole or split into virtual directory path and path within virtual directory). You should not be matching on the filesystem filenames or the request line as a whole.

Justice


Sep 4, 2009 at 8:25 AM // reply »
6,516 Comments

@Justice,

I think it is really only premature optimization IF there is additional overhead involved; meaning, it's premature if you would in a lot of extra effort just to get a bit more performance. What I'm talking about here is not putting in more effort - what I'm advocating here is simply choosing the most appropriate server variable if you need to runs regex patterns against the URL.

Your regular expression could stay exactly the same and simply switching the variable you are testing could have a performance impact. So, it's less about premature optimization and more just about making appropriate choices.

As far as the half-length = twice as fast, ok, so maybe that's not entirely accurate :) But, it is sometimes, depending on the type of pattern you are running.

Ultimately, I think we are saying the same thing though, right - that one should use the REQUEST_URI and not file paths for pattern matching?


Sep 4, 2009 at 11:23 AM // reply »
79 Comments

@Ben,

Yep, REQUEST_URI contains the right information for the job.

We both advocate, in general, choosing the most appropriate things - including server variables if you need to run regex patterns against the URL.

Justice


Sep 24, 2009 at 5:57 AM // reply »
1 Comments

I desined one web Shoping cart website.I need to implement URL rewrite methods.but i did not have access to IIS.How can i implement without IIS.Give me any IDea.


Sep 24, 2009 at 9:10 AM // reply »
6,516 Comments

@bharath,

If you don't have access to IIS, you need to use the PATH_INFO approach where the SES URLs come after the front controller:

index.cfm/products/something/something/

That's the only way that ColdFusion will have access to it without updating the server at a higher level.


Nov 11, 2009 at 12:13 AM // reply »
1 Comments

this is a good step by step explanation. It is difficult to find anything related to windows hosting. All are behind linux hosting


Post Comment  |  Ask Ben

Recent Blog Comments
Nov 21, 2009 at 5:15 PM
Using ColdFusion Structures To Remove Duplicate List Values
@Jose Galdamez, Oh heh yeah I didn't paste the whole code. I should have defined the vars -- my bad. It's fixed thou. Thanks. ... read »
Nov 21, 2009 at 4:49 PM
Styling The ColdFusion 8 WriteToBrowser CFImage Output
Great work yet again Ben! Whilst I didn't use this whole code, I copied some of your regex code for a similar problem with the lack of an alt attribute and unescaped ampersands in CFIMAGE for Railo 3 ... read »
Nov 21, 2009 at 1:13 PM
My First ColdFusion Builder Extension - Encrypting And Decrypting CFM / CFC Files
@Ben, Because I am pedantic, I just want to make sure that everyone knows there is absolutely no encryption going on. There is only encoding and obfuscation. The cfencode tool only obfuscates your C ... read »
Nov 21, 2009 at 12:28 PM
Using ColdFusion Structures To Remove Duplicate List Values
@Jody I can't seem to get your code sample to work. If you are still having problems, try this code out and see if it gets you what you wanted. <!--- Comma delimited list with various duplicates ... read »
Nov 21, 2009 at 11:03 AM
Groovy Operator Overloading Does Not Work In The ColdFusion Context
Hi Ben, Thanks for this informative post. Now I am reading ur old posts too ... read »
Nov 21, 2009 at 10:56 AM
HostMySite.com Has The Best ColdFusion Hosting
@Mehul, Yes very nice people, however several downtimes per day which was not acceptable. Hence we had to move out. I am glad you are having good luck with them so far. ... read »
Nov 20, 2009 at 11:32 PM
Five Months Without Hungarian Notation And I'm Loving It
I've used headless camel case for years for not only ColdFusion variables, but also SQL tables and fields... pretty much everything involving code. I also subscribe to the "don't abbreviate and clea ... read »