Ben Nadel
On User Experience (UX) Design, JavaScript, ColdFusion, Node.js, Life, and Love.
I am the chief technical officer at InVision App, Inc - a prototyping and collaboration platform for designers, built by designers. I also rock out in JavaScript and ColdFusion 24x7.

Understanding The IIS Mod-Rewrite Server Variables

By Ben Nadel on

When writing RewriteCond (rewrite condition) and RewriteRule (rewrite rule) directives in IIS Mod-Rewrite's URL rewriting configuration files, we have access to several server variables. We can get the value of these server variables using the following syntax:

%{VARIABLE_NAME}

So, for example, to get the requested file name, we would use:

%{REQUEST_FILENAME}

Since none of the rewrite documentation (IIS Mod-Rewrite's or otherwise) has shed too much light on what kind of values these variables hold, I figured I would run some tests to clear up my own personal confusion. I created a RewriteCond (rewrite condition) that would run against a server variable. Then, I enabled IIS Mod-Rewrite logging and ran the RewriteCond against each of the server variables that looked like it might hold useful information.
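The original test directives are not shown in this post, so the following is only a sketch of what such a test might look like. It assumes that IIS Mod-Rewrite accepts the Apache 2.2-style RewriteLog and RewriteLogLevel directives; the log path is a placeholder, and REQUEST_URI stands in for whichever variable is being inspected.

```apache
# Hypothetical test setup; not the directives used in the original tests.
RewriteEngine On

# Logging directives as in Apache mod_rewrite 2.2 (assumed to be supported
# by IIS Mod-Rewrite). The log file path is a placeholder.
RewriteLog "c:/temp/rewrite.log"
RewriteLogLevel 9

# Match any value so that the log records what the variable expands to.
RewriteCond %{REQUEST_URI} ^.*$

# "-" means no substitution; this rule exists only so the condition runs.
RewriteRule ^ - [L]
```

Swapping a different variable name into the RewriteCond and requesting the same URL again should then show that variable's value in the log.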

For each of the following tests, I requested this url:

/iis_mod_rewrite2/foo/

Without bothering to show you the RewriteCond (rewrite condition), here is what I found:

REQUEST_METHOD

GET

SCRIPT_FILENAME

c:/inetpub/wwwroot/iis_mod_rewrite2/foo/

PATH_INFO

[empty string]

THE_REQUEST

GET /iis_mod_rewrite2/foo/ HTTP/1.1

REQUEST_URI

/iis_mod_rewrite2/foo/

REQUEST_FILENAME

c:/inetpub/wwwroot/iis_mod_rewrite2/foo/

DOCUMENT_ROOT

[empty string]

The above were all tested with RewriteCond directives; and, while this next one is not a "server variable", I wanted to demonstrate the point that when you use a RewriteRule (rewrite rule) rather than a RewriteCond (rewrite condition), the implicit request value that you'll be testing regular expression patterns against would be:

foo/

This is all very interesting stuff. A few of the variables look like they could be used for some good regular expression pattern matching (the basis of almost all URL rewriting); but, when running pattern matching, we should think carefully about which one we choose. Obviously, when writing RewriteRule directives, we'll just use the implicit request value (foo/); but, when we write pattern comparisons in RewriteCond directives, we'll want to select the smallest possible string to test.

Regular expressions, while extremely sexy, are also costly operations. To cut down on as much of that cost as possible, the target string should be of minimal length. Looking at the server variable values above, it makes sense then to always use the REQUEST_URI when performing URL-based pattern tests. This value is about half the length of any other like-value, which means that our regular expressions should run twice as fast.

While this might seem like premature optimization, remember that the URL rewrite engine is being executed for every single page request made to the server (under a given configuration file). That's a lot of processing! As such, we're going to want to add optimizations wherever possible.
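To make the choice concrete, here is a minimal sketch of a condition that tests REQUEST_URI rather than a file system path, using the values observed for the test request. The index.cfm target and its path parameter are hypothetical names used only for illustration.

```apache
RewriteEngine On

# Test the short URL path (/iis_mod_rewrite2/foo/ for the test request).
RewriteCond %{REQUEST_URI} ^/iis_mod_rewrite2/foo/

# The rule pattern itself runs against the implicit request value (foo/).
# index.cfm and the "path" parameter are hypothetical.
RewriteRule ^foo/(.*)$ index.cfm?path=$1 [L,QSA]

# The alternative this post argues against: the same check against the
# longer file system path.
# RewriteCond %{REQUEST_FILENAME} ^c:/inetpub/wwwroot/iis_mod_rewrite2/foo/
```

The pattern is essentially the same in both conditions; only the variable being tested changes.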




Reader Comments

@Ben,

It seems that you are advising premature optimization. A couple quick regexes per page request is extremely fast - thousands or hundreds of thousands of times faster than waiting on filesystem access or database access.

A regular expression will *not* run in half the time simply because the input string is half as long. Moreover, even if it did, what difference does an extra microsecond make when each database query takes milliseconds?

Of course, there are indeed slow regexes. But that is an algorithmic flaw, and using the correct algorithm instead of a flawed algorithm (a bad regex) is always a good idea. The answer is to use the correct regex for the situation on the correct input text for the situation, not to try to squeeze a few more microseconds of performance before you have any indication that that is where your performance problems actually lie.

Nevertheless, you should almost always be matching on the protocol, the domain, the port, and the path component of the URL as expressed in REQUEST_URI (whether taken whole or split into virtual directory path and path within virtual directory). You should not be matching on the filesystem filenames or the request line as a whole.

Justice

@Justice,

I think it is really only premature optimization IF there is additional overhead involved; meaning, it's premature if you would put in a lot of extra effort just to get a bit more performance. What I'm talking about here is not putting in more effort - what I'm advocating here is simply choosing the most appropriate server variable if you need to run regex patterns against the URL.

Your regular expression could stay exactly the same and simply switching the variable you are testing could have a performance impact. So, it's less about premature optimization and more just about making appropriate choices.

As far as the half-length = twice as fast, ok, so maybe that's not entirely accurate :) But, it is sometimes, depending on the type of pattern you are running.

Ultimately, I think we are saying the same thing though, right - that one should use the REQUEST_URI and not file paths for pattern matching?

@Ben,

Yep, REQUEST_URI contains the right information for the job.

We both advocate, in general, choosing the most appropriate things - including server variables if you need to run regex patterns against the URL.

Justice

I designed a web shopping cart website. I need to implement URL rewriting, but I do not have access to IIS. How can I implement it without IIS? Give me any idea.

@bharath,

If you don't have access to IIS, you need to use the PATH_INFO approach where the SES URLs come after the front controller:

index.cfm/products/something/something/

That's the only way that ColdFusion will have access to it without updating the server at a higher level.