
Understanding The IIS Mod-Rewrite Server Variables

By Ben Nadel

When writing RewriteCond (rewrite condition) and RewriteRule (rewrite rule) directives in IIS Mod-Rewrite's URL rewriting configuration files, we have access to several server variables. We can get the value of these server variables using the following syntax:

%{VARIABLE_NAME}

So, for example, to get the requested file name, we would use:

%{REQUEST_FILENAME}
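
As a minimal sketch of where this syntax shows up, assuming IIS Mod-Rewrite follows the Apache mod_rewrite directive format, a server variable is usually the first argument (the test string) of a RewriteCond. The pattern and the target file index.cfm below are hypothetical and chosen only for illustration:

```apache
# Hypothetical example; assumes Apache-compatible directive syntax.
RewriteEngine On

# %{REQUEST_METHOD} expands to the HTTP method of the current request.
# The condition passes only when that value matches the pattern ^GET$.
RewriteCond %{REQUEST_METHOD} ^GET$

# Runs only if the condition above passed; index.cfm is a placeholder target.
RewriteRule ^foo/$ index.cfm [L]
```

The variable reference goes in the test-string position and the regular expression follows it; the RewriteRule on the next line is applied only when the preceding condition matches.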

Since none of the rewrite documentation (IIS Mod-Rewrite's or otherwise) has shed too much light on what kind of values these variables hold, I figured I would run some tests to clear up my own personal confusion. I created a RewriteCond (rewrite condition) that would run against a server variable. Then, I enabled IIS Mod-Rewrite logging and ran the RewriteCond against each of the server variables that looked like it might hold useful information.

For each of the following tests, I requested this url:

/iis_mod_rewrite2/foo/

Without bothering to show you the RewriteCond (rewrite condition), here is what I found:

REQUEST_METHOD

GET

SCRIPT_FILENAME

c:/inetpub/wwwroot/iis_mod_rewrite2/foo/

PATH_INFO

[empty string]

THE_REQUEST

GET /iis_mod_rewrite2/foo/ HTTP/1.1

REQUEST_URI

/iis_mod_rewrite2/foo/

REQUEST_FILENAME

c:/inetpub/wwwroot/iis_mod_rewrite2/foo/

DOCUMENT_ROOT

[empty string]

The above were all tested with RewriteCond directives. This next value is not a "server variable", but I wanted to show that when you use a RewriteRule (rewrite rule) rather than a RewriteCond (rewrite condition), the implicit request value that your regular expression pattern is tested against is:

foo/
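
To make the difference concrete, here is a small sketch. It assumes the configuration file lives in the /iis_mod_rewrite2/ directory and that the directives behave like Apache's per-directory mod_rewrite rules; the index.cfm target and its query string are hypothetical:

```apache
# Request: /iis_mod_rewrite2/foo/
# Value tested by the RewriteRule pattern: foo/

# Matches: the pattern is written against the relative value "foo/".
RewriteRule ^foo/$ index.cfm?section=foo [L]

# Would NOT match in this context, because the directory prefix and the
# leading slash are not part of the value being tested:
# RewriteRule ^/iis_mod_rewrite2/foo/$ index.cfm?section=foo [L]
```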

This is all very interesting stuff. A few of these variables look like they could be used for regular expression pattern matching (the basis of almost all URL rewriting); but, when running pattern matching, we should think carefully about which one we choose. Obviously, when writing RewriteRule directives, we'll just use the implicit request value (foo/); but, when we write pattern comparisons in RewriteCond directives, we'll want to select the smallest possible string to test.

Regular expressions, while extremely sexy, are also costly operations. To cut down on as much of that cost as possible, the target string should be of minimal length. Looking at the server variable values above, it makes sense to always use REQUEST_URI when performing URL-based pattern tests. This value is about half the length of the file-path values (SCRIPT_FILENAME and REQUEST_FILENAME), which means that our regular expressions should run twice as fast.
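
A rough sketch of that choice, again assuming Apache-compatible syntax and using the values recorded in the tests above (the index.cfm target is hypothetical):

```apache
# URL-based test against the shorter value.
# For the test request, %{REQUEST_URI} is "/iis_mod_rewrite2/foo/".
RewriteCond %{REQUEST_URI} ^/iis_mod_rewrite2/foo/
RewriteRule ^foo/$ index.cfm [L]

# The same intent against the file-system value
# ("c:/inetpub/wwwroot/iis_mod_rewrite2/foo/") requires a pattern that
# knows the physical path, which is longer and tied to one server layout:
# RewriteCond %{REQUEST_FILENAME} ^c:/inetpub/wwwroot/iis_mod_rewrite2/foo/
```

Beyond string length, the REQUEST_URI version keeps working if the site is moved to a different physical directory, since the pattern depends only on the URL.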

While this might seem like premature optimization, remember that the URL rewrite engine is executed for every single page request made to the server (under a given configuration file). That's a lot of processing! As such, we're going to want to add optimizations wherever possible.




Reader Comments

@Ben,

It seems that you are advising premature optimization. A couple quick regexes per page request is extremely fast - thousands or hundreds of thousands of times faster than waiting on filesystem access or database access.

A regular expression will *not* run in half the time simply because the input string is half as long. Moreover, even if it did, what difference does an extra microsecond make when each database query takes milliseconds?

Of course, there are indeed slow regexes. But that is an algorithmic flaw, and using the correct algorithm instead of a flawed algorithm (a bad regex) is always a good idea. The answer is to use the correct regex for the situation on the correct input text for the situation, not to try to squeeze a few more microseconds of performance before you have any indication that that is where your performance problems actually lie.

Nevertheless, you should almost always be matching on the protocol, the domain, the port, and the path component of the URL as expressed in REQUEST_URI (whether taken whole or split into virtual directory path and path within virtual directory). You should not be matching on the filesystem filenames or the request line as a whole.

Justice


@Justice,

I think it is really only premature optimization IF there is additional overhead involved; meaning, it's premature if you would put in a lot of extra effort just to get a bit more performance. What I'm talking about here is not putting in more effort - what I'm advocating here is simply choosing the most appropriate server variable if you need to run regex patterns against the URL.

Your regular expression could stay exactly the same and simply switching the variable you are testing could have a performance impact. So, it's less about premature optimization and more just about making appropriate choices.

As far as the half-length = twice as fast, ok, so maybe that's not entirely accurate :) But, it is sometimes, depending on the type of pattern you are running.

Ultimately, I think we are saying the same thing though, right - that one should use the REQUEST_URI and not file paths for pattern matching?


@Ben,

Yep, REQUEST_URI contains the right information for the job.

We both advocate, in general, choosing the most appropriate things - including server variables if you need to run regex patterns against the URL.

Justice


I designed a web shopping cart website. I need to implement URL rewriting, but I do not have access to IIS. How can I implement it without IIS? Any ideas would be appreciated.


@bharath,

If you don't have access to IIS, you need to use the PATH_INFO approach where the SES URLs come after the front controller:

index.cfm/products/something/something/

That's the only way that ColdFusion will have access to it without updating the server at a higher level.


This is a good step-by-step explanation. It is difficult to find anything related to Windows hosting; almost everything is about Linux hosting.

