Understanding The IIS Mod-Rewrite Log File And Work Flow

Posted September 3, 2009 at 2:33 PM by Ben Nadel

Tags: ColdFusion, Search Engine Optimization

As I have just started experimenting with URL rewriting in ColdFusion using IIS Mod-Rewrite, I have a lot of learning to do. While some of my experiments are working, I am getting some very unexpected (and frustrating) results which clearly indicate that I need to get a stronger hold on the work flow executing behind each URL rewrite. To gather some insight, I turned on logging for the rewrite process. This has to be done in the main (server-wide) configuration file, not in the override configuration files:

  • # Define the location of the log file.
  • # NOTE: This location has to be one that the windows IIS user
  • # has write access to.
  •  
  • RewriteLog "C:\Inetpub\wwwroot\...\iis_mod_rewrite2\rewrite.txt"
  •  
  • # Define the level of detail that should be made available
  • # in the log file.
  • # Most detail: 9
  • # Least detail: 0 (no logging)
  •  
  • RewriteLogLevel 9

In order for IIS to be able to write to this log file, I needed to grant the IIS Windows user "write permissions" on the directory in which the log was be stored. Because I am running IIS6, this user showed up as IIS_WPG:


 
 
 

 
IIS Mod-Rewrite IIS User Permissions For Loggin Data.  
 
 
 

Once I had IIS Mod-Rewrite successfully logging web requests (which is, ALL requests on the server, by the way), I started to try different RewriteRule's to see how they were operating under the hood. The first one I set up was a simple rule that rewrites all incoming request to point to index.cfm:

  • # IIS Mod-Rewrite configuration file
  • # Turn on the rewrite rules for this access file.
  • # This will will handle all requests based off of
  • # this directory.
  •  
  • RewriteEngine On
  •  
  • # Rewrite all requests to the index.cfm file and append
  • # original request as a query string parameters: original.
  •  
  • RewriteRule ^(.+)$ index.cfm?original=$1 [L,QSA]

As you can see, no matter what request comes from the outside, all traffic will be directed to index.cfm. Furthermore, the requested sub-file path will be appended to the request as the query string parameter, "original." I have also included the QSA (Query String Append) flag such that any query string values made in the original request will also be included in the rewritten URL.

With this rule in place, I called the following URL:

./foo/bar.cfm

Now, before we go into what happens, let me tell you what I would expect to happen as someone new to URL rewriting. I would expect that I'd execute the index.cfm page and my URL scope would contain the following name-value pair:

original: foo/bar.cfm

Now, here's what really happens: Yes, the index.cfm file is executed, but my URL scope has the following unexpected name-value pair:

original: index.cfm,foo/bar.cfm

Funky! Not only am I getting the original URL in the query string, I'm also getting the rewrite URL as well (or so I think).

So what happened? If you look at my regular expression and my rewrite, it would make you think that the entire string "index.cfm,foo/bar.cfm" is what is being captured; but that can't be as index.cfm was not part of the original request! To find out more, I took a look at the log file. Here is the relevant excerpt from it (with unnecessary data removed):

init rewrite engine
with requested uri

applying pattern
'/\.htaccess$' to uri '/iis_mod_rewrite2/foo/bar.cfm'

pass through
/iis_mod_rewrite2/foo/bar.cfm

init rewrite engine
with requested uri

applying pattern
'^(.+)$' to uri 'foo/bar.cfm'

rewrite
'foo/bar.cfm' -> 'index.cfm?original=foo/bar.cfm'

split
uri=index.cfm?original=foo/bar.cfm -> uri=index.cfm, args=original=foo/bar.cfm

internal redirect
with /iis_mod_rewrite2/index.cfm [INTERNAL REDIRECT]

init rewrite engine
with requested uri

applying pattern
'^(.+)$' to uri 'index.cfm'

rewrite
'index.cfm' -> 'index.cfm?original=index.cfm'

split
uri=index.cfm?original=index.cfm -> uri=index.cfm, args=original=index.cfm

internal redirect
with /iis_mod_rewrite2/index.cfm [INTERNAL REDIRECT]

Having never seen a log file like this before, it took me a good hour (and plenty of Twitter help from Emmet McGovern) to really wrap my head around what was going on. Immediately, however, I noticed that there were two calls to something marked [INTERNAL REDIRECT]. That probably had something to do with my funky resultant query string, but how? After all, I only had one rewrite rule - how could two redirects be executing? I was especially perplexed considering that I was using the "L" (Last Rule) flag on my rewrite condition.

What I came to realize was that each rewrite performed by IIS Mod-Rewrite gets reprocessed using the rewritten URL. As such, the first rewrite rule was executed on the original URL:

foo/bar.cfm

At the end of the first pass, it was rewritten to the URL:

index.cfm?original=foo/bar.cfm

At this point, the newly rewritten URL was reprocessed (sub-requested??) by the rules engine and ends up creating this new URL:

index.cfm?original=index.cfm

While this URL only has the one, "original," query string parameter, because I am using the QSA (Query String Append) flag, the query string created in the first URL rewrite (original=foo/bar.cfm) is now appended to the query string created by the second rewrite rule such that I end up with this URL:

index.cfm?original=index.cfm&original=foo/bar.cfm

Now, because I have a query parameter named, "original," showing up twice in the resultant query string, the two query parameter values get collapsed into a single, comma-delimited list:

original: index.cfm,foo/bar.cfm

As a note, if I removed the QSA (Query String Append) flag, I ended up with the name-value pair:

original: index.cfm

This happens because only the final query string created by the second rewrite rule is kept.

Ok, so tracing the logged rewrites helps us understand where the unexpected query string value comes from; but, it doesn't really help us understand why there are specifically two redirects being executed. After all, if there are no RewriteCond (rewrite conditions) on the RewriteRule, why execute 2 times, but not 3 times - or 4 times - or N times?

After some more reading and some back and forth on Twitter, I came up with a theory as to what was going on. Normally, with a URL rewriting system, something like this would create an endless loop of rewriting. To stop this from happening, you might set a limit to the number of rewrites that can take place using the MaxRewrites condition. From what I can gather, however, IIS Mod-Rewrite does not support this rewrite option; as such, I have to assume that it has built its rewrite engine with enough intelligence to see that it was about to enter an endless loop (rewrite result matches incoming URL) and halts the process. That would explain the two rewrite executions, but not a third.

In my experimentation, I tried using the NS (Do Not allow internal Subrequests) flag to prevent the first rewritten rule from being re-evaluated; but, from the IIS Mod-Rewrite documentation, it would appear that ISAPI does not support custom control flow at this level. If that interpretation is correct, then it makes even more sense as to why the rewrite engine was built to One) Always re-evaluate and, Two) Halt immediately at the beginning of an endless loop.

I am simultaneously new to all of the URL rewriting concepts as well as the IIS Mod-Rewrite product, so forgive me if there are any blatant misunderstandings in this explanation of the rewrite log file. It took me a number of hours and a lot of help to even get this far, so any constructive criticism would be greatly appreciated. From what I have concluded thus far, however, I have been able to more accurately predict other rewrite outcomes, which I'll address in later blog posts.

NOTE: Always remember to turn Off logging when you are not using it. Since every single web request (including IMG, CSS, and SCRIPT files) gets logged to this text file, it gets very big, very fast.




Reader Comments

Sep 3, 2009 at 2:50 PM // reply »
5 Comments

You look at coldcourse at all, Ben?


Sep 3, 2009 at 2:57 PM // reply »
10,640 Comments

@SIGEPJEDI,

What is ColdCourse?


Sep 3, 2009 at 3:10 PM // reply »
113 Comments

@Ben

ColdCourse is an initial stab at a url-routing system for ColdFusion. It is the basis of various url-routing systems in various ColdFusion frameworks.

Unfortunately, it is not up to par with Django's or Rails' url-routing systems. It is also rather unRESTful, preferring the convenient familiarity of query-string parameters re-encoded as extra path info (the part of the path after /index.cfm) a la "/index.cfm?var1=val1&var2=val2" -> "/index.cfm/var1/val1/var2/val2".

To achieve a complex, resource-oriented website, you will probably need to roll out your own url-routing system, or help build one modelling it after the one in Rails or Django. However, this may not apply to most simple web apps.

Justice


Sep 3, 2009 at 3:14 PM // reply »
10,640 Comments

@Justice,

Ah ok. Yeah, I'm gonna shy away from using path_info to get at the rewrite stuff. That's just a personal opinion. Something about it just never sat right with me.


Sep 3, 2009 at 3:24 PM // reply »
113 Comments

@Ben,

Can you put a finger on it?

Justice


Sep 3, 2009 at 3:27 PM // reply »
10,640 Comments

@Justice,

Ha ha, no, can't put a finger on it :)


Sep 3, 2009 at 9:02 PM // reply »
53 Comments

@Ben,

Just a stab in the dark...

Can't you get the RegEx to exclude the index.cfm match? There is probably a more elegant solution, however if it works it could be used until such solution presents itself.


Sep 4, 2009 at 8:04 AM // reply »
10,640 Comments

@Andrew,

Yeah, I think are correct. I wasn't worrying about the rule, per-say, I just wanted to see what was going on the hood. I'll write some more on this topic with more rule-based thoughts.


Sep 4, 2009 at 11:39 AM // reply »
9 Comments

Small typo in post title: Understating -> should be Understanding? Or perhaps I don't understand :)


Sep 4, 2009 at 11:41 AM // reply »
10,640 Comments

@Ciaran,

Ooops, thanks for catching that! Definitely do not want to understate :)


Post A Comment

Comment Etiquette: Please do not post spam. Please keep the comments on-topic. Please do not post unrelated questions or large chunks of code. And, above all, please be nice to each other - we're trying to have a good conversation here.

Please review the following issues:

Author Name:


Author Email:

Author Website:

Comment:

Supported HTML tags for formatting: <strong>bold</strong>   <em>italic</em>   <code>code</code>







  • Help Wanted - Find Your Next ColdFusion Job
InVision App - Prototyping Made Beautiful With Prototyping Tools Ben Nadel's Company - Epicenter Consulting Recent Blog Comments
Feb 12, 2012 at 3:37 AM
Learning ColdFusion 8: CFImage Part III - Watermarks And Transparency
Hi Ben, Just to ask currently it is placed bottom right corner, if i need to replace the same rendered image on the bottom left side or in the bottom center, how that can be calculated. bottom ce ... read »
Feb 11, 2012 at 9:29 PM
Use jQuery's SlideDown() With Fixed-Width Elements To Prevent Jumping
I can't say how glad I am that I found your post. Thank you very much. ... read »
Feb 10, 2012 at 7:21 PM
jQuery AJAX Strips Script Tags And Inserts Them After Parent-Most Elements
Update! Instead of $(eval(options.insertAfter)).after(data['insertData']); I now use: var ajaxNode = document.createElement('span'); var parent = $(eval(options.insertAfter))[0].parentNode; ... read »
Feb 10, 2012 at 6:18 PM
jQuery AJAX Strips Script Tags And Inserts Them After Parent-Most Elements
encountered this same, what I consider, jQuery bug last week. I'm building a site in which I load some content via AJAX. This content contains Linkedin share button placeholders which Linkedin API ne ... read »
Feb 10, 2012 at 11:30 AM
Cross-Origin Resource Sharing (CORS) AJAX Requests Between jQuery And Node.js
After you understand the concepts here, this is an awesome cheatsheet for enabling CORS in just about anything http://enable-cors.org/ ... read »
JM
Feb 10, 2012 at 9:10 AM
My Safari Browser SQLite Database Hello World Example
@Amy, Here is a very good tutorial on how to use JOIN: http://www.sqltutorial.org/sqljoin-innerjoin.aspx ... read »
Feb 10, 2012 at 4:42 AM
Building A Twitter-Inspired RESTful API Architecture In ColdFusion
This is great, very useful Ben. I spotted a small typo in the api.cgm listing: <cfthrow type="Unauthroized" /> Cheers Stefan ... read »
Feb 9, 2012 at 10:35 PM
CFDirectory Filtering Uses Pipe Character For Multiple Filters (Thanks Steve Withington)
I was wondering if there would be a filter you could apply so that you got everything but what you included in the filter. As in show me all docs that are not a .pdf. ... read »