ColdFusion RE NoCase Functions vs. Case Insensitive RegEx Flag

Posted October 14, 2006 at 11:03 AM by Ben Nadel

Tags: ColdFusion

Just out of curiosity, I wanted to see if there was any speed difference between using the NoCase regular expression functions built into ColdFusion (REFindNoCase() and REReplaceNoCase()) and using the standard regular expression functions (REFind() and REReplace()) with the regular expression case-insensitive flag (?i). To test this, I set up a large string and replaced out all of the words.
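As a quick illustration of the two approaches being compared, here is a minimal sketch on a tiny string. The sample string and variable names are made up for this example and are not part of the test that follows.

```cfml
<!--- Hypothetical sample string; the target word is in upper case. --->
<cfset strSample = "Legs like TREES" />

<!--- NoCase function: matches regardless of case. Returns 11. --->
<cfset intNoCasePos = REFindNoCase( "\btrees\b", strSample ) />

<!--- Standard function plus the (?i) flag. Also returns 11. --->
<cfset intFlagPos = REFind( "(?i)\btrees\b", strSample ) />

<!--- Standard function without the flag is case-sensitive. Returns 0. --->
<cfset intPlainPos = REFind( "\btrees\b", strSample ) />

<cfoutput>#intNoCasePos# / #intFlagPos# / #intPlainPos#</cfoutput>
```

Both case-insensitive forms find the word at the same position; only the plain REFind() call misses it.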

First, I had to set up the large test string. It has to be large because ColdFusion is so freakin awesome that everything it does on small data is insanely fast.

    <!--- Set up a test string. --->
    <cfsavecontent variable="strText">
    Down the road, in a gym far away
    A young man was heard to say,
    "No matter what I do, my legs won't grow!"
    He tried leg extensions, leg curls, leg presses too.
    Trying to cheat, these sissy workouts he'd do!
    From the corner of the gym where the big guys train,
    Through a cloud of chalk and the midst of pain,
    Where the big iron rides high, and threatens lives,
    Where the noise is made with big forty-fives,
    A deep voice bellowed as he wrapped his knees,
    A very big man with legs like trees,
    Laughing as he snatched another plate from the stack,
    Chalked his hands and monstrous back,
    Said, "Boy, stop lying and don't say you've forgotten!
    Trouble with you is you ain't been SQUATTIN'!"
    </cfsavecontent>

    <!---
        Now, we are going to repeat the string a number of
        times just to make a really big string.
    --->
    <cfset strText = RepeatString(
        REReplace( strText, "[,!'"".-]+", "", "ALL" ),
        20
        ) />

Notice that I am repeating that string 20 times. This should make it a good size. I am also stripping out all the junk characters so that I don't have to deal with them.

Now, I need to get the words to replace. We are going to replace every word in the passage. So, to get all the words, we are going to treat the passage as a list and convert it to an array using several different list delimiters:

    <!---
        Let's set up an array of words that we want to find
        and replace with case insensitivity. Let's use every
        single word in the passage as a word to replace.
    --->
    <cfset arrWords = ListToArray(
        strText,
        " ,'!""-#Chr( 13 )##Chr( 10 )#"
        ) />

Now, I am going to test the speed of the REFindNoCase() and the REReplaceNoCase() methods. You will notice that in my Replace method, I am only replacing ONE match at a time. This is done only so that the replace will take longer and we will be more likely to see a difference in speed.

    <!---
        Now, let's get a copy of the passage for the first round
        of testing. This will test the case insensitive search
        using the built-in ColdFusion replace function.
    --->
    <cfset strTargetText = strText />

    <!--- Set up the timer. --->
    <cftimer label="ColdFusion REReplaceNoCase" type="outline">

        <!--- Loop over the words to replace. --->
        <cfloop
            index="intI"
            from="1"
            to="#ArrayLen( arrWords )#"
            step="1">

            <!---
                Keep looping while there is still a reference to
                this word. We are only going to replace one at a
                time to make it slower.
            --->
            <cfloop condition="REFindNoCase( '\b#UCase( arrWords[ intI ] )#\b', strTargetText )">

                <!---
                    Replace the word with an empty string. When using
                    the word, convert it to upper case just to make
                    sure we are doing case-insensitive.
                --->
                <cfset strTargetText = REReplaceNoCase(
                    strTargetText,
                    "\b#UCase( arrWords[ intI ] )#\b",
                    "",
                    "ONE"
                    ) />

            </cfloop>

        </cfloop>

    </cftimer>

This ran, on average, in about 1,437 ms.

Now, let's test the REFind() and REReplace() methods. Notice that we are doing the exact same thing; the only difference is that we are using the case-insensitive flag (?i) instead of the NoCase methods:

    <!---
        Now, let's get a copy of the passage for the next round
        of testing. This will test the case insensitive search
        using the case insensitive flag with the built-in
        ColdFusion case sensitive search function.
    --->
    <cfset strTargetText = strText />

    <!--- Set up the timer. --->
    <cftimer label="ColdFusion REReplace" type="outline">

        <!--- Loop over the words to replace. --->
        <cfloop
            index="intI"
            from="1"
            to="#ArrayLen( arrWords )#"
            step="1">

            <!---
                Keep looping while there is still a reference to
                this word. We are only going to replace one at a
                time to make it slower.
            --->
            <cfloop condition="REFind( '(?i)\b#UCase( arrWords[ intI ] )#\b', strTargetText )">

                <!---
                    Replace the word with an empty string. When using
                    the word, convert it to upper case just to make
                    sure we are doing case-insensitive.
                --->
                <cfset strTargetText = REReplace(
                    strTargetText,
                    "(?i)\b#UCase( arrWords[ intI ] )#\b",
                    "",
                    "ONE"
                    ) />

            </cfloop>

        </cfloop>

    </cftimer>

This ran, on average, in about 1,344 ms.

So, there seemed to be a slight speed advantage (roughly 93 ms, or about 6%) to using the case-insensitive flag over the NoCase methods. However, we had to run a really inefficient test to see any difference, and the results were not always consistent: sometimes the NoCase methods were faster, but on average they were just a bit slower.
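Since individual runs varied, one way to get a steadier comparison is to run both approaches several times and average the results. The following is only a sketch: it assumes strText and arrWords are set up as above, uses GetTickCount() in place of CFTimer so the totals can be accumulated, and the run count of 10 is an arbitrary choice. The names intRuns, intStart, strPattern, and the total / average variables are made up for this example.

```cfml
<!--- Number of timed passes per approach (an arbitrary choice). --->
<cfset intRuns = 10 />
<cfset intNoCaseTotal = 0 />
<cfset intFlagTotal = 0 />

<cfloop index="intRun" from="1" to="#intRuns#" step="1">

    <!--- Pass 1: the NoCase functions on a fresh copy of the text. --->
    <cfset strTargetText = strText />
    <cfset intStart = GetTickCount() />

    <cfloop index="intI" from="1" to="#ArrayLen( arrWords )#" step="1">
        <cfset strPattern = "\b#UCase( arrWords[ intI ] )#\b" />

        <!--- Replace one match at a time, as in the original test. --->
        <cfloop condition="REFindNoCase( strPattern, strTargetText )">
            <cfset strTargetText = REReplaceNoCase( strTargetText, strPattern, "", "ONE" ) />
        </cfloop>
    </cfloop>

    <cfset intNoCaseTotal = intNoCaseTotal + ( GetTickCount() - intStart ) />

    <!--- Pass 2: the standard functions with the (?i) flag. --->
    <cfset strTargetText = strText />
    <cfset intStart = GetTickCount() />

    <cfloop index="intI" from="1" to="#ArrayLen( arrWords )#" step="1">
        <cfset strPattern = "(?i)\b#UCase( arrWords[ intI ] )#\b" />

        <cfloop condition="REFind( strPattern, strTargetText )">
            <cfset strTargetText = REReplace( strTargetText, strPattern, "", "ONE" ) />
        </cfloop>
    </cfloop>

    <cfset intFlagTotal = intFlagTotal + ( GetTickCount() - intStart ) />

</cfloop>

<!--- Average milliseconds per pass for each approach. --->
<cfset intNoCaseAverage = intNoCaseTotal / intRuns />
<cfset intFlagAverage = intFlagTotal / intRuns />

<cfoutput>
    REReplaceNoCase() average: #intNoCaseAverage# ms<br />
    REReplace() with (?i) average: #intFlagAverage# ms
</cfoutput>
```

Running both approaches inside the same outer loop means each one sees roughly the same server conditions on every pass, which should smooth out some of the run-to-run noise, though it will not eliminate it.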



Reader Comments

Oct 16, 2006 at 10:02 AM

Just remember two things:

Iteration testing isn't the best test of actual performance, as there are lots of things that can affect performance within a loop (such as garbage collection, other processes, etc.).

When you're talking about minuscule differences, don't forget about readability/complexity of code.

I know most of your tests are for curiosity's sake, but one problem we developers get into from time to time is going overboard in trying to squeeze 10ms out of a template only to end up making our code harder to read and maintain.


Oct 16, 2006 at 10:07 AM

Dan,

I am 100% in agreement with what you are saying. If I don't see any significant difference (which I am not seeing in this example), I opt for whichever one is the most readable / maintainable. In fact, I do so much of my RegEx directly in the Java string itself that I don't have the option of REReplaceNoCase() anyway, in which case I need the (?i) flag.

So yeah, this is all just for exploration's sake. I try to not write about any implications of one thing or another for the very reason you are talking about. So much goes into affecting performance. I just state the facts of the finding, not the "what does that mean for you" type stuff.

