
REReplace() + Java + Function Pointers = Freakin' Sexy!

By Ben Nadel
Tags: ColdFusion

I blogged a little while ago about how awesome it is that JavaScript's String::replace() method can take a function pointer as its replace-with argument and pass each match to that function for evaluation. Unfortunately, I cannot find a native way to do this in ColdFusion. It seems that the replace-with argument gets evaluated once; there is no per-match evaluation.

This, however, IS possible if you leverage the Pattern Matcher in Java. The Pattern Matcher runs through a string and finds every matching substring. So, what I was trying to do was mimic the way JavaScript performs this. As I have demonstrated before, JavaScript can take either a string OR a function pointer as the replace-with argument. I wanted to do the same thing:

  • <!--- This takes a string replace-with argument. --->
  • <cfset strText = JREReplace(
  • strText,
  • "([a-z]+)",
  • "--->$1<---",
  • "ALL"
  • ) />
  •  
  • <!--- This takes a function pointer replace-with argument. --->
  • <cfset strText = JREReplace(
  • strText,
  • "([a-z]+)",
  • fnHelperMethod,
  • "ALL"
  • ) />

Notice that the REReplace method name starts with "J". That is because it performs JAVA replace calls. You have to be careful with Java, as it uses a slightly different notation: groups are referenced via the dollar sign "$" and NOT the backslash "\" that ColdFusion's regular expression replace uses.
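
To make the notation difference concrete, here is a minimal sketch in plain Java (not ColdFusion) of the same replace-with string used in the call above. The class and method names are made up for illustration only; the one real API involved is java.lang.String's replaceAll().

```java
final class GroupReferenceDemo {

    // Wraps every run of lowercase letters in arrows.
    // "$1" refers to the first captured group; a ColdFusion REReplace()
    // call would write that back-reference as "\1" instead.
    static String wrapWords(String text) {
        return text.replaceAll("([a-z]+)", "--->$1<---");
    }

    public static void main(String[] args) {
        // Prints: --->hello<--- --->world<---
        System.out.println(wrapWords("hello world"));
    }
}
```

One consequence worth keeping in mind: because "$" is special in a Java replace-with string, a literal dollar sign has to be escaped as "\$".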

OK, so on to the meat of the demonstration. Let's take a look at the user-defined function I wrote. I have broken up the lines a bit to reduce horizontal scrolling (this is not how it appears in my code):

  • <cffunction name="JREReplace" access="public" returntype="string" output="false"
  • hint="This performs Java REReplaces on a string.">
  •  
  • <!--- Define arguments. --->
  • <cfargument name="Text" type="string" required="true" />
  • <cfargument name="Pattern" type="string" required="true" />
  • <cfargument name="Target" type="any" required="true" />
  • <cfargument name="Scope" type="string" required="false" default="ONE" />
  •  
  • <cfscript>
  •  
  • // Define the local scope.
  • var LOCAL = StructNew();
  •  
  • // Check to see if we are using a string replace or a method
  • // helper replace.
  • if (IsSimpleValue( ARGUMENTS.Target )){
  •  
  • // We are doing a standard string replace, so just
  • // use Java's string replacement. Check the scope.
  • if (NOT Compare( ARGUMENTS.Scope, "ALL" )){
  •  
  • // Replace all.
  • return(
  • CreateObject( "java", "java.lang.String" ).Init(
  • ARGUMENTS.Text
  • ).ReplaceAll(
  • ARGUMENTS.Pattern, ARGUMENTS.Target
  • )
  • );
  •  
  • } else {
  •  
  • // Replace one.
  • return(
  • CreateObject( "java", "java.lang.String" ).Init(
  • ARGUMENTS.Text
  • ).ReplaceFirst(
  • ARGUMENTS.Pattern,
  • ARGUMENTS.Target
  • )
  • );
  •  
  • }
  •  
  • } else {
  •  
  • // We are using a function here to replace out the
  • // groups. That means that matches have to be
  • // evaluated and replaced on an individual basis.
  • // Create the java pattern compiled from the given regular
  • // expression.
  • LOCAL.Pattern = CreateObject(
  • "java",
  • "java.util.regex.Pattern"
  • ).Compile(
  • ARGUMENTS.Pattern
  • );
  •  
  • // Create the java matcher based on the given text using the
  • // compiled regular expression.
  • LOCAL.Matcher = LOCAL.Pattern.Matcher( ARGUMENTS.Text );
  •  
  • // Create a string buffer to hold the results.
  • LOCAL.Results = CreateObject(
  • "java",
  • "java.lang.StringBuffer"
  • ).Init();
  •  
  • // Loop over the matcher while we still have matches.
  • while ( LOCAL.Matcher.Find() ){
  •  
  • // We are going to build an array of matches.
  • LOCAL.Groups = ArrayNew( 1 );
  • for (
  • LOCAL.GroupIndex = 1 ;
  • LOCAL.GroupIndex LTE LOCAL.Matcher.GroupCount() ;
  • LOCAL.GroupIndex = (LOCAL.GroupIndex + 1)
  • ){
  •  
  • // Add the current group to the array of groups.
  • ArrayAppend(
  • LOCAL.Groups,
  • LOCAL.Matcher.Group( JavaCast( "int", LOCAL.GroupIndex ) )
  • );
  •  
  • }
  •  
  • // Replace the current match. Be sure to get the value by
  • // using the helper function.
  • LOCAL.Matcher.AppendReplacement(
  • LOCAL.Results,
  •  
  • // Call the target function pointer using function notation.
  • ARGUMENTS.Target(
  • LOCAL.Matcher.Group(),
  • LOCAL.Groups
  • )
  • );
  • // Check to see if we need to break out of this.
  • if (NOT Compare( ARGUMENTS.Scope, "ONE" )){
  • break;
  • }
  •  
  • }
  •  
  • // Add whatever is left of the text.
  • LOCAL.Matcher.AppendTail( LOCAL.Results );
  •  
  • // Return the string buffer.
  • return( LOCAL.Results.ToString() );
  • }
  •  
  • </cfscript>
  • </cffunction>

The first thing you should see here is that the "Target" argument is of type "any". This is because it has to accept both a string value AND a function pointer. The second thing you should notice is that the function branches based on the type of the Target argument. If it is a simple value (i.e. a string), it merely returns the result of Java's standard ReplaceAll() or ReplaceFirst() method call. It only goes through the extra processing of looping over the matches if it HAS to. The third thing you should notice is that when the function pointer gets evaluated (called as a function), it gets passed two arguments: the string that was matched and an array of the groups that were captured.
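
For readers who want to see the function-pointer branch without the ColdFusion wrapping, the following is a rough sketch of the same loop in plain Java. It is an illustration under assumptions, not a translation of the UDF above: the class name CallbackReplace and its replaceAll() signature are hypothetical, the callback is a Java 8+ lambda, and it only covers the "ALL" scope. It also adds one thing the UDF does not do: it passes the callback's return value through Matcher.quoteReplacement() (Java 5+), so that any "$" or "\" the callback returns is treated literally rather than as a group reference.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CallbackReplace {

    // Replaces every match of the regex with whatever the callback returns.
    // The callback receives the full match and the list of captured groups.
    static String replaceAll(
            String text,
            String regex,
            BiFunction<String, List<String>, String> callback) {

        Matcher matcher = Pattern.compile(regex).matcher(text);
        StringBuffer results = new StringBuffer();

        while (matcher.find()) {
            // Collect the captured groups (group 0 is the whole match).
            List<String> groups = new ArrayList<>();
            for (int i = 1; i <= matcher.groupCount(); i++) {
                groups.add(matcher.group(i));
            }

            String replacement = callback.apply(matcher.group(), groups);

            // Assumption: callback output should be inserted literally,
            // so "$" and "\" are escaped before appending.
            matcher.appendReplacement(
                results, Matcher.quoteReplacement(replacement));
        }

        // Copy over whatever follows the last match.
        matcher.appendTail(results);
        return results.toString();
    }

    public static void main(String[] args) {
        // Prints: a <b>GIRL</b>, so <b>CUTE</b>
        System.out.println(replaceAll(
            "a girl, so cute",
            "(girl|cute)",
            (match, groups) -> "<b>" + match.toUpperCase() + "</b>"));
    }
}
```

Whether to quote the callback's output is a design choice. Without quoting, a helper that returns something like "$5" would be read as a reference to group 5 and would most likely throw an error.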

Let's look more closely at those two arguments for a second. Let's say you had the following expression:

  • (<a [^>]+>)(.*?)(</a>)

... and it matched the following text:

  • <a href="http://galleries.sweetteencandy.com/d001/">Hot Girls!</a>

... the first argument passed to your target function would be the entire matched string. The second argument would be the array of matched groups, which would be:

  1. <a href="http://galleries.sweetteencandy.com/d001/">
  2. Hot Girls!
  3. </a>

You have to design your helper function to handle these arguments. Of course, you have the option to ignore the second argument altogether and just use the matched string.
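
The group breakdown above can be checked directly against Java's Matcher. This is a small sketch in plain Java; the class name, the method name, and the example.com link are placeholders invented for the illustration, while the regular expression is the one shown above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AnchorGroupsDemo {

    // Returns the captured groups of the first match, or an empty list
    // when the pattern does not match at all.
    static List<String> captureGroups(String regex, String text) {
        List<String> groups = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(text);
        if (matcher.find()) {
            // Groups are 1-based; group 0 would be the entire match.
            for (int i = 1; i <= matcher.groupCount(); i++) {
                groups.add(matcher.group(i));
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        // Placeholder link, used only for this illustration.
        String link = "<a href=\"http://example.com/\">Click here</a>";
        List<String> groups = captureGroups("(<a [^>]+>)(.*?)(</a>)", link);

        // Prints:
        // 1. <a href="http://example.com/">
        // 2. Click here
        // 3. </a>
        for (int i = 0; i < groups.size(); i++) {
            System.out.println((i + 1) + ". " + groups.get(i));
        }
    }
}
```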

Let's look at what the helper function might be:

  • <cffunction name="REReplaceHelper" access="public" returntype="string" output="false"
  • hint="Evaluates the matches in a JREReplace and returns a value.">
  •  
  • <!--- Define arguments. --->
  • <cfargument name="Match" type="string" required="true" />
  • <cfargument name="Groups" type="array" required="false" default="#ArrayNew( 1 )#" />
  •  
  • <cfscript>
  •  
  • // The Groups argument contains an array of groups that
  • // were matched. Each group is denoted by a pair of
  • // () in the matched regular expression pattern.
  • // Check to see what the matched value is. For each
  • // matched value, we are just returning a different value.
  • switch ( ARGUMENTS.Match ){
  •  
  • case "girl":
  • return( "<b>lady</b>" );
  • break;
  •  
  • case "cute":
  • return( "<b>wicked</b> cute" );
  • break;
  •  
  • case "melons":
  • return( "<b>juicy</b> melons" );
  • break;
  •  
  • // No expected match was found.
  • default:
  • return( "<b>??</b>" );
  • break;
  • }
  •  
  • </cfscript>
  • </cffunction>

Now, let's pull it all together. Let's set up some demo text to play around with:

  • <cfsavecontent variable="strText">
  •  
  • I saw this girl over at the market the other
  • day shopping for melons. I was shopping for
  • melons, not her... That is to say, I was
  • shopping for melons, not shopping for her.
  • Anyway, the was very cute. So cute in fact,
  • that I could not work up the nerve to
  • talk to her. Oh well.
  •  
  • </cfsavecontent>

Let's try calling this both ways, starting with the standard string replace:

  • <cfset strNewText = JREReplace(
  • strText,
  • "(girl|cute|melons)",
  • "<b>$1</b>",
  • "ALL"
  • ) />

In this case, we are just bolding the matched string and it gives us:

I saw this <b>girl</b> over at the market the other
day shopping for <b>melons</b>. I was shopping for
<b>melons</b>, not her... That is to say, I was
shopping for <b>melons</b>, not shopping for her.
Anyway, the was very <b>cute</b>. So <b>cute</b> in fact,
that I could not work up the nerve
to talk to her. Oh well.

Now, let's get crazy sexy and call it using our helper function pointer:

  • <cfset strNewText = JREReplace(
  • strText,
  • "(girl|cute|melons)",
  • REReplaceHelper,
  • "ALL"
  • ) />

This passes each match to our helper function and gives us:

I saw this <b>lady</b> over at the market the other
day shopping for <b>juicy</b> melons. I was shopping for
<b>juicy</b> melons, not her... That is to say, I was
shopping for <b>juicy</b> melons, not shopping for her.
Anyway, the was very <b>wicked</b> cute. So <b>wicked</b> cute in fact,
that I could not work up the nerve to
talk to her. Oh well.

How awesome is that? It might take you a minute to fully grasp how powerful this is. Have you ever tried to create insanely complex regular expressions? Imagine being able to write much simpler regular expressions and let the helper function do most of the heavy lifting. You have access to the ENTIRE COLDFUSION TAG/FUNCTION LIBRARY. That is power, my friends: much more power than regular expressions alone will give you.

It's not quite as powerful and flexible as the JavaScript implementation, but hey, it's pretty darn nifty.
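
As one small illustration of the "simple pattern, smart helper" idea, here is a sketch in plain Java. It assumes Java 9 or later, where Matcher has a built-in replaceAll() overload that accepts a function; that overload is not what the UDF above uses, and the class and method names are invented for the example. The pattern only says "a run of digits", and the arithmetic lives entirely in the callback.

```java
import java.util.regex.Pattern;

final class SimplePatternSmartCallback {

    // Doubles every whole number found in the text. The regular expression
    // stays trivial; the real logic happens in the callback.
    static String doubleNumbers(String text) {
        return Pattern.compile("\\d+")
            .matcher(text)
            .replaceAll(match ->
                String.valueOf(Integer.parseInt(match.group()) * 2));
    }

    public static void main(String[] args) {
        // Prints: 6 apples and 24 pears
        System.out.println(doubleNumbers("3 apples and 12 pears"));
    }
}
```

Note that this built-in overload still treats "$" and "\" in the callback's return value as replacement syntax, so a callback that may return those characters should wrap its result in Matcher.quoteReplacement().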



Reader Comments

Hi Ben. Like you, I'm using a local structure (cfset local = StructNew()) to initialize variables inside functions. Better practices... But unlike you, I put several variables into it, including the function parameters. What do you think? Cheers

@Marco,

You can put anything you want into LOCAL. I don't see the need to insert the function parameters as they already exist in the ARGUMENTS scope. However, it should be fine to do so.

Once again, I needed some CF code and rather than having to figure it out myself, you've already written it, and come up with a custom function to access it. Brilliant! :)
Thanks again.

@Gareth Arch, sometimes, that's the best way! :-) Why re-invent the wheel when it is already so beautifully designed in the first place?