Ben Nadel

RELoop ColdFusion Custom Tag Case Study

By Ben Nadel on
Tags: ColdFusion

A while back, I created a ColdFusion custom tag that loops over text using regular expression patterns rather than list delimiters. For each iteration, it returns a variable that contains either the matched string or the matched groups (depending on the tag attributes used). These values can then be modified and stored back into the original string. The whole point of the ColdFusion custom tag was to mimic the powerful use of anonymous functions in JavaScript's string replace() method.

There were not many people who were convinced that it was a good idea, but today on the CF-Talk list, I saw what could be a really convenient use case for it. Here is the problem that Josh Nathanson asked about:

Got a regex challenge... I was able to solve it using an REFind and then REReplace, but I'm wondering if anyone can come with a "one-shot" way to replace without looping. I need to remove any carriage returns within a quoted string, but not touch them if they are outside quotes. So:

"the quick brown fox \r\n jumps over the \r\n lazy dog" <-- remove the \r\n's
My name is mud \r\n <-- leave this one alone

I'm sure this is probably easy for the regex gurus...

As it turns out, this is not the easiest kind of regular expression to code when you want to take care of it all in one shot, but it is a task that becomes quite easy when you use my RELoop ColdFusion custom tag. Take a look:

<!---
    Build up our test data. This data will have line
    breaks inside and outside of quoted values.
--->
<cfsavecontent variable="strText">

Hey there, here some text that is not quoted
that has line breaks in it. Then, here is some
"quoted text that also
has some line breaks" in it. Of course, not
all "quoted text" needs to
have "line
breaks" in it; that is only going to happend some
of the time and we want to be sure not to replace
out the line breaks that are NOT "within quoted
values".

</cfsavecontent>


<!---
    Replacing the line breaks directly in the regular
    expression is gonna be a huge pain in the butt, so
    we are gonna do the next best thing - we are gonna
    find all the quoted values and then act on them
    individually.
--->
<cf_reloop
    index="strValue"
    text="#strText#"
    pattern="(""[^""]*"")"
    variable="strText">

    <!---
        Now that we have a quoted value, just replace the line
        breaks and carriage returns. This might be overkill some
        of the time, but it is the easy solution.
    --->
    <cfset strValue = strValue.ReplaceAll(
        JavaCast( "string", "[\r\n]+" ),
        JavaCast( "string", " " )
        ) />

</cf_reloop>


<!---
    When we output the new text, we are going to replace the
    newlines / carriage returns with <br /> tags so that we
    can see where the line breaks exist in an HTML context.
--->
<cfoutput>
    <p>
        #strText.ReplaceAll(
            JavaCast( "string", "\r\n" ),
            JavaCast( "string", "<br />" )
            )#
    </p>
</cfoutput>

First, I build up a chunk of text that has line breaks in both the quoted and the non-quoted parts. Then, I use RELoop to iterate over all of the quoted values; inside the loop, it takes just one simple ReplaceAll() call to clear out the line breaks. There is some overkill here, as you end up running replaces on quoted values that don't contain any line breaks; however, I think the time and effort you save by not writing an insane regular expression is worth the overhead of a few extraneous replace calls. Running the above code, we get the following output:

Hey there, here some text that is not quoted
that has line breaks in it. Then, here is some
"quoted text that also has some line breaks" in it. Of course, not
all "quoted text" needs to
have "line breaks" in it; that is only going to happend some
of the time and we want to be sure not to replace
out the line breaks that are NOT "within quoted values".

It's almost too easy. In my gut, I really feel like this kind of a custom tag is useful, but maybe it's just for a few cases.
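
For comparison, here is a rough sketch of the same "find each quoted value, then fix it individually" idea written directly against Java's regex classes. This is not the RELoop source code; the class and method names (QuotedLineBreaks, stripLineBreaksInQuotes) are made up for illustration, and it only assumes the same simple pattern used above (no escaped quotes).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of RELoop: removes line breaks that fall
// inside double-quoted values and leaves all other line breaks alone.
class QuotedLineBreaks {

    // Same pattern as the example above: a quote, any run of
    // non-quote characters, and a closing quote.
    private static final Pattern QUOTED = Pattern.compile("\"[^\"]*\"");

    static String stripLineBreaksInQuotes(String text) {
        Matcher matcher = QUOTED.matcher(text);
        StringBuffer result = new StringBuffer();

        // Visit each quoted value, one match at a time.
        while (matcher.find()) {
            // Collapse runs of carriage returns / newlines into one space.
            String value = matcher.group().replaceAll("[\\r\\n]+", " ");

            // quoteReplacement() keeps any "$" or "\" in the value from
            // being treated as a back-reference in the replacement.
            matcher.appendReplacement(result, Matcher.quoteReplacement(value));
        }

        // Copy over whatever follows the last match.
        matcher.appendTail(result);
        return result.toString();
    }

    public static void main(String[] args) {
        String text = "Not quoted\r\nhere, but \"quoted\r\nhere\" and\r\n\"also\r\nhere\".";
        System.out.println(stripLineBreaksInQuotes(text));
    }
}
```

The find / appendReplacement / appendTail loop plays the same role as the body of the custom tag: each match is handed to ordinary code, which can do whatever it wants before the value is written back.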


Reader Comments

Hey Ben,

Haven't played with your solution to test it, but it appears that you're not taking into account escaped quotes within a quoted string, so for something like the following string:

She said: "If you want to quote something add a "" symbol to start of \r\n
the string,\r\n
and a "" to the end\r\n
of the text you're\r\n
working with."

Just made that up so there is probably a better example out there, but it does seem that the RegExp you're using only matches pairs of quotes, not matching pairs of quotes.

Again, was just thinking about the escaped quote issue and haven't tested it to see if your solution accommodates them already.


@Danilo,

My current regular expression does not take into account any concept of escaping. It's tough to do because, depending on the context of the problem, escaping means different things. If you are looking at CSV (comma separated values) data, then an escaped quote would be "". If you were looking at JavaScript data, an escaped quote might look like \".

You can write regular expressions that take into account escaped quotes, and this can be done using what Steve Levithan showed me was called "unrolling the loop":

http://www.bennadel.com/index.cfm?dax=blog:978.view

But again, you can only do this when you know the way escaping is done and the context of the problem.
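
As a rough illustration of what that might look like, here are two "unrolled" patterns sketched in Java, one for each of the escaping styles mentioned above. The class name, constant names, and the findAll() helper are hypothetical and exist only for this example; treat the patterns as a starting point rather than a tested solution for any particular data format.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical examples of quoted-value patterns that allow escaped quotes.
class EscapedQuotePatterns {

    // CSV-style: a doubled quote ("") inside the value is an escaped quote.
    // Shape: " normal* ( "" normal* )* "
    static final Pattern CSV_QUOTED =
        Pattern.compile("\"[^\"]*(?:\"\"[^\"]*)*\"");

    // JavaScript-style: a backslash escapes the character after it.
    // Shape: " normal* ( \x normal* )* "
    // DOTALL lets the escaped character be a line break as well.
    static final Pattern JS_QUOTED =
        Pattern.compile("\"[^\"\\\\]*(?:\\\\.[^\"\\\\]*)*\"", Pattern.DOTALL);

    // Collect every match of the given pattern, in order.
    static List<String> findAll(Pattern pattern, String text) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = pattern.matcher(text);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(findAll(CSV_QUOTED, "say \"a \"\"b\"\" c\" end"));
        System.out.println(findAll(JS_QUOTED, "say \"a \\\"b\\\" c\" end"));
    }
}
```

Either pattern could stand in for the simpler one in the RELoop example, but only when you already know which escaping convention the text follows.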


I think this is very useful. In fact, I have run across several problems for which this would have been very useful. Next time I run into one, I will definitely have to download this and try it out.

I really like that it is a custom tag. This allows me to take any type or number of actions within the loop.

What version of CF does it require?


@Steve,

I have tested it in ColdFusion MX7. It uses the Java Pattern object behind the scenes (for more powerful and easier iteration), so it needs to be MX of some sort. I haven't tested it on MX6, but as long as it can use CreateObject( "java" ), it should be fine.

Also, you have the option to return a struct of data rather than a simple string; the struct contains the indexed-group matching, the group count, and the character offset of the match. I tried to make it as flexible as possible. Glad you might find it useful.
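
To give a feel for the kind of data that is available per match, here is a small Java sketch that gathers the matched text, the captured groups, the group count, and the offset from a Matcher. The MatchDetails and MatchInfo names are invented for this example and the field layout is an assumption, not the actual struct that the tag returns. Note that Java offsets are zero-based, which may differ from what a ColdFusion tag would report.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of per-match data; not the RELoop struct itself.
class MatchDetails {

    // One record per match found in the input.
    static final class MatchInfo {
        final String text;          // the full matched string
        final List<String> groups;  // captured groups 1..groupCount
        final int groupCount;       // number of capturing groups in the pattern
        final int offset;           // zero-based start index of the match

        MatchInfo(String text, List<String> groups, int groupCount, int offset) {
            this.text = text;
            this.groups = groups;
            this.groupCount = groupCount;
            this.offset = offset;
        }
    }

    static List<MatchInfo> collect(String regex, String input) {
        List<MatchInfo> results = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);

        while (matcher.find()) {
            List<String> groups = new ArrayList<>();
            // Group 0 is the whole match, so captured groups start at 1.
            for (int i = 1; i <= matcher.groupCount(); i++) {
                groups.add(matcher.group(i));
            }
            results.add(new MatchInfo(
                matcher.group(), groups, matcher.groupCount(), matcher.start()));
        }
        return results;
    }

    public static void main(String[] args) {
        for (MatchInfo info : collect("(\\w+)=(\\d+)", "a=1, bb=22")) {
            System.out.println(info.offset + ": " + info.text + " " + info.groups);
        }
    }
}
```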


@Ben,

Thanks for the link. I was also thinking about the different escape characters for different languages and contexts, plus what happens when you combine the two, like a CF string containing an escaped string that has an embedded JavaScript code sample. The mind just goes a little wonky.

Thanks for the link, I'll take a look at it if/when I need such RegExp operations.
