Ask Ben: Using Regular Expressions To Parse Data In ColdFusion

Posted February 15, 2009 at 4:33 PM by Ben Nadel

Tags: ColdFusion, Ask Ben

I really like how you used JavaScript's String replace method last week to parse data with regular expressions. This seems like a really powerful tool. Is there any way to do the same thing in ColdFusion?

In JavaScript, this is possible, in part, because you can define anonymous, inline functions that get executed for each pattern match in the String replace() method. In ColdFusion, you cannot define inline methods in this manner. You can, however, use something like RELoop, my ColdFusion custom tag for looping over regular expression pattern matches, to accomplish something similar.

Since it's been a while since I've mentioned my RELoop tag, I'll give you a quick summary. RELoop is a ColdFusion custom tag that loops over regular expression pattern matches in a target string and makes each match available, in turn, for one iteration of the loop. The match can be returned as a simple string or as a collection of captured groups. These can be used as they are, or modified and stored back into the target string, resulting in a new text value.

Here is a summary of the tag attributes:

Index

This is the CALLER-scoped variable into which we will store the contextual match. This may be a string or a struct depending on whether the user wants the Groups returned (as defined by the ReturnGroups attribute below).

Text

This is the target text in which we will be searching for and iterating over the regular expression pattern matches.

Pattern

This is the regular expression pattern that we will be matching. Since we are using the Java regular expression engine, this goes by the java.util.regex.Pattern syntax, not necessarily by the ColdFusion regex syntax (there are slight differences in the way these two engines operate - the Java regular expression engine is faster, more powerful, and more robust).

Variable

This is the optional, CALLER-scoped variable into which the resulting string will be placed. This uses the Java Matcher's appendReplacement() and appendTail() methods to build up a new text value. At the end of each pattern match iteration, it replaces the current match with whatever you have stored in the Index variable. Note: If no Variable attribute is defined, the internal algorithm does not waste time creating a new string value.

ReturnGroups

This flags whether to return the single matched pattern or to return a structure with the set of captured groups. If a structure is returned, it stores the entire match in the "0" key. It then stores each captured group in the group-based index key. Additionally, it returns the number of captured groups in the GroupCount key.
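To make these attribute descriptions a bit more concrete, here is a rough sketch of the kind of Java regular expression loop that a tag like RELoop could be built on. This is not the tag's actual source code; the variable names (strText, objMatcher, objBuffer, objGroups, strReplacement) and the swap-the-pair replacement are made up for illustration. It shows the three ideas above: a java.util.regex pattern, a groups struct keyed by group index, and appendReplacement() / appendTail() building up a new string.

```cfml
<!--- Illustrative input; any text and pattern would do. --->
<cfset strText = "[a=b][c=D]" />

<!--- Compile the pattern using the Java regular expression engine. --->
<cfset objPattern = createObject( "java", "java.util.regex.Pattern" ).compile(
	javaCast( "string", "\[(\w+)=([^\]]*)\]" )
	) />

<!--- Get a matcher that will walk over the target text. --->
<cfset objMatcher = objPattern.matcher( javaCast( "string", strText ) ) />

<!--- This buffer collects the rewritten text (the "Variable" idea). --->
<cfset objBuffer = createObject( "java", "java.lang.StringBuffer" ).init() />

<!--- Keep looping for as long as the matcher finds another match. --->
<cfloop condition="objMatcher.find()">

	<!--- Build a struct shaped like the ReturnGroups description. --->
	<cfset objGroups = {} />
	<cfset objGroups[ "GroupCount" ] = objMatcher.groupCount() />

	<!--- Group 0 is the entire match; 1..N are the captured groups. --->
	<cfloop index="intGroup" from="0" to="#objMatcher.groupCount()#">
		<cfset objGroups[ intGroup ] = objMatcher.group( javaCast( "int", intGroup ) ) />
	</cfloop>

	<!--- Assumed replacement for this sketch: swap the name and value. --->
	<cfset strReplacement = "[#objGroups[ 2 ]#=#objGroups[ 1 ]#]" />

	<!--- quoteReplacement() escapes any "$" or "\" in the new value. --->
	<cfset objMatcher.appendReplacement(
		objBuffer,
		createObject( "java", "java.util.regex.Matcher" ).quoteReplacement(
			javaCast( "string", strReplacement )
			)
		) />

</cfloop>

<!--- Copy over whatever text follows the last match. --->
<cfset objMatcher.appendTail( objBuffer ) />

<cfset strResult = objBuffer.toString() />
```

With the sample input above, strResult should come out as "[b=a][D=c]". Skipping the buffer work entirely when no new string is wanted is presumably what the note on the Variable attribute refers to.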

OK, now that we are up to speed on my RELoop ColdFusion custom tag, let's duplicate last week's JavaScript demo in ColdFusion:

<!--- Store our test data. --->
<cfset strData = "[event=action][id=longuuid][a=b][c=D]" />

<!--- Create our data collection of name-value pairs. --->
<cfset objData = {} />


<!---
	Using our regular expression loop ColdFusion custom tag,
	loop over the matches, grabbing the captured groups on each
	pattern iteration.
--->
<cf_reloop
	index="objMatch"
	text="#strData#"
	pattern="\[(\w+)=([^\]]*)\]"
	returngroups="true">

	<!---
		Store the name-value pair into our collection using
		the first (name) and second (value) captured groups from
		the pattern.
	--->
	<cfset objData[ objMatch[ 1 ] ] = objMatch[ 2 ] />

</cf_reloop>


<!--- Output our name-value collection. --->
<cfdump
	var="#objData#"
	label="Name-Value Collection"
	/>

Just as in the JavaScript version of this algorithm, I am looping over each regular expression pattern match. Then, for each match, I am simply storing the name-value pair, as defined by our two captured groups, into our data collection. When we run the above code and output the final data struct, we get:

[Image: RELoop Data Structure Created By Storing Captured Groups.]

As you can see, the data string was successfully parsed into a collection of name-value entries using regular expression groups and my RELoop ColdFusion custom tag.
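Going by the Variable attribute described earlier, the same tag should also be able to rewrite the string as it loops. The following is an untested sketch based only on the attribute descriptions in this post; strMatch and strMasked are names invented for the example, and the pattern is narrowed to just the "id" pair.

```cfml
<!--- Store our test data. --->
<cfset strData = "[event=action][id=longuuid][a=b][c=D]" />

<!---
	Loop over the matches as simple strings (no groups) and
	collect the rewritten text in strMasked (a made-up name).
--->
<cf_reloop
	index="strMatch"
	text="#strData#"
	pattern="\[id=[^\]]*\]"
	returngroups="false"
	variable="strMasked">

	<!---
		Per the Variable description, whatever is left in the
		Index variable at the end of the iteration replaces the
		current match in the new string.
	--->
	<cfset strMatch = "[id=hidden]" />

</cf_reloop>

<!--- Output the rewritten string. --->
<cfoutput>#htmlEditFormat( strMasked )#</cfoutput>
```

If the tag behaves as described, strMasked would hold the original string with only the id pair swapped out, while strData itself stays untouched.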




Reader Comments

Feb 15, 2009 at 4:47 PM

That's Exactly what I was looking for, thanks Ben!
Sick tag too, first time I've seen it


Feb 15, 2009 at 5:01 PM

@Don,

Awesome man, glad you like it. I love regular expressions and looping over the pattern matches is definitely something that needs to be more built into the ColdFusion feature set. I have used this tag a bunch of times and really really like its functionality.

