
REStructFindValue() - Adding Regular Expression Searching To StructFindValue()

By Ben Nadel
Tags: ColdFusion

In my previous blog post on creating a unified interface for iterating over structs and arrays in ColdFusion, I mentioned that I had been talking to Marc Esher on Twitter. We had been talking about adding regular expression (RegEx) search capabilities to ColdFusion's StructFindValue() function. I had needed the unified iteration tag because the StructFindValue() method can recursively search over both structs and arrays. In order to not duplicate the logic for each type of object (both of which are essentially key-based collections), I built the following method, REStructFindValue(), using the Each.cfm ColdFusion custom tag.

I'm not completely happy with the way the recursion is built in the following solution, specifically because the method takes an argument that is not meant to be user-provided. As the method recurses through the nested collections, it passes a fourth "hidden argument" with the subsequent method calls in order to keep track of the growing target path. This could have been solved by creating a sister function, but I didn't like that solution either.

Before we get into the code, however, let's take a look at the context. In the following test, I'm creating a structure with nested data collections. Then, I am going to search for values contained within the total collection based on a regular expression pattern rather than an exact value match.

```cfm
<!--- Create a test data structure. --->
<cfset myData = {
    hotGirls = [
        {
            name = "Tricia",
            hair = "Brunette"
        },
        {
            name = "Kim",
            hair = "Blonde"
        }
    ],
    athleticGirls = [
        {
            name = "Tricia",
            hair = "Brunette"
        },
        {
            name = "Jen",
            hair = "Black"
        }
    ]
} />


<!--- Get all the values that match brunette, brown, OR black. --->
<cfset results = reStructFindValue(
    myData,
    "brunette|brown|black",
    "all"
    ) />

<!--- Dump out the search results. --->
<cfdump
    var="#results#"
    label="reStructFindValue( 'brunette|brown|black' )"
    />
```

As you can see, I am taking my nested structure and searching for values that match the following regular expression:

brunette|brown|black

This will find values that contain the phrases "brunette", "brown", or "black." And, just like ColdFusion's native StructFindValue() method, this returns an array of matches:

 
 
 
 
 
 
[Figure: REStructFindValue() - Regular Expression Searching With StructFindValue().]

As you can see, it found the three values in the nested structure that matched the above regular expression. I tried to keep the format of the results as close as possible to that of the StructFindValue() result collection; however, I simplified my "Path" key to always use array notation and never dot notation. Since array notation works for both arrays and structs, I felt that a uniform generated path was a good idea.
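To make the shape of each result concrete, here is a minimal sketch of walking a results array. It assumes each result carries the Key, Owner, and Path entries described above. The sample result is built by hand purely for illustration (it is not captured output), and the exact key casing and the way array indexes appear in the path may differ depending on the ColdFusion engine and on how Each.cfm reports array keys.

```cfm
<!--- Hand-built stand-in for one search result (illustrative only). --->
<cfset sampleOwner = { name = "Jen", hair = "Black" } />
<cfset sampleResult = {
    key = "hair",
    owner = sampleOwner,
    path = "[ ""athleticGirls"" ][ ""2"" ][ ""hair"" ]"
} />
<cfset results = [ sampleResult ] />

<!---
    Each result points back at its owner collection, so the matched
    value can be read as owner[ key ] without parsing the path.
--->
<cfoutput>
    <cfloop index="result" array="#results#">
        <cfset matchedValue = result.owner[ result.key ] />
        #htmlEditFormat( result.path )# = #htmlEditFormat( matchedValue )#<br />
    </cfloop>
</cfoutput>
```

Reading the value through Owner and Key keeps the calling code independent of how the Path string happens to be formatted.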

Now that we see what the new method is doing, let's take a look at the code. Keep in mind that the collection iteration used by this method is performed by my Each.cfm ColdFusion custom tag. The tag is not required, but it does simplify the method greatly.
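If the custom tag is not available, the same key/value style of iteration could be approximated with a small helper. The following is only a sketch under that assumption: collectionToEntries() is a hypothetical name, it is not part of Each.cfm, and it ignores edge cases such as arrays with undefined elements.

```cfm
<cffunction
    name="collectionToEntries"
    access="public"
    returntype="array"
    output="false"
    hint="HYPOTHETICAL helper: I flatten a struct or array into an array of key/value pairs.">

    <cfargument name="collection" type="any" required="true" />

    <cfset var entries = [] />
    <cfset var entry = "" />
    <cfset var key = "" />
    <cfset var i = 0 />

    <cfif isArray( arguments.collection )>

        <!--- For arrays, the "key" is the 1-based index. --->
        <cfloop index="i" from="1" to="#arrayLen( arguments.collection )#">
            <cfset entry = { key = i, value = arguments.collection[ i ] } />
            <cfset arrayAppend( entries, entry ) />
        </cfloop>

    <cfelseif isStruct( arguments.collection )>

        <!--- For structs, the "key" is the struct key. --->
        <cfloop item="key" collection="#arguments.collection#">
            <cfset entry = { key = key, value = arguments.collection[ key ] } />
            <cfset arrayAppend( entries, entry ) />
        </cfloop>

    </cfif>

    <!--- Anything else (simple values, queries, etc.) yields no entries. --->
    <cfreturn entries />
</cffunction>

<!--- Usage: the same loop body works for both collection types. --->
<cfset sample = { name = "Kim", hair = "Blonde" } />
<cfoutput>
    <cfloop index="entry" array="#collectionToEntries( sample )#">
        #entry.key# = #entry.value#<br />
    </cfloop>
</cfoutput>
```

The trade-off is that this builds an intermediate array for every collection, whereas a custom tag can iterate in place.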

REStructFindValue( Target, Pattern, Scope )

```cfm
<cffunction
    name="reStructFindValue"
    access="public"
    returntype="array"
    output="false"
    hint="I search for patterns within a given struct or array.">

    <!--- Define arguments. --->
    <cfargument
        name="target"
        type="any"
        required="true"
        hint="I am the target struct being searched."
        />

    <cfargument
        name="pattern"
        type="string"
        required="true"
        hint="I am the pattern being searched."
        />

    <cfargument
        name="scope"
        type="string"
        required="false"
        default="one"
        hint="I am the scope of the search: one or all."
        />

    <cfargument
        name="path"
        type="string"
        required="false"
        default=""
        hint="The path to the current target (for recursive calling). ** NOTE: This is used internally for recursion - this is NOT an expected argument to be passed in by the user."
        />

    <!--- Define the local scope. --->
    <cfset var local = {} />

    <!--- Create an array to hold the results. --->
    <cfset local.results = [] />

    <!---
        Loop over target.
        NOTE: This uses a ColdFusion custom tag that unifies
        the interface for looping over both structures and
        arrays.
        http://www.bennadel.com/go/each-iteration
    --->
    <cf_each
        item="local.item"
        collection="#arguments.target#">

        <!--- Create a variable to store the base path. --->
        <cfset local.path = arguments.path />

        <!--- Add the current key to the path. --->
        <cfset local.path &= "[ ""#local.item.key#"" ]" />

        <!--- Get a handle on the new target. --->
        <cfset local.target = local.item.value />

        <!---
            Check to see if this new target is a string (or
            if it is another complex object that we need to
            iterate over).
        --->
        <cfif isSimpleValue( local.target )>

            <!---
                Check for the pattern match on the target
                value. For now, we are going to be using
                ColdFusion's REMatchNoCase() method, which means
                a subset of regular expression usage. Furthermore,
                we are going to use NoCase for ease of coding.
            --->
            <cfif arrayLen( reMatchNoCase( arguments.pattern, local.target ) )>

                <!---
                    The regular expression pattern was found at
                    least once in the target value. This is a
                    valid match. Add it to the results.
                --->
                <cfset local.result = {
                    key = local.item.key,
                    owner = arguments.target,
                    path = local.path
                } />

                <!--- Add this result to the current results. --->
                <cfset arrayAppend( local.results, local.result ) />

            </cfif>

        <!---
            Make sure this complex nested target is one that
            we can actually iterate over (all others will be
            skipped).
        --->
        <cfelseif (
            isStruct( local.target ) ||
            isArray( local.target )
            )>

            <!---
                The nested target is not a simple value. Therefore,
                we need to perform a depth-first, recursive search
                of it for our matching pattern.
            --->
            <cfset local.childResults = reStructFindValue(
                local.target,
                arguments.pattern,
                arguments.scope,
                local.path
                ) />

            <!---
                Add the results from our nested search to the
                current results collection.
            --->
            <cfloop
                index="local.childResult"
                array="#local.childResults#">

                <!--- Add this result to the current results. --->
                <cfset arrayAppend( local.results, local.childResult ) />

            </cfloop>

        </cfif>


        <!---
            At the end of a single iteration, let's check to see
            if we were only searching for one target. If we are,
            AND we found it, we can simply return the single
            element rather than continuing on with our recursion.
        --->
        <cfif (
            (arguments.scope eq "one") &&
            arrayLen( local.results )
            )>

            <!---
                We found at least one item - trim the results
                set in case the last iteration found more than
                one.
            --->
            <cfset local.trimmedResults = [ local.results[ 1 ] ] />

            <!--- Return the trimmed result set. --->
            <cfreturn local.trimmedResults />

        </cfif>

    </cf_each>


    <!--- Return the found results. --->
    <cfreturn local.results />
</cffunction>
```

Notice that the UDF above takes four arguments. As I mentioned above, only the first three are meant to be provided by the user. The fourth argument, "Path," is provided by the method itself to keep track of nesting during recursive calls. The regular expression matching is performed by ColdFusion's REMatchNoCase() function. This means that the regular expressions used in this UDF are subject to the limitations of REMatchNoCase() and cannot make use of some advanced pattern constructs.
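For comparison, the "sister function" alternative mentioned earlier might look something like the sketch below. The name reStructSearch() is hypothetical, and the sketch assumes the reStructFindValue() UDF above is already defined in the same scope. It only hides the fourth argument behind a three-argument entry point; it does not stop anyone from calling the recursive function directly.

```cfm
<cffunction
    name="reStructSearch"
    access="public"
    returntype="array"
    output="false"
    hint="HYPOTHETICAL wrapper: I expose only the user-facing arguments.">

    <cfargument name="target" type="any" required="true" />
    <cfargument name="pattern" type="string" required="true" />
    <cfargument name="scope" type="string" required="false" default="one" />

    <!--- Always start the recursion with an empty path. --->
    <cfreturn reStructFindValue(
        arguments.target,
        arguments.pattern,
        arguments.scope,
        ""
        ) />
</cffunction>

<!--- Usage: callers never see the internal "path" argument. --->
<cfset results = reStructSearch( myData, "brunette|brown|black", "all" ) />
```

Inside a component, the recursive worker could additionally be marked access="private" so that only the wrapper is callable from outside, at the cost of maintaining two functions.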

To be honest, I've never actually used the StructFindValue() method, so I am not really sure what the best use cases are; that said, I hope that this UDF might come in handy to those that do use it often.




Reader Comments

Too funny. I just submitted two new functions to cflib.org: REStructFindValue() and REStructFindValueNoCase().

One thing to note, your REStructFindValue() implementation searches both arrays and structures. StructFindValue() will iterate through arrays, but will only return results from structures. Not a big deal, just something to be aware of if you are looking to use this in place of StructFindValue().


nice! Your mad coding over the past two days has inspired me to take a crack at that potentially useful "StructVisitor" implementation I was talking about, as well.

I love this kind of practice coding. Thanks Ben!


@Nathan,

Good times :) I hadn't used StructFindValue() before and to be honest, the explanation of the various struct "find" methods confused me a bit. I just tried to deduce what it was doing by running some tests and dumping out the results. So, more than likely (as you are saying) my functionality is not going to be as parallel with the native one.

@Nathan, @Marc,

Out of curiosity, what are the use cases for these kinds of methods?

@Marc,

Yeah, this stuff is fun. I can't wait to see what you come up with.


@Dan,

Oh man, you used REFind() in your struct search! Sometimes I feel so retarded :) I used REMatch(), which served no purpose (as the matches weren't being gathered), to see if the target value matched the given regular expression. I should have totally used REFind().

Thanks for the wake up call :)


Yeah Ben, it came out fast, especially with LARGE structures, but delving into the CF Java types was certainly a challenge. Also, you can't pass an array to mine, as it only takes a Vector (Struct). Overall a fun exercise.

Thanks for the inspirational idea!


@DeepDown,

I really like the idea of having an Iterator interface that can be used to iterate over just about anything. That's sort of where I was going with my "Each.cfm" custom tag, but your solution allows for much more extension. Very cool!


Well this has just gone a major way to solving a JSON problem with leading zeros on strings that look like numbers.

By using REStructFindValue() to find all the key/value pairs in a structure that have leading zeros, and then looping through the resulting data using a function I found on the Adobe Forums site here: http://forums.adobe.com/message/2101252, I can force all leading-zero data in the structure to remain as leading-zero data in the JSON produced by SerializeJSON. :-)

