ColdFusion 10 - XmlSearch() And XmlTransform() Now Support XPath 2.0

Posted February 28, 2012 at 8:57 AM by Ben Nadel

Tags: ColdFusion

In today's world, we don't often work with XML; the majority of data exchange is done using JavaScript Object Notation (JSON). Even APIs that support both XML and JSON seem to be dropping XML support in their roadmap (I know this from personal experience). That said, XML is still a data type that will inevitably be a part of our lives for some time. That's why it's actually kind of exciting that ColdFusion 10 now supports XPath 2.0 in the xmlSearch() and xmlTransform() functions.

NOTE: At the time of this writing, ColdFusion 10 was in public beta.

I don't pretend to be an expert on XPath or XSLT (Extensible Stylesheet Language Transformations); so, rather than try to explain the differences between the versions of XPath, I figured I would just demonstrate some of the functionality that is now available in ColdFusion 10. In the following code, I'm creating a simple XML document and then using xmlSearch() to gather various nodes. I try to explain what's going on in the comments.

  • <!---
  • Create an XML document on which to test new XPath 2.0
  • functionality support.
  • --->
  • <cfxml variable="bookData">
  •  
  • <books>
  • <book id="101" rating="4.5">
  • <title>Muscle: Confessions of an Unlikely Bodybuilder</title>
  • <author>Samuel W. Fussell</author>
  • <published>August 1, 1992</published>
  • <isbn>0380717638</isbn>
  • </book>
  • <book id="201" rating="4">
  • <title>The Fountainhead</title>
  • <author>Ayn Rand</author>
  • <published>November 1, 1994</published>
  • <isbn>0452273331</isbn>
  • </book>
  • <book id="301" rating="4.5">
  • <title>It Was On Fire When I Lay Down On It</title>
  • <author>Robert Fulghum</author>
  • <isbn>0804105820</isbn>
  • </book>
  • </books>
  •  
  • </cfxml>
  •  
  •  
  • <!--- Groovy - now let's execute some XML Path queries. --->
  • <cfscript>
  •  
  •  
  • // Get all of the ratings that are greater than 4.0.
  • results = xmlSearch(
  • bookData,
  • "//book/@rating[ number( . ) > 4.0 ]"
  • );
  •  
  •  
  • // Get the average rating of the reviews.
  • results = xmlSearch(
  • bookData,
  • "avg( //book/@rating )"
  • );
  •  
  •  
  • // Get a compound result of the Title and Author nodes. Notice
  • // that we can now create divergent results in the SAME path.
  • // We don't need to create two completely different paths.
  • results = xmlSearch(
  • bookData,
  • "//book/( title, author )"
  • );
  •  
  •  
  • // Get all of the book's children EXCEPT for the ISBN number.
  • // XPath 2.0 introduces some interesting operators like "except",
  • // "every", "some", etc.
  • results = xmlSearch(
  • bookData,
  • "//book/( * except isbn )"
  • );
  •  
  •  
  • // XPath 2.0 now uses sequences instead of node-sets, which allows
  • // for more interesting data combinations. This only gets the
  • // nodes from one collection that are NOT in the other collection.
  • // We're using inline branching and merging!
  • results = xmlSearch(
  • bookData,
  • "//book/( (title, published) except (isbn, published) )"
  • );
  •  
  •  
  • // Get all of the ISBN numbers that use a 10-digit ISBN. XPath
  • // 2.0 now supports regular expression functions like matches(),
  • // replace(), and tokenize() -- though it is quirky and a
  • // bit limited in patterns.
  • results = xmlSearch(
  • bookData,
  • "//book/isbn[ matches( text(), '^\d{10}$' ) ]"
  • );
  •  
  •  
  • // Iterate over one collection and map it onto the resultant
  • // collection. We can now iterate inline within a path.
  • results = xmlSearch(
  • bookData,
  • "for $b in (//book) return ( $b/published )"
  • );
  •  
  •  
  • // We can now pass params into our xmlSearch() calls. Notice
  • // that the key, "title", is quoted - that is because XPath is
  • // case-sensitive and ColdFusion upper-cases unquoted struct keys.
  • results = xmlSearch(
  • bookData,
  • "//book/title[ . = $title ]",
  • {
  • "title": "The Fountainhead"
  • }
  • );
  •  
  •  
  • // Get the given book, no matter what the casing. FINALLY, we
  • // can do case-insensitive searching in XML :)
  • results = xmlSearch(
  • bookData,
  • "//book[ upper-case( title ) = 'THE FOUNTAINHEAD' ]"
  • );
  •  
  •  
  • // Debug the results.
  • writeDump( results );
  •  
  •  
  • </cfscript>

From what I've read about the functionality in XPath 2.0, the biggest upgrades seem to be the use of sequences over node-sets and the use of inline path branching and logic. At a very practical level, XPath 2.0 simply supports more functions like lower-case() and upper-case() for case-insensitive matching - something many people have asked for in previous versions of ColdFusion.

Oh, and XPath 2.0 now supports Regular Expression matching as well - yeah boyyyyyy!
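To give the other regular expression functions a quick spin, here is a minimal sketch of replace() and tokenize() inside a predicate. The sample document, the hyphenated ISBN, and the variable names are my own assumptions for illustration. Since the XPath 2.0 pattern syntax has no word-boundary construct, splitting on whitespace with tokenize() is one way to approximate a whole-word match:

```cfml
<cfscript>

	// Hypothetical sample document, just for this sketch. The hyphenated
	// ISBN is illustrative only.
	doc = xmlParse(
		"<books>" &
		"<book><title>The Fountainhead</title><isbn>0-452-27333-1</isbn></book>" &
		"<book><title>It Was On Fire When I Lay Down On It</title><isbn>0804105820</isbn></book>" &
		"</books>"
	);

	// replace() strips the dashes before the comparison, so a hyphenated
	// ISBN can still be matched against a plain 10-digit value.
	results = xmlSearch(
		doc,
		"//book[ replace( isbn, '-', '' ) = '0452273331' ]"
	);

	// tokenize() splits the lower-cased title on whitespace; the
	// "some ... satisfies" expression then looks for a whole-word match.
	results = xmlSearch(
		doc,
		"//book[ some $word in tokenize( lower-case( title ), '\s+' ) satisfies $word = 'fire' ]"
	);

	// Debug the results of the last query.
	writeDump( results );

</cfscript>
```

Both paths select book nodes rather than raw strings, so the results should come back the same way as in the earlier node-based queries.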

Well, that's probably about as much excitement as I can squeeze out of searching XML documents in ColdFusion 10. That is, of course, until you realize that ColdFusion 10 can now parse HTML... but more to come on that shortly.


Reader Comments

Feb 28, 2012 at 9:58 AM

@All,

And here's part of why XML is getting more exciting in ColdFusion 10 - we can now "easily" convert dirty HTML into valid XML documents:

http://www.bennadel.com/blog/2341-ColdFusion-10-Parsing-Dirty-HTML-Into-Valid-XML-Documents.htm

Due to the JAR files that now ship with ColdFusion 10 (i.e., TagSoup), we now have built-in Java classes that facilitate this kind of parsing.


Feb 28, 2012 at 9:59 AM

It all looked good until you added the part about regular expression support. That really put it over the edge to greatness!


Feb 28, 2012 at 10:04 AM

@Steve,

Heck yeah! Regular expressions are always groovy :) Unfortunately, it looks like the "\b" word-boundary construct is not supported, which I only realized because it was the first thing I tried. They have a slightly different notation for some things, which I haven't gone through yet.

http://www.w3.org/TR/xmlschema-2/#regexs

But, good to know that it's there.


Feb 28, 2012 at 4:05 PM

That is a disappointing omission. Still, I guess some regular expression support is a major improvement over no regular expression support at all.

