Sorting XML Nodes Using ColdFusion And XSLT

Posted November 24, 2008 at 2:32 PM by Ben Nadel

Tags: ColdFusion

This morning, I helped someone figure out how to sort a node set of a given XML document using some child node attributes. I really liked this problem and wanted to see if I could come up with a ColdFusion user defined function (UDF) to make it a bit more generic. So far, this is proving much harder than I would have hoped. Originally, I would have liked to see a method that took arguments like this:

XmlSort( XmlData, TargetNodeXPath, SortXPath ) :: XML

Here, the "TargetNodeXPath" would be the XPath required to select the set of nodes to be sorted. The problem with this is that once we are in an XSLT template that matches a given node, we lose some of the sibling context. Yes, we can go up and down the relative node chain, but we lose the ability (from what I can figure out so far) to sort the current node in relation to its siblings.
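The constraint can be sketched in plain XSLT 1.0. In that version of the language, <xsl:sort> is only allowed as a child of <xsl:apply-templates> or <xsl:for-each>, so the order has to be chosen at the point where the node set is selected, not inside the template that matches one of its members. The element names below (girls, girl, lastname) are borrowed from the demo document later in this post; the stylesheet is an illustration under those assumptions, not part of the UDF.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy every node and attribute as-is. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()" />
    </xsl:copy>
  </xsl:template>

  <!--
    The order of the girl nodes is decided here, in the parent's
    template, because this is where the node set is selected.
    Only girl children are selected, so any other children of
    girls would be dropped by this sketch.
  -->
  <xsl:template match="girls">
    <xsl:copy>
      <xsl:apply-templates select="@*" />
      <xsl:apply-templates select="girl">
        <xsl:sort select="lastname" data-type="text" order="ascending" />
      </xsl:apply-templates>
    </xsl:copy>
  </xsl:template>

  <!--
    By the time this template runs for a single girl, its position
    among its siblings is already fixed; xsl:sort is not allowed here.
  -->
  <xsl:template match="girl">
    <xsl:copy-of select="." />
  </xsl:template>

</xsl:stylesheet>
```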

As such, the only way I could figure out how to make this generic is to take the XPath to the parent node of the target nodes rather than the XPath to the target nodes themselves. The target nodes are then iterated over using an <xsl:for-each> tag. This further complicates the issue because the "SortXPath" then has to be relative to the context of the <xsl:for-each> tag (each node in the target node set), not the parent node.

It's not pretty, but it's just my first attempt. Here is my ColdFusion user defined function (UDF) meant to encapsulate this logic:

  • <cffunction
  • name="XmlSort"
  • access="public"
  • returntype="string"
  • output="true"
  • hint="I sort part of an XML document based on the given XPath and sort characteristics.">
  •  
  • <!--- Define arguments. --->
  • <cfargument
  • name="Xml"
  • type="any"
  • required="true"
  • hint="I am an XML string or ColdFusion XML document."
  • />
  •  
  • <cfargument
  • name="ParentXPath"
  • type="string"
  • required="true"
  • hint="I am the XPath to the PARENT node of the nodes which are targeted for sorting." />
  •  
  • <cfargument
  • name="SortXPath"
  • type="any"
  • required="false"
  • default="text()"
  • hint="I am the XPath value upon which the sort is being conducted. This can be a string or an array (if multiple sorting options are required)."
  • />
  •  
  • <cfargument
  • name="Direction"
  • type="string"
  • required="false"
  • default="ascending"
  • hint="I am the sort direction."
  • />
  •  
  • <cfargument
  • name="DataType"
  • type="string"
  • required="false"
  • default="text"
  • hint="I am the type of data that is being used in the sort (to help sorting)."
  • />
  •  
  • <!--- Define the local scope. --->
  • <cfset var LOCAL = {} />
  •  
  •  
  • <!---
  • Check to see if the given sorting option is a string or
  • an array. If it's a string, then let's convert it to an
  • array so that we can treat it uniformly later on.
  • --->
  • <cfif IsSimpleValue( ARGUMENTS.SortXPath )>
  •  
  • <!---
  • We need to copy this to get around a bug in the way
  • ColdFusion handles implicit array creation involving
  • its own values.
  • --->
  • <cfset LOCAL.SortCopy = ARGUMENTS.SortXPath />
  •  
  • <!--- Convert simple value to an array. --->
  • <cfset ARGUMENTS.SortXPath = [ LOCAL.SortCopy ] />
  •  
  • </cfif>
  •  
  •  
  • <!--- Define the XSL Transform data. --->
  • <cfxml variable="LOCAL.Transform">
  •  
  • <!--- XML declaration. --->
  • <?xml version="1.0" encoding="ISO-8859-1"?>
  •  
  • <xsl:transform
  • version="1.0"
  • xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  •  
  • <!--- Match all generic nodes. --->
  • <xsl:template match="*">
  • <!--- Copy this node (non-deep copy). --->
  • <xsl:copy>
  • <!---
  • Make sure that all attributes are copied
  • over for the current node.
  • --->
  • <xsl:copy-of select="@*" />
  •  
  • <!---
  • Apply templates to all of its child
  • nodes (so that they can be copied).
  • --->
  • <xsl:apply-templates />
  • </xsl:copy>
  • </xsl:template>
  •  
  •  
  • <!---
  • Match the parent node of the target nodes.
  • From here, we can copy the parent and then
  • control how the child nodes are sorted.
  • --->
  • <xsl:template match="#ARGUMENTS.ParentXPath#">
  •  
  • <!---
  • Copy the current node's top-level values
  • (the tag and its attributes, but not its
  • descendants).
  • --->
  • <xsl:copy>
  •  
  • <!---
  • Make sure that all attributes are copied
  • over for the current node.
  • --->
  • <xsl:copy-of select="@*" />
  •  
  • <!--- Loop over the child nodes of the parent. --->
  • <xsl:for-each select="*">
  •  
  • <!--- Output all sorting options. --->
  • <cfloop
  • index="LOCAL.SortXPath"
  • array="#ARGUMENTS.SortXPath#">
  •  
  • <xsl:sort
  • select="#LOCAL.SortXPath#"
  • data-type="#ARGUMENTS.DataType#"
  • order="#ARGUMENTS.Direction#"
  • />
  •  
  • </cfloop>
  •  
  • <!---
  • Copy the entire node (including its
  • descendants).
  • --->
  • <xsl:copy-of select="." />
  •  
  • </xsl:for-each>
  •  
  • </xsl:copy>
  •  
  • </xsl:template>
  •  
  • </xsl:transform>
  •  
  • </cfxml>
  •  
  •  
  • <!--- Return the transformation. --->
  • <cfreturn XmlTransform(
  • ARGUMENTS.Xml,
  • LOCAL.Transform
  • ) />
  • </cffunction>

You will notice that the "SortXPath" argument can be of type any. This is because it can be a string, for single-value sorting, or it can be an array of strings, for use with multi-value sorting.

To test this ColdFusion UDF, let's run a little demo:

  • <!--- Define the XML data. --->
  • <cfxml variable="xmlData">
  •  
  • <data>
  • <boys />
  • <girls>
  • <girl>
  • <firstname>Courtney</firstname>
  • <lastname>Cox</lastname>
  • </girl>
  • <girl>
  • <firstname>Sharon</firstname>
  • <lastname>Stone</lastname>
  • </girl>
  • <girl>
  • <firstname>Christina</firstname>
  • <lastname>Cox</lastname>
  • </girl>
  • <girl>
  • <firstname>Frances</firstname>
  • <lastname>McDormand</lastname>
  • </girl>
  • </girls>
  • </data>
  •  
  • </cfxml>
  •  
  •  
  • <!--- Sort the XML. --->
  • <cfset xmlData = XmlSort(
  • xmlData,
  • "//girls",
  • "lastname/text()"
  • ) />
  •  
  • <!--- Output the transformation. --->
  • #HTMLEditFormat( xmlData )#

When we run this, we get the following output:

  • <?xml version="1.0" encoding="UTF-8"?>
  • <data>
  • <boys/>
  • <girls>
  • <girl>
  • <firstname>Courtney</firstname>
  • <lastname>Cox</lastname>
  • </girl>
  • <girl>
  • <firstname>Christina</firstname>
  • <lastname>Cox</lastname>
  • </girl>
  • <girl>
  • <firstname>Frances</firstname>
  • <lastname>McDormand</lastname>
  • </girl>
  • <girl>
  • <firstname>Sharon</firstname>
  • <lastname>Stone</lastname>
  • </girl>
  • </girls>
  • </data>

Notice that the girl nodes have sorted properly on last name; however, the two Cox girls are not in ascending order by first name. To account for this, we can pass in more than one sort option. Rather than passing in a single string, we will now pass in an array of XPath sort selects:

  • <!--- Create a list of sorting options. --->
  • <cfset arrSorting = [
  • "lastname/text()",
  • "firstname/text()"
  • ] />
  •  
  • <!--- Sort the XML. --->
  • <cfset xmlData = XmlSort(
  • xmlData,
  • "//girls",
  • arrSorting
  • ) />
  •  
  • <!--- Output the transformation. --->
  • #HTMLEditFormat( xmlData )#

Here, we are asking the sort to be done by lastname and then firstname. When we run this, we get the following output:

  • <?xml version="1.0" encoding="UTF-8"?>
  • <data>
  • <boys/>
  • <girls>
  • <girl>
  • <firstname>Christina</firstname>
  • <lastname>Cox</lastname>
  • </girl>
  • <girl>
  • <firstname>Courtney</firstname>
  • <lastname>Cox</lastname>
  • </girl>
  • <girl>
  • <firstname>Frances</firstname>
  • <lastname>McDormand</lastname>
  • </girl>
  • <girl>
  • <firstname>Sharon</firstname>
  • <lastname>Stone</lastname>
  • </girl>
  • </girls>
  • </data>

As you can see now, the Cox girls have been sorted appropriately by both their last and first names.
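For reference, here is roughly what the <cfxml> block should expand to for this two-key call once ColdFusion has interpolated the arguments. This is a hand-expanded sketch, assuming ParentXPath is "//girls", the default Direction of "ascending", and a text sort; it is not output captured from the server.

```xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:transform version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Generic copy: the element, its attributes, then its children. -->
  <xsl:template match="*">
    <xsl:copy>
      <xsl:copy-of select="@*" />
      <xsl:apply-templates />
    </xsl:copy>
  </xsl:template>

  <!-- ParentXPath, assumed here to be "//girls". -->
  <xsl:template match="//girls">
    <xsl:copy>
      <xsl:copy-of select="@*" />
      <xsl:for-each select="*">
        <!--
          One xsl:sort per entry in the SortXPath array, in array
          order. The first is the primary key; later ones only break
          ties. Each select is evaluated with the child node (girl)
          as context, not the parent (girls).
        -->
        <xsl:sort select="lastname/text()" data-type="text" order="ascending" />
        <xsl:sort select="firstname/text()" data-type="text" order="ascending" />
        <xsl:copy-of select="." />
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>

</xsl:transform>
```

Because every <xsl:sort> is written with the same Direction value, all keys sort in the same direction; mixing ascending and descending keys would need a per-key direction, which the UDF as written does not offer.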

I am not happy with this solution, as I think the two XPath values for parent node and sort select feel unnatural and disjointed. I would like to figure out a way to take the XPath for the target node set and then a sorting XPath that is relative to each node in that set. Hopefully more to come on this unless I am totally stumped.




Reader Comments

ike
Nov 24, 2008 at 3:43 PM

Oh siblings... I believe there is actually a next-sibling:: and previous-sibling:: in XPath (again, I haven't used them at all), although it's kind of odd that there would be because the specs for XML and XSL say that node sets are naturally unordered, so there wouldn't be any "next" or "previous" by default. But then I find XML philosophy in general somewhat flakey, like the fact that the introduction of XSD eliminated the ability to do certain things that could be done with DTDs (that are actually useful to do, so, unlike the way Java eliminated incrementing memory addresses with ++). Or that XML is supposed to be case sensitive (stupid, stupid, stupid) which kind of contradicts the philosophical objectives of XML.


ike
Nov 24, 2008 at 3:44 PM

And I don't think siblings will help you with sorting at all... which honestly I think should be the case.


ike
Nov 24, 2008 at 3:51 PM

You left out the <xsl:copy-of select="@*" /> inside your <xsl:template match="*"> -- otherwise you're gonna drop all the attributes in the packet.


Ben Nadel
Nov 25, 2008 at 10:26 AM

@Ike,

Thank you very much for that tip! <xsl:copy-of> will automatically copy all of the attributes, but <xsl:copy> does not. I added the additional <xsl:copy-of select="@*" /> to copy all attributes inside of my two <xsl:copy> tags.

Rock on man :)


ike
Nov 25, 2008 at 2:09 PM

You're very welcome. I have a lot of sheets that need to copy everything with discrete changes, so I use that technique a lot. :)


ike
Nov 25, 2008 at 2:16 PM

Specifically the difference between xsl:copy and xsl:copy-of is that xsl:copy-of means "and all subnodes exactly as they are now".

So copy-of is an easy way of targeting and pushing forward things you know aren't going to change. Although in the case of attributes, you can override copied attributes by following the xsl:copy-of with an xsl:attribute that overwrites one that was copied. And the select attribute lets you target whatever you want... So say you wanted to just spit out all the comments in a document, you could just have <xsl:template match="/"><xsl:copy-of select="//comment()" /></xsl:template> and you'd get all your comments spit out.

xsl:copy on the other hand doesn't have the select attribute, so you're limited to copying just the current node. The idea between the two is really that xsl:copy is specifically designed to allow you to modify the attributes and contents of the tag being copied and that's why it doesn't let you select multiple nodes to copy at one time like copy-of does.


Ben Nadel
Nov 28, 2008 at 5:25 PM

@Ike,

Mmmmm, the XSLT is strong with you :) Ike, that's a great explanation. When I was learning about this stuff, I definitely thought it odd that they had two different types of copy. But, now, that makes perfect sense.

