Ben Nadel

Sorting XML Nodes Using ColdFusion And XSLT

By Ben Nadel on
Tags: ColdFusion

This morning, I helped someone figure out how to sort a node set of a given XML document using some child node attributes. I really liked this problem and wanted to see if I could come up with a ColdFusion user defined function (UDF) to make it a bit more generic. So far, this is proving much harder than I would have hoped. Originally, I would have liked to see a method that took arguments like this:

XmlSort( XmlData, TargetNodeXPath, SortXPath ) :: XML

Here, the "TargetNodeXPath" would be the XPath required to select the set of nodes to be sorted. The problem with this is that once we are in an XSLT template that matches a given node, we lose some of the sibling context. Yes, we can go up and down the relative node chain, but we lose the ability (from what I can figure out so far) to sort the current node in relation to its siblings.
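A small sketch may help show where that constraint comes from. In XSLT 1.0, <xsl:sort> is only allowed as a child of <xsl:apply-templates> or <xsl:for-each>, so the order is decided by the instruction that selects the nodes, not by the template that matches each one. The element and attribute names below (list, item, rank) are hypothetical and used only for illustration:

```xml
<xsl:transform
	version="1.0"
	xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

	<!--
		This template only ever sees one "item" at a time. It can
		copy or change that item, but it has no way to reorder the
		item among its siblings.
	-->
	<xsl:template match="item">
		<xsl:copy-of select="." />
	</xsl:template>

	<!--
		The ordering has to be requested here, in the template for
		the (hypothetical) parent "list" node, at the point where
		the item nodes are selected.
	-->
	<xsl:template match="list">
		<xsl:copy>
			<xsl:apply-templates select="item">
				<xsl:sort select="@rank" data-type="number" order="ascending" />
			</xsl:apply-templates>
		</xsl:copy>
	</xsl:template>

</xsl:transform>
```

This is why a sort is easier to express from the parent of the target nodes than from the target nodes themselves.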

As such, the only way I could figure out how to make this generic is to take the XPath to the parent node of the target nodes rather than the XPath to the target nodes themselves. The target nodes are then iterated over using an <xsl:for-each> tag. This further complicates the issue because the "SortXPath" then has to be relative to the context of the <xsl:for-each> tag (each node in the target node set), not the parent node.

It's not pretty, but it's just my first attempt. Here is my ColdFusion user defined function (UDF) meant to encapsulate this logic:

  • <cffunction
  • name="XmlSort"
  • access="public"
  • returntype="string"
  • output="true"
  • hint="I sort part of an XML document based on the given XPath and sort characteristics.">
  •  
  • <!--- Define arguments. --->
  • <cfargument
  • name="Xml"
  • type="any"
  • required="true"
  • hint="I am an XML string or ColdFusion XML document."
  • />
  •  
  • <cfargument
  • name="ParentXPath"
  • type="string"
  • required="true"
  • hint="I am the XPath to the PARENT node of the nodes which are targeted for sorting." />
  •  
  • <cfargument
  • name="SortXPath"
  • type="any"
  • required="false"
  • default="text()"
  • hint="I am the XPath value upon which the sort is being conducted. This can be a string or an array (if multiple sorting options are required)."
  • />
  •  
  • <cfargument
  • name="Direction"
  • type="string"
  • required="false"
  • default="ascending"
  • hint="I am the sort direction."
  • />
  •  
  • <cfargument
  • name="DataType"
  • type="string"
  • required="false"
  • default="text"
  • hint="I am the type of data that is being used in the sort (to help sorting)."
  • />
  •  
  • <!--- Define the local scope. --->
  • <cfset var LOCAL = {} />
  •  
  •  
  • <!---
  • Check to see if the given sorting option is a string or
  • an array. If it's a string, then let's convert it to an
  • array so that we can treat it uniformly later on.
  • --->
  • <cfif IsSimpleValue( ARGUMENTS.SortXPath )>
  •  
  • <!---
  • We need to copy this to get around a bug in the way
  • ColdFusion handles implicit array creation involving
  • its own values.
  • --->
  • <cfset LOCAL.SortCopy = ARGUMENTS.SortXPath />
  •  
  • <!--- Convert simple value to an array. --->
  • <cfset ARGUMENTS.SortXPath = [ LOCAL.SortCopy ] />
  •  
  • </cfif>
  •  
  •  
  • <!--- Define the XSL Transform data. --->
  • <cfxml variable="LOCAL.Transform">
  •  
  • <!--- XML declaration. --->
  • <?xml version="1.0" encoding="ISO-8859-1"?>
  •  
  • <xsl:transform
  • version="1.0"
  • xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  •  
  • <!--- Match all generic nodes. --->
  • <xsl:template match="*">
  • <!--- Copy this node (non-deep copy). --->
  • <xsl:copy>
  • <!---
  • Make sure that all attributes are copied
  • over for the current node.
  • --->
  • <xsl:copy-of select="@*" />
  •  
  • <!---
  • Apply templates to all of its child
  • nodes (so that they can be copied).
  • --->
  • <xsl:apply-templates />
  • </xsl:copy>
  • </xsl:template>
  •  
  •  
  • <!---
  • Match the parent node of the target nodes.
  • From here, we can copy the parent and then
  • control how the child nodes are sorted.
  • --->
  • <xsl:template match="#ARGUMENTS.ParentXPath#">
  •  
  • <!---
  • Copy the current node's top-level values
  • (the tag and its attributes, but not its
  • descendants).
  • --->
  • <xsl:copy>
  •  
  • <!---
  • Make sure that all attributes are copied
  • over for the current node.
  • --->
  • <xsl:copy-of select="@*" />
  •  
  • <!--- Loop over the child nodes of the parent. --->
  • <xsl:for-each select="*">
  •  
  • <!--- Output all sorting options. --->
  • <cfloop
  • index="LOCAL.SortXPath"
  • array="#ARGUMENTS.SortXPath#">
  •  
  • <xsl:sort
  • select="#LOCAL.SortXPath#"
  • data-type="#ARGUMENTS.DataType#"
  • order="#ARGUMENTS.Direction#"
  • />
  •  
  • </cfloop>
  •  
  • <!---
  • Copy the entire node (including its
  • descendants).
  • --->
  • <xsl:copy-of select="." />
  •  
  • </xsl:for-each>
  •  
  • </xsl:copy>
  •  
  • </xsl:template>
  •  
  • </xsl:transform>
  •  
  • </cfxml>
  •  
  •  
  • <!--- Return the transformation. --->
  • <cfreturn XmlTransform(
  • ARGUMENTS.Xml,
  • LOCAL.Transform
  • ) />
  • </cffunction>

You will notice that the "SortXPath" argument can be of type any. This is because it can be a string, for single-value sorting, or it can be an array of strings, for use with multi-value sorting.

To test this ColdFusion UDF, let's run a little demo:

  • <!--- Define the XML data. --->
  • <cfxml variable="xmlData">
  •  
  • <data>
  • <boys />
  • <girls>
  • <girl>
  • <firstname>Courtney</firstname>
  • <lastname>Cox</lastname>
  • </girl>
  • <girl>
  • <firstname>Sharon</firstname>
  • <lastname>Stone</lastname>
  • </girl>
  • <girl>
  • <firstname>Christina</firstname>
  • <lastname>Cox</lastname>
  • </girl>
  • <girl>
  • <firstname>Frances</firstname>
  • <lastname>McDormand</lastname>
  • </girl>
  • </girls>
  • </data>
  •  
  • </cfxml>
  •  
  •  
  • <!--- Sort the XML. --->
  • <cfset xmlData = XmlSort(
  • xmlData,
  • "//girls",
  • "lastname/text()"
  • ) />
  •  
  • <!--- Output the transformation. --->
  • <cfoutput>#HTMLEditFormat( xmlData )#</cfoutput>

When we run this, we get the following output:

  • <?xml version="1.0" encoding="UTF-8"?>
  • <data>
  • <boys/>
  • <girls>
  • <girl>
  • <firstname>Courtney</firstname>
  • <lastname>Cox</lastname>
  • </girl>
  • <girl>
  • <firstname>Christina</firstname>
  • <lastname>Cox</lastname>
  • </girl>
  • <girl>
  • <firstname>Frances</firstname>
  • <lastname>McDormand</lastname>
  • </girl>
  • <girl>
  • <firstname>Sharon</firstname>
  • <lastname>Stone</lastname>
  • </girl>
  • </girls>
  • </data>

Notice that the girl nodes have sorted properly on last name; however, the two Cox girls are not in ascending order by first name. To accommodate this, we can pass in more than one sort option. Rather than passing in a single string, we will now pass in an array of XPath sort selects:

  • <!--- Create a list of sorting options. --->
  • <cfset arrSorting = [
  • "lastname/text()",
  • "firstname/text()"
  • ] />
  •  
  • <!--- Sort the XML. --->
  • <cfset xmlData = XmlSort(
  • xmlData,
  • "//girls",
  • arrSorting
  • ) />
  •  
  • <!--- Output the transformation. --->
  • <cfoutput>#HTMLEditFormat( xmlData )#</cfoutput>

Here, we are asking the sort to be done by lastname and then firstname. When we run this, we get the following output:

  • <?xml version="1.0" encoding="UTF-8"?>
  • <data>
  • <boys/>
  • <girls>
  • <girl>
  • <firstname>Christina</firstname>
  • <lastname>Cox</lastname>
  • </girl>
  • <girl>
  • <firstname>Courtney</firstname>
  • <lastname>Cox</lastname>
  • </girl>
  • <girl>
  • <firstname>Frances</firstname>
  • <lastname>McDormand</lastname>
  • </girl>
  • <girl>
  • <firstname>Sharon</firstname>
  • <lastname>Stone</lastname>
  • </girl>
  • </girls>
  • </data>

As you can see now, the Cox girls have been sorted appropriately by both their last and first names.
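For reference, the array form works because the UDF's <cfloop> emits one <xsl:sort> per array entry, in array order, and XSLT treats the first <xsl:sort> as the primary key and each later one as a tie-breaker. The following is a sketch of roughly what the generated stylesheet looks like for this call, assuming the default "ascending" direction and a data-type of "text" (comments and the XML declaration are left out for brevity):

```xml
<xsl:transform
	version="1.0"
	xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

	<!-- Copy every element that is not matched more specifically. -->
	<xsl:template match="*">
		<xsl:copy>
			<xsl:copy-of select="@*" />
			<xsl:apply-templates />
		</xsl:copy>
	</xsl:template>

	<!-- ParentXPath "//girls": the parent of the nodes being sorted. -->
	<xsl:template match="//girls">
		<xsl:copy>
			<xsl:copy-of select="@*" />

			<!-- Primary key is lastname; firstname only breaks ties. -->
			<xsl:for-each select="*">
				<xsl:sort select="lastname/text()" data-type="text" order="ascending" />
				<xsl:sort select="firstname/text()" data-type="text" order="ascending" />
				<xsl:copy-of select="." />
			</xsl:for-each>
		</xsl:copy>
	</xsl:template>

</xsl:transform>
```

Swapping the order of the two array entries would swap the two <xsl:sort> lines and make first name the primary key instead.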

I am not happy with this solution, as I think the two XPath values for the parent node and the sort select feel unnatural and disjointed. I would like to figure out a way to accept the XPath to the target node set and then a sorting XPath that is relative to each node in that set. Hopefully more to come on this unless I am totally stumped.




Reader Comments

Oh siblings... I believe there are actually following-sibling:: and preceding-sibling:: axes in XPath (again, I haven't used them at all), although it's kind of odd that there would be because the specs for XML and XSL say that node sets are naturally unordered, so there wouldn't be any "next" or "previous" by default. But then I find XML philosophy in general somewhat flaky, like the fact that the introduction of XSD eliminated the ability to do certain things that could be done with DTDs (things that are actually useful to do, unlike the way Java eliminated incrementing memory addresses with ++). Or that XML is supposed to be case sensitive (stupid, stupid, stupid), which kind of contradicts the philosophical objectives of XML.

And I don't think siblings will help you with sorting at all... which honestly I think should be the case.

You left out the <xsl:copy-of select="@*" /> inside your <xsl:template match="*"> -- otherwise you're gonna drop all the attributes in the packet.

@Ike,

Thank you very much for that tip! <xsl:copy-of> will automatically copy all of the attributes, but <xsl:copy> does not. I added the additional <xsl:copy-of select="@*" /> to copy all attributes inside of my two <xsl:copy> tags.

Rock on man :)

You're very welcome. I have a lot of sheets that need to copy everything with discrete changes, so I use that technique a lot. :)

Specifically the difference between xsl:copy and xsl:copy-of is that xsl:copy-of means "and all subnodes exactly as they are now".

So copy-of is an easy way of targeting and pushing forward things you know aren't going to change. Although in the case of attributes, you can override copied attributes by following the xsl:copy-of with an xsl:attribute that overwrites one that was copied. And the select attribute lets you target whatever you want... So say you wanted to just spit out all the comments in a document, you could just have <xsl:template match="/"><xsl:copy-of select="//comment()" /></xsl:template> and you'd get all your comments spit out.

xsl:copy, on the other hand, doesn't have the select attribute, so you're limited to copying just the current node. The idea behind the two is really that xsl:copy is specifically designed to allow you to modify the attributes and contents of the tag being copied, and that's why it doesn't let you select multiple nodes to copy at one time like copy-of does.

@Ike,

Mmmmm, the XSLT is strong with you :) Ike, that's a great explanation. When I was learning about this stuff, I definitely thought it odd that they had two different types of copy. But, now, that makes perfect sense.