Update: Sorting Target XML Node Sets In ColdFusion XML Documents
Ok, so last week, I was having some issues sorting target XML node sets in ColdFusion XML documents. I created a user defined function that would sort nodes given a parent node, but that never sat well with me; unfortunately, I didn't know how to fix it so that a target node set could be given rather than a parent node reference. Then, yesterday, Adam Cameron and Justin Carter suggested that if I use Duplicate() when copying XML node references, I won't create pointer-conflicts.
With this new information in hand, I re-tooled my XmlSort() ColdFusion user defined function to work using the XPath of a target node set rather than a parent node reference:
<!--- Create a ColdFusion XML document. ---> | |
<cfxml variable="xmlData"> | |
<toes> | |
<toe id="1"> | |
<cuteness>5</cuteness> | |
</toe> | |
<toe id="2"> | |
<cuteness>8</cuteness> | |
</toe> | |
<toe id="3"> | |
<cuteness>2</cuteness> | |
</toe> | |
<toe id="4"> | |
<cuteness>8</cuteness> | |
</toe> | |
<toe id="5"> | |
<cuteness>10</cuteness> | |
</toe> | |
</toes> | |
</cfxml> | |
<!--- Create sorting values. ---> | |
<cfset arrSorting = [ | |
"cuteness", | |
"@id" | |
] /> | |
<!--- Sort TOE nodes with cuteness descending. ---> | |
<cfset xmlData = XmlSort2( | |
xmlData, | |
"//toe", | |
arrSorting, | |
"DESC" | |
) /> | |
<!--- Dump out resultant XML. ---> | |
<cfdump | |
var="#xmlData#" | |
label="Xml In Cutness DESC Order." | |
/> |
Notice here that I want to sort the "toe" nodes; so, rather than passing it a "toes" (parent node) XPath, I pass in the reference to the actual target node set:
"//toe"
Then, the sorting paths are relative to each node of the target node set:
<!--- Create sorting values. ---> | |
<cfset arrSorting = [ | |
"cuteness", | |
"@id" | |
] /> |
This makes my very happy since now the target XPath and the sorting XPath are each relative to the target node set and NOT to the parent node. This is the kind of uniformity that I wanted in the original XmlSort() attempt but was not smart enough to figure out.
When I run the above code, I get the following output:

As you can see, the XML nodes are sorted in descending order according to the XmlText of the "cuteness" node. Notice also that in the case where the two toes have the same cuteness, the deciding factor is the second sort option, the ID attribute of the toe node.
Sweet, it worked! Ok, it worked, but the UDF for this is not exactly pretty; it's long and convoluted, but it gets the job done:
<cffunction | |
name="XmlSort2" | |
access="public" | |
returntype="xml" | |
output="true" | |
hint="I sort part of an XML documument based on the given XPath and sort characteristics."> | |
<!--- Define arguments. ---> | |
<cfargument | |
name="Xml" | |
type="any" | |
required="true" | |
hint="I am an XML string or ColdFusion XML document." | |
/> | |
<cfargument | |
name="XPath" | |
type="string" | |
required="true" | |
hint="I am the XPath to the target node set." /> | |
<cfargument | |
name="SortXPath" | |
type="any" | |
required="false" | |
default="text()" | |
hint="I am the XPath value upon which the sort is being conducted. This can be a string or an array (if multiple sorting options are required)." | |
/> | |
<cfargument | |
name="Direction" | |
type="string" | |
required="false" | |
default="ASC" | |
hint="I am the sort direction. ASC or DESC." | |
/> | |
<!--- Define the local scope. ---> | |
<cfset var LOCAL = {} /> | |
<!--- | |
Check to see if our XML document is an Xml document. | |
If is not, then we want to convert it into a true | |
ColdFusion XML document. | |
---> | |
<cfif NOT IsXmlDoc( ARGUMENTS.Xml )> | |
<!--- Convert to an XML document. ---> | |
<cfset ARGUMENTS.Xml = XmlParse( | |
Trim( ARGUMENTS.Xml ) | |
) /> | |
</cfif> | |
<!--- | |
Check to see if the given sorting option is a string or | |
an array. If it's a string, then let's convert it to an | |
array so that we can treat it uniformily later on. | |
---> | |
<cfif IsSimpleValue( ARGUMENTS.SortXPath )> | |
<!--- | |
We need to copy this to get around a bug in the way | |
ColdFusion handles implicit array creation involving | |
its own values. | |
---> | |
<cfset LOCAL.SortCopy = ARGUMENTS.SortXPath /> | |
<!--- Convert simple value to an array. ---> | |
<cfset ARGUMENTS.SortXPath = [ LOCAL.SortCopy ] /> | |
</cfif> | |
<!--- Get the set of target nodes. ---> | |
<cfset LOCAL.TargetNodes = XmlSearch( | |
ARGUMENTS.Xml, | |
ARGUMENTS.XPath | |
) /> | |
<!--- | |
Check to make sure we have target nodes. If not, | |
then just return the original XML. | |
---> | |
<cfif (ArrayLen( LOCAL.TargetNodes ) LT 2)> | |
<cfreturn ARGUMENTS.Xml /> | |
</cfif> | |
<!--- | |
ASSERT: At this point, we know that we have a valid | |
set of target nodes that can be sorted. | |
---> | |
<!--- Perform bubble sort on target nodes. ---> | |
<cfloop | |
index="LOCAL.EndIndex" | |
from="#(ArrayLen( LOCAL.TargetNodes ) - 1)#" | |
to="1" | |
step="-1"> | |
<!--- | |
Loop over nodes starting at 1 and then proceeding | |
until we reach the "end index". This way, we will | |
not duplicate our comparison of the last value. | |
---> | |
<cfloop | |
index="LOCAL.Index" | |
from="1" | |
to="#LOCAL.EndIndex#" | |
step="1"> | |
<!--- Get the two target nodes. ---> | |
<cfset LOCAL.NodeOne = LOCAL.TargetNodes[ LOCAL.Index ] /> | |
<cfset LOCAL.NodeTwo = LOCAL.TargetNodes[ LOCAL.Index + 1 ] /> | |
<!--- | |
Now that we have the two target nodes, we have | |
to sort them based on the array of comparison | |
values. We only need to proceed through the | |
comparisons if the previous comparison is equal. | |
---> | |
<cfloop | |
index="LOCAL.SortXPath" | |
array="#ARGUMENTS.SortXPath#"> | |
<!--- Get the comparison value for given node. ---> | |
<cfset LOCAL.ValueOne = XmlSearch( | |
LOCAL.NodeOne, | |
LOCAL.SortXPath | |
) /> | |
<!--- Get the comparison value for next node. ---> | |
<cfset LOCAL.ValueTwo = XmlSearch( | |
LOCAL.NodeTwo, | |
LOCAL.SortXPath | |
) /> | |
<!--- | |
Now, we have to get our values down to | |
something that is usable. If this is a node, | |
then get the text value. If this is an | |
attribute then get the attribute value. | |
---> | |
<cfif StructKeyExists( | |
LOCAL.ValueOne[ 1 ], | |
"XmlText" | |
)> | |
<!--- Get node text. ---> | |
<cfset LOCAL.ValueOne = LOCAL.ValueOne[ 1 ].XmlText /> | |
<cfset LOCAL.ValueTwo = LOCAL.ValueTwo[ 1 ].XmlText /> | |
<cfelse> | |
<!--- Get attribute value. ---> | |
<cfset LOCAL.ValueOne = LOCAL.ValueOne[ 1 ].XmlValue /> | |
<cfset LOCAL.ValueTwo = LOCAL.ValueTwo[ 1 ].XmlValue /> | |
</cfif> | |
<!--- | |
Check to see if these two values are equal. | |
If they are, then we just want to proceed | |
to the next sorting condition. It is only if | |
they are unequal that we can take action. | |
---> | |
<cfif (LOCAL.ValueOne NEQ LOCAL.ValueTwo)> | |
<!--- | |
Check to see which direction we are | |
sorting in. | |
---> | |
<cfif ( | |
( | |
(ARGUMENTS.Direction EQ "ASC") AND | |
(LOCAL.ValueOne GT LOCAL.ValueTwo) | |
) OR | |
( | |
(ARGUMENTS.Direction EQ "DESC") AND | |
(LOCAL.ValueOne LT LOCAL.ValueTwo) | |
))> | |
<!--- Swap nodes. ---> | |
<cfset ArraySwap( | |
LOCAL.TargetNodes, | |
LOCAL.Index, | |
(LOCAL.Index + 1) | |
) /> | |
</cfif> | |
<!--- | |
Break out of the comparison sub-loop | |
since we found two values which are | |
not equal. | |
---> | |
<cfbreak /> | |
</cfif> | |
</cfloop> | |
</cfloop> | |
<!--- END: Comparison loop. ---> | |
</cfloop> | |
<!--- END: Outer loop. ---> | |
<!--- | |
ASSERT: At this point our disconnected set of target nodes | |
is in the appropriate sort order. | |
---> | |
<!--- | |
Before we do anything, we need to set up a unique ID for | |
each XML node. This way, as we go through and compare | |
them, we can make unique comparisons. This is utilitarian | |
and will have to be removed afterwards. | |
---> | |
<cfloop | |
index="LOCAL.TargetNode" | |
array="#LOCAL.TargetNodes#"> | |
<!--- | |
Set unique ID. Use a function name space to make | |
sure we don't have have and attribute conflicts. | |
---> | |
<cfset LOCAL.TargetNode.XmlAttributes[ "XmlSort:UUID" ] = CreateUUID() /> | |
</cfloop> | |
<!--- | |
Now, it's time to take that set of nodes and apply the | |
order to the XML document. Because we don't know the | |
placement within the existing document, we need to | |
iterate over the parent node and check for matches. | |
---> | |
<cfset LOCAL.ParentNode = XmlSearch( | |
LOCAL.TargetNodes[ 1 ], | |
".." | |
) /> | |
<!--- Get the parent node from the returned node set. ---> | |
<cfset LOCAL.ParentNode = LOCAL.ParentNode[ 1 ] /> | |
<!--- | |
As we replace our children, we need to keep a pointer | |
to the index of the child we are replacing INTO the | |
existing document. As we go along, this will help us | |
to keep track since we don't know what order the matching | |
nodes are going to show up in. | |
---> | |
<cfset LOCAL.ReplaceIndex = 1 /> | |
<!--- Loop over the xml children. ---> | |
<cfloop | |
index="LOCAL.ChildIndex" | |
from="1" | |
to="#ArrayLen( LOCAL.ParentNode.XmlChildren )#" | |
step="1"> | |
<!--- | |
Check to see if this node is one of the nodes in | |
our target node set. | |
---> | |
<cfloop | |
index="LOCAL.TargetNode" | |
array="#LOCAL.TargetNodes#"> | |
<!--- | |
Check to see if this target node is the child | |
node that we are currently examining. When | |
comparing, make sure that the child node actually | |
has the sorting UUID. | |
---> | |
<cfif ( | |
StructKeyExists( LOCAL.ParentNode.XmlChildren[ LOCAL.ChildIndex ].XmlAttributes, "XmlSort:UUID" ) AND | |
(LOCAL.ParentNode.XmlChildren[ LOCAL.ChildIndex ].XmlAttributes[ "XmlSort:UUID" ] EQ LOCAL.TargetNode.XmlAttributes[ "XmlSort:UUID" ]) | |
)> | |
<!--- | |
The current child IS in our target node set. | |
That means that we can replace in one of our | |
target nodes in this child index. Use the | |
current index of replacement to select the | |
target node. | |
Use the post-incrementer to make sure that the | |
replce index is upped after we copy over the | |
taret node. | |
Use Duplicate() so that we don't mess up our | |
node references and accidentally delete one | |
of the nodes from the parent document. | |
---> | |
<cfset LOCAL.ParentNode.XmlChildren[ LOCAL.ChildIndex ] = Duplicate( LOCAL.TargetNodes[ LOCAL.ReplaceIndex++ ] ) /> | |
<!--- Remove the UUID. ---> | |
<cfset StructDelete( | |
LOCAL.ParentNode.XmlChildren[ LOCAL.ChildIndex ].XmlAttributes, | |
"XmlSort:UUID" | |
) /> | |
<!--- | |
We matched a node - no need to keep checking | |
for a match in our taret node set (there | |
should be a one-to-one match). | |
---> | |
<cfbreak /> | |
</cfif> | |
</cfloop> | |
</cfloop> | |
<!--- Return the updated XML document. ---> | |
<cfreturn ARGUMENTS.Xml /> | |
</cffunction> |
It's not pretty, but at least the arguments are much more appropriate.
Want to use code from this post? Check out the license.
Reader Comments
I didn't quite finish reading the new function code. Kudos for making it work without the parent reference. :)
Thought I'd throw in a couple of hints. First if you're not already familiar with it, there's a "quick sort" algorithm that may be faster than the bubble sort here -- there's a quicksort function on cflib with the code for doing that. Second, I don't think you need local.sortcopy near the top, I'm pretty sure you can do that in one line like this:
<cfset ARGUMENTS.SortXPath = [ ARGUMENTS.SortXPath ] />
@Ike,
I'll take a look at the quicker sorting algorithm; bubble sort is the only one I can remember from my Algorithms class 8 years ago :) Plus, it took me 3 mugs of tea just to get through the rest of this method!
Making something an implicit struct (or array) of itself should work, but unfortunately causes a strange bug in ColdFusion:
www.bennadel.com/index.cfm?dax=blog:1237.view
I am not 100% comfortable having to modify the XML nodes during the sorting algorithm, but as I am using a name-space to the method, I am not too worried about it.
What I am happy about in this method is that the target nodes can be mixed in with other nodes. In my previous attempts, the entire childNode set was ordered; in this one, only the target nodes get ordered regardless of where they are.
@Ike,
Just took a look at the Quick Sort algorithm. That's a tough one to visualize.
Yeah, I try not to visualize it. :P Basically, you just need a small piece that can order 2 items (as opposed to the entire set) and then you feed that into the quick sort algorithm and let it do its own thing. Like "Sue and Harry... Sue is taller, she's first". :)
That is a bizarre error... It's supposed to be a "nameless" array until it passes the equal sign, at which point it should reset the pointer, irrespective of what it was before... Oops! Somebody wasn't too careful with their encapsulation.
@Ike,
Yeah, same thing happens if you try to go from simple value to implicit struct as well.
The one thing that I'm not digging about the Quick Sort is that I need two functions just to do the sort plus whatever function is actually tying it all together. While the bubble sort may be slower, the benefit is that it is all contained within the target function.
Now, if ColdFusion had headless (anonymous) functions like Javascript, then we'd be in a different ball game :)
Well you only need a 2nd function if you're planning to use it the way the one on cflib was written. If you're okay with reproducing the quick sort code, then you can just swap out the dependency on the ternary function for having your ternary logic in-line. Yeah, it's liable to be more complex and ultimately it may not matter much given that we're talking about xml nodes and the size of the array may never really get large enough that you'd notice the difference. I just figured I'd throw it out there. :)
@Ike,
Thanks for the reality check :) It's true, we sometimes get caught up (as developers) in the theory of computer science and we forget that our universe of sortable sets will probably not be very large at all.
Still, I appreciate you pointing out the alternate sorting as its good to have in the bag.
Welcome. Yeah, I had put the quicksort function into the onTap framework libraries, so I end up using it as kind of a generic go-to when (not often) I do sorting because my stuff is all DLL-like that way. :)