Skip to main content
Ben Nadel
On User Experience (UX) Design, JavaScript, ColdFusion, Node.js, Life, and Love.

Update: Sorting Target XML Node Sets In ColdFusion XML Documents

By Ben Nadel on
Tags: ColdFusion

Ok, so last week, I was having some issues sorting target XML node sets in ColdFusion XML documents. I created a user defined function that would sort nodes given a parent node, but that never sat well with me; unfortunately, I didn't know how to fix it so that a target node set could be given rather than a parent node reference. Then, yesterday, Adam Cameron and Justin Carter suggested that if I use Duplicate() when copying XML node references, I won't create pointer-conflicts.

With this new information in hand, I re-tooled my XmlSort() ColdFusion user defined function to work using the XPath of a target node set rather than a parent node reference:

<!--- Create a ColdFusion XML document. --->
<cfxml variable="xmlData">

	<toes>
		<toe id="1">
			<cuteness>5</cuteness>
		</toe>
		<toe id="2">
			<cuteness>8</cuteness>
		</toe>
		<toe id="3">
			<cuteness>2</cuteness>
		</toe>
		<toe id="4">
			<cuteness>8</cuteness>
		</toe>
		<toe id="5">
			<cuteness>10</cuteness>
		</toe>
	</toes>

</cfxml>


<!--- Create sorting values. --->
<cfset arrSorting = [
	"cuteness",
	"@id"
	] />

<!--- Sort TOE nodes with cuteness descending. --->
<cfset xmlData = XmlSort2(
	xmlData,
	"//toe",
	arrSorting,
	"DESC"
	) />

<!--- Dump out resultant XML. --->
<cfdump
	var="#xmlData#"
	label="Xml In Cutness DESC Order."
	/>

Notice here that I want to sort the "toe" nodes; so, rather than passing it a "toes" (parent node) XPath, I pass in the reference to the actual target node set:

"//toe"

Then, the sorting paths are relative to each node of the target node set:

<!--- Create sorting values. --->
<cfset arrSorting = [
	"cuteness",
	"@id"
	] />

This makes my very happy since now the target XPath and the sorting XPath are each relative to the target node set and NOT to the parent node. This is the kind of uniformity that I wanted in the original XmlSort() attempt but was not smart enough to figure out.

When I run the above code, I get the following output:

XML Node Set Sorted Using Target Node XPath And Relative XPath Sorting.

As you can see, the XML nodes are sorted in descending order according to the XmlText of the "cuteness" node. Notice also that in the case where the two toes have the same cuteness, the deciding factor is the second sort option, the ID attribute of the toe node.

Sweet, it worked! Ok, it worked, but the UDF for this is not exactly pretty; it's long and convoluted, but it gets the job done:

<cffunction
	name="XmlSort2"
	access="public"
	returntype="xml"
	output="true"
	hint="I sort part of an XML documument based on the given XPath and sort characteristics.">

	<!--- Define arguments. --->
	<cfargument
		name="Xml"
		type="any"
		required="true"
		hint="I am an XML string or ColdFusion XML document."
		/>

	<cfargument
		name="XPath"
		type="string"
		required="true"
		hint="I am the XPath to the target node set." />

	<cfargument
		name="SortXPath"
		type="any"
		required="false"
		default="text()"
		hint="I am the XPath value upon which the sort is being conducted. This can be a string or an array (if multiple sorting options are required)."
		/>

	<cfargument
		name="Direction"
		type="string"
		required="false"
		default="ASC"
		hint="I am the sort direction. ASC or DESC."
		/>

	<!--- Define the local scope. --->
	<cfset var LOCAL = {} />


	<!---
		Check to see if our XML document is an Xml document.
		If is not, then we want to convert it into a true
		ColdFusion XML document.
	--->
	<cfif NOT IsXmlDoc( ARGUMENTS.Xml )>

		<!--- Convert to an XML document. --->
		<cfset ARGUMENTS.Xml = XmlParse(
			Trim( ARGUMENTS.Xml )
			) />

	</cfif>


	<!---
		Check to see if the given sorting option is a string or
		an array. If it's a string, then let's convert it to an
		array so that we can treat it uniformily later on.
	--->
	<cfif IsSimpleValue( ARGUMENTS.SortXPath )>

		<!---
			We need to copy this to get around a bug in the way
			ColdFusion handles implicit array creation involving
			its own values.
		--->
		<cfset LOCAL.SortCopy = ARGUMENTS.SortXPath />

		<!--- Convert simple value to an array. --->
		<cfset ARGUMENTS.SortXPath = [ LOCAL.SortCopy ] />

	</cfif>


	<!--- Get the set of target nodes. --->
	<cfset LOCAL.TargetNodes = XmlSearch(
		ARGUMENTS.Xml,
		ARGUMENTS.XPath
		) />


	<!---
		Check to make sure we have target nodes. If not,
		then just return the original XML.
	--->
	<cfif (ArrayLen( LOCAL.TargetNodes ) LT 2)>

		<cfreturn ARGUMENTS.Xml />

	</cfif>


	<!---
		ASSERT: At this point, we know that we have a valid
		set of target nodes that can be sorted.
	--->


	<!--- Perform bubble sort on target nodes. --->
	<cfloop
		index="LOCAL.EndIndex"
		from="#(ArrayLen( LOCAL.TargetNodes ) - 1)#"
		to="1"
		step="-1">

		<!---
			Loop over nodes starting at 1 and then proceeding
			until we reach the "end index". This way, we will
			not duplicate our comparison of the last value.
		--->
		<cfloop
			index="LOCAL.Index"
			from="1"
			to="#LOCAL.EndIndex#"
			step="1">

			<!--- Get the two target nodes. --->
			<cfset LOCAL.NodeOne = LOCAL.TargetNodes[ LOCAL.Index ] />
			<cfset LOCAL.NodeTwo = LOCAL.TargetNodes[ LOCAL.Index + 1 ] />


			<!---
				Now that we have the two target nodes, we have
				to sort them based on the array of comparison
				values. We only need to proceed through the
				comparisons if the previous comparison is equal.
			--->
			<cfloop
				index="LOCAL.SortXPath"
				array="#ARGUMENTS.SortXPath#">

				<!--- Get the comparison value for given node. --->
				<cfset LOCAL.ValueOne = XmlSearch(
					LOCAL.NodeOne,
					LOCAL.SortXPath
					) />

				<!--- Get the comparison value for next node. --->
				<cfset LOCAL.ValueTwo = XmlSearch(
					LOCAL.NodeTwo,
					LOCAL.SortXPath
					) />


				<!---
					Now, we have to get our values down to
					something that is usable. If this is a node,
					then get the text value. If this is an
					attribute then get the attribute value.
				--->
				<cfif StructKeyExists(
					LOCAL.ValueOne[ 1 ],
					"XmlText"
					)>

					<!--- Get node text. --->
					<cfset LOCAL.ValueOne = LOCAL.ValueOne[ 1 ].XmlText />
					<cfset LOCAL.ValueTwo = LOCAL.ValueTwo[ 1 ].XmlText />

				<cfelse>

					<!--- Get attribute value. --->
					<cfset LOCAL.ValueOne = LOCAL.ValueOne[ 1 ].XmlValue />
					<cfset LOCAL.ValueTwo = LOCAL.ValueTwo[ 1 ].XmlValue />

				</cfif>


				<!---
					Check to see if these two values are equal.
					If they are, then we just want to proceed
					to the next sorting condition. It is only if
					they are unequal that we can take action.
				--->
				<cfif (LOCAL.ValueOne NEQ LOCAL.ValueTwo)>

					<!---
						Check to see which direction we are
						sorting in.
					--->
					<cfif (
						(
							(ARGUMENTS.Direction EQ "ASC") AND
							(LOCAL.ValueOne GT LOCAL.ValueTwo)
						) OR
						(
							(ARGUMENTS.Direction EQ "DESC") AND
							(LOCAL.ValueOne LT LOCAL.ValueTwo)
						))>

						<!--- Swap nodes. --->
						<cfset ArraySwap(
							LOCAL.TargetNodes,
							LOCAL.Index,
							(LOCAL.Index + 1)
							) />

					</cfif>


					<!---
						Break out of the comparison sub-loop
						since we found two values which are
						not equal.
					--->
					<cfbreak />

				</cfif>

			</cfloop>


		</cfloop>
		<!--- END: Comparison loop. --->

	</cfloop>
	<!--- END: Outer loop. --->


	<!---
		ASSERT: At this point our disconnected set of target nodes
		is in the appropriate sort order.
	--->


	<!---
		Before we do anything, we need to set up a unique ID for
		each XML node. This way, as we go through and compare
		them, we can make unique comparisons. This is utilitarian
		and will have to be removed afterwards.
	--->
	<cfloop
		index="LOCAL.TargetNode"
		array="#LOCAL.TargetNodes#">

		<!---
			Set unique ID. Use a function name space to make
			sure we don't have have and attribute conflicts.
		--->
		<cfset LOCAL.TargetNode.XmlAttributes[ "XmlSort:UUID" ] = CreateUUID() />

	</cfloop>


	<!---
		Now, it's time to take that set of nodes and apply the
		order to the XML document. Because we don't know the
		placement within the existing document, we need to
		iterate over the parent node and check for matches.
	--->
	<cfset LOCAL.ParentNode = XmlSearch(
		LOCAL.TargetNodes[ 1 ],
		".."
		) />

	<!--- Get the parent node from the returned node set. --->
	<cfset LOCAL.ParentNode = LOCAL.ParentNode[ 1 ] />

	<!---
		As we replace our children, we need to keep a pointer
		to the index of the child we are replacing INTO the
		existing document. As we go along, this will help us
		to keep track since we don't know what order the matching
		nodes are going to show up in.
	--->
	<cfset LOCAL.ReplaceIndex = 1 />


	<!--- Loop over the xml children. --->
	<cfloop
		index="LOCAL.ChildIndex"
		from="1"
		to="#ArrayLen( LOCAL.ParentNode.XmlChildren )#"
		step="1">

		<!---
			Check to see if this node is one of the nodes in
			our target node set.
		--->
		<cfloop
			index="LOCAL.TargetNode"
			array="#LOCAL.TargetNodes#">


			<!---
				Check to see if this target node is the child
				node that we are currently examining. When
				comparing, make sure that the child node actually
				has the sorting UUID.
			--->
			<cfif (
				StructKeyExists( LOCAL.ParentNode.XmlChildren[ LOCAL.ChildIndex ].XmlAttributes, "XmlSort:UUID" ) AND
				(LOCAL.ParentNode.XmlChildren[ LOCAL.ChildIndex ].XmlAttributes[ "XmlSort:UUID" ] EQ LOCAL.TargetNode.XmlAttributes[ "XmlSort:UUID" ])
				)>

				<!---
					The current child IS in our target node set.
					That means that we can replace in one of our
					target nodes in this child index. Use the
					current index of replacement to select the
					target node.

					Use the post-incrementer to make sure that the
					replce index is upped after we copy over the
					taret node.

					Use Duplicate() so that we don't mess up our
					node references and accidentally delete one
					of the nodes from the parent document.
				--->
				<cfset LOCAL.ParentNode.XmlChildren[ LOCAL.ChildIndex ] = Duplicate( LOCAL.TargetNodes[ LOCAL.ReplaceIndex++ ] ) />

				<!--- Remove the UUID. --->
				<cfset StructDelete(
					LOCAL.ParentNode.XmlChildren[ LOCAL.ChildIndex ].XmlAttributes,
					"XmlSort:UUID"
					) />

				<!---
					We matched a node - no need to keep checking
					for a match in our taret node set (there
					should be a one-to-one match).
				--->
				<cfbreak />

			</cfif>

		</cfloop>

	</cfloop>


	<!--- Return the updated XML document. --->
	<cfreturn ARGUMENTS.Xml />
</cffunction>

It's not pretty, but at least the arguments are much more appropriate.



Reader Comments

I didn't quite finish reading the new function code. Kudos for making it work without the parent reference. :)

Thought I'd throw in a couple of hints. First if you're not already familiar with it, there's a "quick sort" algorithm that may be faster than the bubble sort here -- there's a quicksort function on cflib with the code for doing that. Second, I don't think you need local.sortcopy near the top, I'm pretty sure you can do that in one line like this:

<cfset ARGUMENTS.SortXPath = [ ARGUMENTS.SortXPath ] />

@Ike,

I'll take a look at the quicker sorting algorithm; bubble sort is the only one I can remember from my Algorithms class 8 years ago :) Plus, it took me 3 mugs of tea just to get through the rest of this method!

Making something an implicit struct (or array) of itself should work, but unfortunately causes a strange bug in ColdFusion:

www.bennadel.com/index.cfm?dax=blog:1237.view

I am not 100% comfortable having to modify the XML nodes during the sorting algorithm, but as I am using a name-space to the method, I am not too worried about it.

What I am happy about in this method is that the target nodes can be mixed in with other nodes. In my previous attempts, the entire childNode set was ordered; in this one, only the target nodes get ordered regardless of where they are.

Yeah, I try not to visualize it. :P Basically, you just need a small piece that can order 2 items (as opposed to the entire set) and then you feed that into the quick sort algorithm and let it do its own thing. Like "Sue and Harry... Sue is taller, she's first". :)

That is a bizarre error... It's supposed to be a "nameless" array until it passes the equal sign, at which point it should reset the pointer, irrespective of what it was before... Oops! Somebody wasn't too careful with their encapsulation.

@Ike,

Yeah, same thing happens if you try to go from simple value to implicit struct as well.

The one thing that I'm not digging about the Quick Sort is that I need two functions just to do the sort plus whatever function is actually tying it all together. While the bubble sort may be slower, the benefit is that it is all contained within the target function.

Now, if ColdFusion had headless (anonymous) functions like Javascript, then we'd be in a different ball game :)

Well you only need a 2nd function if you're planning to use it the way the one on cflib was written. If you're okay with reproducing the quick sort code, then you can just swap out the dependency on the ternary function for having your ternary logic in-line. Yeah, it's liable to be more complex and ultimately it may not matter much given that we're talking about xml nodes and the size of the array may never really get large enough that you'd notice the difference. I just figured I'd throw it out there. :)

@Ike,

Thanks for the reality check :) It's true, we sometimes get caught up (as developers) in the theory of computer science and we forget that our universe of sortable sets will probably not be very large at all.

Still, I appreciate you pointing out the alternate sorting as its good to have in the bag.

Welcome. Yeah, I had put the quicksort function into the onTap framework libraries, so I end up using it as kind of a generic go-to when (not often) I do sorting because my stuff is all DLL-like that way. :)