Using A Closure To Encapsulate Depth-First Tree Traversal In Lucee CFML 5.3.3.62

By Ben Nadel

Published 2020-01-10 in ColdFusion

In my last couple of Lucee CFML posts, I've looked at using a while loop to recursively iterate over a tree structure without creating a deep (and expensive) call-stack. I've done this using both a breadth-first algorithm and a depth-first algorithm. In both cases, the algorithm for the traversal of the tree was tightly coupled to the way in which each tree-node was being consumed. This made it hard to reuse the traversal logic. As such, I wanted to try extracting the traversal itself, making it consumable with a closure or callback in Lucee CFML 5.3.3.62.

In my previous posts, each tree node was consumed right in the middle of the recursive while loop. To make this traversal more flexible and reusable, I want to replace the inline consumption with a callback-based consumption. This means that my tree node inspection goes from this (pseudo-code):

<cfscript>

	while ( hasNode ) {

		// ....
		echo( "#node.xmlName# " ); // INLINE consumption of node.
		// ....

	}

</cfscript>

... to this (pseudo-code):

<cfscript>

	function callback( node ) {

		echo( "#node.xmlName# " );

	}

	// ....

	while ( hasNode ) {

		// ....
		callback( node ); // CALLBACK consumption of node.
		// ....

	}

</cfscript>

By moving the inline consumption logic out of the while loop and replacing it with a callback, I can now reuse the tree traversal algorithm in a variety of places by passing-in a different callback.

To see this in action, let's look at a more robust example in which we will parse an XML document and then perform a depth-first traversal of the nodes using a Lucee CFML Closure:

<cfscript>

	// Setup our Tree for demonstration purposes. Each node in this tree is named based
	// on its depth and order within the tree structure.
	tree = xmlParse("
		<a1>
			<b1>
				<c1>
					<d1 />
					<d2 />
					<d3 />
				</c1>
			</b1>
			<b2 />
			<b3>
				<c2 />
			</b3>
			<b4>
				<c3>
					<d4>
						<e1 />
						<e2>
							<f1 />
						</e2>
						<e3 />
					</d4>
					<d5 />
				</c3>
			</b4>
			<b5 />
		</a1>
	");

	// Use the walkTree() function to "visit" each xmlNode in the XML Document. This
	// function will perform a depth-first traversal of the given document and invoke the
	// given closure for each node that it visits.
	walkTree(
		tree,
		( node ) => {

			echo( "#node.xmlName# " );

		}
	);
	dump( tree );
	
	// ------------------------------------------------------------------------------- //
	// ------------------------------------------------------------------------------- //

	/**
	* I perform a depth-first iteration of the given XML Document, invoking the given
	* callback for each XML Node that is visited.
	* 
	* @tree I am the XML Document being walked.
	* @callback I am the function being invoked for each node visited.
	*/
	public void function walkTree(
		required xml tree,
		required function callback
		) {

		// The "recursive" nature of this tree traversal will be powered by this queue so
		// that we don't have to create a deep call-stack. As each node is visited, its
		// child-nodes will be prepended to the queue for further traversal.
		var queue = [ tree.xmlRoot ];

		// Continue pulling nodes out of the queue until we have visited the entire tree.
		while ( queue.isDefined( 1 ) ) {

			var node = queue.first();
			queue.deleteAt( 1 );

			// If there are any child nodes, queue them up for further exploration.
			// --
			// NOTE: Prepending nodes creates a depth-first traversal and appending nodes
			// creates a breadth-first traversal. For the sake of this demo, I'm using
			// the .merge() method to make this easy (recreating the queue reference);
			// the .prepend(merge) operation would have been ideal, but it adds the nodes
			// in reverse order.
			if ( node.xmlChildren.len() ) {

				queue = node.xmlChildren.merge( queue );

			}

			// Invoke the callback on each node that we visit.
			callback( node );

		}

	}

</cfscript>

As you can see, when we call the walkTree() method, we're passing-in both the XML Document and a Closure. This Closure accepts an XML Node, and is invoked for each node that is visited by the walkTree() tree traversal algorithm. By pulling the node consumption out of the walkTree() function and putting it in a callback, I can now reuse the walkTree() function in a variety of different ways.

And, when we run the above ColdFusion code, we get the following page output:

A depth-first traversal of an XML document using Lucee CFML 5.3.3.62.

As you can see by the node-names output along the time, our closure was successfully called in a depth-first manner, once for each node in the XML document.

Now, anytime we move from an inline algorithm to a callback-based algorithm, my immediate next thought is, Will this affect performance?. Obviously, we don't want to prematurely optimize our code; and, we definitely don't want to underestimate the benefit of creating a reusable algorithm for tree traversal. But, it would be interesting to see what kind of performance hit we take by using this more flexible approach.

Last year, I took a look at the overhead of managing a Jedis connection-pool resource using a Closure in Lucee CFML 5.3.2.77. I felt like that was a decent test scenario; so, I'm just going to copy the same test here.

PERFORMANCE CAVEAT: As with all micro load-testing scenarios, the results should be taken with caution. Testing in a local development environment, that is not under production load, is never a perfect representation of real-world performance.

In the following code, I'm going to perform 10 test in which each test performs 1,000 tree-traversals with each approach: an inline consumption and a closure-based consumption. The duration of each test will then be output to the screen where we can evaluate it manually:

<cfscript>
	
	// Setup our Tree for demonstration purposes. Each node in this tree is named based
	// on its depth and order within the tree structure.
	tree = xmlParse("
		<a1>
			<b1>
				<c1>
					<d1 />
					<d2 />
					<d3 />
				</c1>
			</b1>
			<b2 />
			<b3>
				<c2 />
			</b3>
			<b4>
				<c3>
					<d4>
						<e1 />
						<e2>
							<f1 />
						</e2>
						<e3 />
					</d4>
					<d5 />
				</c3>
			</b4>
			<b5 />
		</a1>
	");

	// ------------------------------------------------------------------------------- //
	// ------------------------------------------------------------------------------- //

	testCount = 10;
	iterationCount = 1000;

	// For each test, we're going to run one grouping of the INLINE traversal and one
	// grouping of the CALLBACK traversal. The duration (in milliseconds) of each
	// grouping will be output to the screen.
	loop times = testCount {

		// INLINE traversal speed test.
		stopwatch variable = "duration" {

			loop times = iterationCount {

				walkTree( tree );

			}

		}

		echo( "Inline: #duration#ms <br />" );


		// CALLBACK traversal speed test.
		stopwatch variable = "duration" {

			loop times = iterationCount {

				walkTree(
					tree,
					( node ) => {

						systemOutput( "Visiting: #node.xmlName#" );

					}
				);

			}

		}

		echo( "Callback: #duration#ms <br /><br />" );

	} // END: Test-Loop.

	// ------------------------------------------------------------------------------- //
	// ------------------------------------------------------------------------------- //

	/**
	* I perform a depth-first iteration of the given XML Document.
	* 
	* IF A CALLBACK IS PROVIDED, the callback is invoked for each node that is visited.
	* Otherwise, if no callback is provided, the node name will be written directly to
	* the server's stdout.
	* 
	* @tree I am the XML Document being walked.
	* @callback OPTIONAL I am the function being invoked for each node visited.
	*/
	public void function walkTree(
		required xml tree,
		function callback
		) {

		// The "recursive" nature of this tree traversal will be powered by this queue so
		// that we don't have to create a deep call-stack. As each node is visited, its
		// child-nodes will be prepended to the queue for further traversal.
		var queue = [ tree.xmlRoot ];

		// Continue pulling nodes out of the queue until we have visited the entire tree.
		while ( queue.isDefined( 1 ) ) {

			var node = queue.first();
			queue.deleteAt( 1 );

			// If there are any child nodes, queue them up for further exploration.
			// --
			// NOTE: Prepending nodes creates a depth-first traversal and appending nodes
			// creates a breadth-first traversal. For the sake of this demo, I'm using
			// the .merge() method to make this easy (recreating the queue reference);
			// the .prepend(merge) operation would have been ideal, but it adds the nodes
			// in reverse order.
			if ( node.xmlChildren.len() ) {

				queue = node.xmlChildren.merge( queue );

			}

			// If a callback was provided, invoke it for the node; otherwise, just
			// perform the same function of the callback directly.
			( isNull( callback ) )
				? systemOutput( "Visiting: #node.xmlName# " ) // INLINE TEST.
				: callback( node ) // CALLBACK TEST.
			;

		}

	}

</cfscript>

As you can see, inside each test iteration, we're invoking the walkTree() 1,000 for each approach. In the inline case, we're using a systemOutput() call directly from within the walkTree() function; and, in the callback case, we're passing-in the systemOutput() call (so to speak). And, when we run this ColdFusion code, we get the following output:

Simple performance test demonstrating that the closure-based consumption of the tree-traveral is just as fast in Lucee CFML 5.3.3.62.

As you can see, in almost half of the test cases, the closure-based consumption of the tree-traversal algorithm was just as fast - if not faster - than the inline consumption. This means that we can take peace of mind in knowing that, even if we focus on creating flexible, reusable algorithms, they are still going to be fast in Lucee CFML. Closure invocation overhead is just not a thing to be worried about.

Closures are just wonderful constructs. They give us the ability to pass around cohesive chunks of logic that make it possible to create very flexible and reusable algorithms in Lucee CFML. In this case, we looked at using a ColdFusion Closure to separate node consumption logic from tree traversal logic, allowing us to reuse the tree traversal logic in a variety of contexts.

Want to use code from this post? Check out the license.

Short link: https://bennadel.com/3755

Reader Comments

Oh my chickens, this post is old!

Hit me up on LinkedIn if you want to discuss it further.