Ask Ben: Grouping Query Output By Column Value In ColdFusion

By Ben Nadel

Published 2022-02-11 in Ask Ben, ColdFusion

I recently received a question about a mysterious group behavior on the ColdFusion CFLoop tag when used in conjunction with CFIncludeed templates:

Hoping you could help with a problem I'm having. I have a main file with a CFLoop query group in it. Inside the CFLoop, I'm CFIncludeing a different file with an inner <cfloop>/<cfloop> in it to output over the row-grouping. However, it's only outputting 1 row (per group) instead of looping over the entire set of grouped rows.

To be honest, I have very little experience with the group attribute on the CFLoop tag. I wrote about it once 10-years ago; but, that may have been the last time that I experimented with it. There was always a little "too much magic" in the way that it managed the query iteration.

That said, with a little back-and-forth from the reader, I was able to replicate the problem in my ColdFusion 2021 setup. And, it does appear to be a bug in the way ColdFusion is managing the query iteration across page-contexts (templates). This wouldn't be the first time I experienced odd behavior with CFInclude and unscoped variables.

I don't have a "fix" for this problem. That is to say, I don't know how to solve this problem using the group attribute on the CFLoop tag. That said, I do have a working solution that uses a slightly different approach.

In a situation like this, instead of using the group attribute I create a data structure that matches the intent of the group attribute. I create a Struct that indexes the rows based on a given column. This is heavily inspired by LoDash's _.groupBy() function wherein each "group key" points to an Array of values that all have the same "group key".

This version of the code uses ColdFusion 2021 which is capable of creating Ordered structs. Normally, when iterating over a Struct, there's no guaranteed order in which the iteration will be performed. With an Ordered Struct, however, key-iteration is guaranteed to match the order in which the keys were defined. In this demo, I'm using an Ordered Struct so that the iteration over the "group index" matches the original iteration order of the ColdFusion query object.

The original question involved a CFInclude; but, for the sake of simplicity, I'm going to use a single file for the demo. That said, you can trust that the variables defined in the for loop can be referenced from within an included template.

<cfscript>

	// Our sample data here contains orders from various stores. We want to iterate over
	// this query so that we can get information about each GROUPING of stores.
	retailers = queryNew(
		data = [
			{ "storeName": "Acme", "orderID": 1 },
			{ "storeName": "Super Store", "orderID": 3 },
			{ "storeName": "Acme", "orderID": 2 },
			{ "storeName": "Hi-Tech Store", "orderID": 999 },
			{ "storeName": "Super Store", "orderID": 4 },
			{ "storeName": "Super Store", "orderID": 5 }
		]
	);

	// ------------------------------------------------------------------------------- //
	// ------------------------------------------------------------------------------- //	

	// Group the query on the store name. This will give us a STRUCT in which each each
	// store name "key" points to an ARRAY of rows associated with that store name.
	retailersIndex = groupQueryBy( retailers, "storeName" );

	// Our index is using an ORDERED STRUCT so that our for-loop here will iterate over
	// the stores in the same order they appeared in the original query.
	for ( storeName in retailersIndex ) {

		// Get all of the orders that were keyed on this store name.
		// --
		// CAUTION: This is slightly different than the native "group" attribute on the
		// query in that we will have all rows with the given key, not just rows that are
		// right next to each other.
		orders = retailersIndex[ storeName ];

		writeOutput( "<h2> #storeName# </h2>" );
		writeOutput( "<h3> #numberFormat( orders.len() )# Total Order(s) </h3>" );

		for ( order in orders ) {

			writeOutput( "- Order ID : #order.orderID# <br />" );

		}

		// I tested the include-base approach locally (it works). But, I'm only using a
		// single file for the demo.
		// --
		// include "./inner-test.cfm";

	}

	// ------------------------------------------------------------------------------- //
	// ------------------------------------------------------------------------------- //

	/**
	* I return an ORDERED STRUCT that groups the given query on the given column name.
	* Each entry in the resultant index contains an array of corresponding rows.
	*/
	public struct function groupQueryBy(
		required query rows,
		required string columnName
		) {

		// NOTE: We're using an ORDERED STRUCT here so that the iteration order of the
		// grouping matches the iteration order of the original query object.
		var index = [:];

		for ( var row in rows ) {

			var groupKey = row[ columnName ];

			// If we've already encountered a row with the same group-key, add the row to
			// the existing aggregate.
			// --
			// CAUTION: This behavior differs slightly from the native "group" attribute
			// in that rows do NOT have to be side-by-side. This approach will aggregate
			// any and all rows with the same key, regardless of where they are in the
			// overall query.
			if ( index.keyExists( groupKey ) ) {

				index[ groupKey ].append( row );

			// If we haven't encountered this group-key yet, create a new aggregate using
			// the current row. All subsequent matches will be added to this aggregate.
			} else {

				index[ groupKey ] = [ row ];

			}

		}

		return( index );

	}

</cfscript>

As you can see, our groupQueryBy() function performs a single iteration over the query in order to generate the column-based index. As such, this approach does have some overhead in that you are now iterating over the query twice: once to generate the index and once to output the rows. That said, this will likely be inconsequential from a performance standpoint.

Once we have a our group index, iterating over the rows becomes a simple nested for-loop (altered here to provide a high-level abstract outline):

<cfscript>

	for ( storeName in retailersIndex ) {

		// ... store details ...

		for ( order in retailersIndex[ storeName ] ) {

			// ... order details ...

		}
	}

</cfscript>

And, when we run this ColdFusion 2021 code against our demo data, we get the following output:

## Acme

### 2 Total Order(s)

- Order ID : 1
- Order ID : 2

## Super Store

### 3 Total Order(s)

- Order ID : 3
- Order ID : 4
- Order ID : 5

## Hi-Tech Store

### 1 Total Order(s)

- Order ID : 999

As you can see, for each unique store name, we were able to iterate over the aggregate set of orders for that store.

Again, this isn't exactly like using the CFLoop tag with a group attribute. But, I believe this matches the intent of that behavior. And, this is the same approach that I use in my ColdFusion applications all the time. In fact, three of my favorite utility methods in ColdFusion are:

groupBy( collection, key )
indexBy( collection, key )
reflect( collection )

... all very much influenced by LoDash's functional paradigm. I find that once I can transform data into a more consumable structure, it opens up a lot of possibilities.

Want to use code from this post? Check out the license.

Short link: https://bennadel.com/4205

Reader Comments

Oh my chickens, this post is old!

Hit me up on Twitter if you want to discuss it further.