Ben Nadel at the jQuery Conference 2010 (Boston, MA) with: Blain Smith

Inverting Your Thinking About List Parsing In ColdFusion

One of the many joyful features in ColdFusion is the fact that lists of comma-delimited values receive first-class treatment in the language. That is, we have many constructs for creating, parsing, and mutating lists despite the fact that they are just string values. For the most part, thinking about lists in terms of their delimiters is straightforward. After all, the vast majority of lists are just comma-separated values. But, some lists can be more intricate. And, in those cases, it can help to invert your thinking about lists by parsing out the items rather than the delimiters.

In ColdFusion, strings are inherently multi-line, which means that a string literal can include line-breaks without the ceremony of escape sequences. This feature can be used to define lists that are human-friendly. Consider the following ColdFusion code in which we need to parse a list that includes tabs, spaces, and line-breaks in addition to the standard comma delimiter:

<cfscript>
fieldList = "
id ,
name ,
code ,
createdAt
";
// Treat the input as a delimited list with several delimiters. In ColdFusion, the
// default behavior is to collapse several sequential delimiters down into a single
// delimiter. As such, the comma followed by several white-space characters (newline,
// tab, etc) will all become one delimiter (for all intents and purposes).
fields = fieldList.listToArray( ",#chr( 9 )##chr( 10 )##chr( 13 )##chr( 32 )#" );
dump( label = "re: List Delimiters", var = fields );
</cfscript>

We can still parse our fieldList string variable as a delimited list, no problem; but, we need to include both the comma and all of the whitespace characters (tab, line feed, carriage return, and space) in the delimiter set. By default, ColdFusion skips the empty items that sit between sequential delimiters, which effectively collapses a run of delimiters down into a single delimiter, so we don't end up with a bunch of zero-length list-items:

CFDump of an array of strings demonstrating a successful parsing of the list using multiple delimiters.
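
To make the collapsing behavior concrete, here is a minimal sketch. The sample string and variable names are made up for illustration; the only thing it relies on is the optional includeEmptyFields argument of listToArray(), which defaults to false:

```cfml
<cfscript>
	// Hypothetical sample: a comma immediately followed by a space, which
	// gives us two sequential delimiters between the two items.
	sample = "id, name";

	// Default behavior: the empty item between "," and " " is skipped.
	collapsed = sample.listToArray( ", " );
	// [ "id", "name" ]

	// Passing true for includeEmptyFields keeps the zero-length item.
	preserved = sample.listToArray( ", ", true );
	// [ "id", "", "name" ]

	dump( label = "Empty items skipped (default)", var = collapsed );
	dump( label = "Empty items included", var = preserved );
</cfscript>
```

This default is what lets the multi-delimiter approach work without any extra clean-up of empty strings.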

That worked perfectly. But, defining all those delimiters is a bit of a pain. In this case, it would be easier to think about the list in terms of the items rather than the delimiters. Instead of worrying about all the various whitespace characters, we can think of the important parts as a sequence of "word characters". Then, we can use simple regular expression patterns to extract the "items" using reMatch():

<cfscript>
fieldList = "
item_id ,
item_name ,
item_code ,
item_createdAt
";
// Instead of thinking about the list in terms of its delimiters, we can invert our
// thinking and consider the list in terms of the items. For lists with a variety of
// delimiters, this can make the parsing much easier (for humans) to read. In this
// case, we'll use regular expression matching to extract all "word characters".
fields = fieldList.reMatch( "\w+" );
// We could have also used the following:
_fields = fieldList.reMatch( "[^,\s]+" ); // NOT comma or whitespace.
_fields = fieldList.reMatchNoCase( "[a-z_]+" );
dump( label = "re: List Items", var = fields );
</cfscript>

By inverting our thinking, and moving from .listToArray() to .reMatch(), we still end up with the same kind of output, a clean array of field names:

CFDump of an array of strings demonstrating a successful parsing of the list using regular expression extraction.
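
One thing worth keeping in mind is that the three patterns shown in the code are only interchangeable for simple field names like the ones in this demo. As a rough sketch (the input string below is hypothetical and not from the original example), here is how they diverge once an item contains a dot, a hyphen, or a digit:

```cfml
<cfscript>
	// Hypothetical input: items containing a dot, a hyphen, and a digit.
	input = "user.id, first-name, createdAt2";

	// "\w+" matches runs of letters, digits, and underscores only, so the
	// dot and the hyphen end up splitting the items.
	wordChars = input.reMatch( "\w+" );
	// [ "user", "id", "first", "name", "createdAt2" ]

	// "[^,\s]+" matches anything that is NOT a comma or whitespace, so each
	// item survives intact.
	notDelimiters = input.reMatch( "[^,\s]+" );
	// [ "user.id", "first-name", "createdAt2" ]

	// "[a-z_]+" (case-insensitive) matches letters and underscores only, so
	// the trailing digit is dropped as well.
	lettersOnly = input.reMatchNoCase( "[a-z_]+" );
	// [ "user", "id", "first", "name", "createdAt" ]

	dump( label = "\w+", var = wordChars );
	dump( label = "[^,\s]+", var = notDelimiters );
	dump( label = "[a-z_]+", var = lettersOnly );
</cfscript>
```

Which pattern fits best depends on what characters a valid item is allowed to contain.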

But, we greatly simplified our parsing. Let's look at the two approaches side by side:

<cfscript>
// Delimiter-oriented thinking.
fieldList.listToArray( ",#chr( 9 )##chr( 10 )##chr( 13 )##chr( 32 )#" );
// Item-oriented thinking.
fieldList.reMatch( "\w+" );
</cfscript>

To be clear, I'm not saying that inverting your consideration of lists is always warranted. In fact, it may not be warranted in most cases (as mentioned above, the vast majority of lists are just comma-separated items). But, it's helpful to know that you can invert your thinking once your set of delimiters passes a tipping point of complexity.
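
As one illustration of a case where the delimiter-oriented approach is probably still the better fit, consider items that contain spaces. This is only a sketch under assumed inputs (the list of city names is made up), but it shows how an item-oriented pattern like "\w+" can break items apart:

```cfml
<cfscript>
	// Hypothetical list in which the items themselves contain spaces.
	cityList = "New York, Los Angeles,Boston";

	// Item-oriented parsing treats every run of word characters as an item,
	// which splits the multi-word city names.
	words = cityList.reMatch( "\w+" );
	// [ "New", "York", "Los", "Angeles", "Boston" ]

	// Delimiter-oriented parsing on the comma alone, followed by a trim of
	// each item, keeps the multi-word names together.
	cities = cityList.listToArray( "," ).map(
		function( item ) {
			return trim( item );
		}
	);
	// [ "New York", "Los Angeles", "Boston" ]

	dump( label = "Item-oriented", var = words );
	dump( label = "Delimiter-oriented", var = cities );
</cfscript>
```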


Ben Nadel