Ben Nadel
On User Experience (UX) Design, JavaScript, ColdFusion, Node.js, Life, and Love.
Ben Nadel at cf.Objective() 2012 (Minneapolis, MN) with: Ryan Anklam ( @bittersweetryan )

Experimenting With Asynchronous Data Access Using Parallel Array Iteration In Lucee 5.3.2.77

By Ben Nadel on
Tags: ColdFusion

At work, we recently migrated from Adobe ColdFusion 10 to Lucee CFML 5. With such a significant jump in ColdFusion engines, there are a lot of new features that I've yet to even experiment with. One of the features that I'm most excited about at the moment is the ability to iterate over arrays using parallel threads. While not directly related to data access, this feature makes it easy to execute data access methods (i.e., database queries) in parallel, opening the door for the asynchronous efficiencies that we've grown so accustomed to in the JavaScript ecosystem. This post is just my first look at such features.

By default, ColdFusion queries - and most of ColdFusion's I/O operations - are blocking. That is, the calling context blocks and waits for the operation to complete before it moves on to the next statement. Since ColdFusion 8, we've had the ability to wrap synchronous code in a cfthread tag, allowing it to run asynchronously. But, with parallel array iteration, we have the opportunity to use lower-level threading and potentially simplify and improve parallel data access.
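
For contrast, here is a rough sketch of the kind of bookkeeping that the cfthread approach tends to involve: spawning named threads, joining them, and then pulling results back out of the cfthread scope. The thread names, the taskID attribute, and the sleep()-based "work" are all hypothetical stand-ins, not code from the demo in this post:

```cfml
<cfscript>

	threadNames = [];

	// Spawn one named thread per (hypothetical) task.
	for ( i = 1 ; i <= 3 ; i++ ) {

		threadNames.append( "demoTask#i#" );

		// Extra attributes (taskID) are copied into the thread's ATTRIBUTES scope.
		thread
			name = "demoTask#i#"
			action = "run"
			taskID = "#i#"
			{

			sleep( 500 ); // Simulated latency (milliseconds).
			thread.result = "Task #attributes.taskID# complete";

		}

	}

	// Block until every spawned thread has finished.
	thread
		action = "join"
		name = "#threadNames.toList()#"
	;

	// Results have to be read back out of the CFTHREAD scope, by thread name.
	for ( threadName in threadNames ) {

		writeOutput( cfthread[ threadName ].result & "<br />" );

	}

</cfscript>
```

With parallel array iteration, the spawn / join / collect steps collapse into a single method call.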

To see this in action, I've created a simple demo in which I have to map individual User IDs onto database queries. The mapping of IDs onto Query objects is going to be performed using the array.map() method. In one test, we'll run the iteration in serial; and, in another test, we'll run the iteration in parallel.

In order to make the results a little more exaggerated, I'm adding a SLEEP(1) function to the MySQL query in order to force it to sleep for a second before it returns:

<cfscript>

	// NOTE: I am defining the query options in the VARIABLES scope to see if the
	// parallel threads / array.map() closure holds onto the same PAGE scope.
	variables.queryOptions = {
		datasource: "testing"
	};

	/**
	* I get the user with the given ID.
	* 
	* @id I am the user being requested.
	* @output false
	*/
	public query function getUserByID( required numeric id ) {

		// In order to see the effects of the parallel data-access, I am going to sleep
		// each query for a short period (1-second). This will allow the query latency to
		// be more pronounced.
		var record = queryExecute(
			"
				SELECT
					u.id,
					u.name,

					-- Simulated latency, throw-away column.
					SLEEP( 1 )
				FROM
					user u
				WHERE
					u.id = :id
				;
			",
			{
				id: id
			},
			queryOptions
		);

		return( record );

	}

	// ------------------------------------------------------------------------------- //
	// ------------------------------------------------------------------------------- //

	/**
	* I map the given user IDs onto an array of user query objects.
	* 
	* @userIDs I am the users being queried.
	* @parallel I determine if the queries should be executed in parallel.
	* @output true
	*/
	public void function runExperiment(
		required array userIDs,
		required boolean parallel
		) {

		var startedAt = getTickCount();

		// Map IDs onto query objects. This is the EXCITING part! Each iteration of the
		// .map() operator is going to sleep for 1-second. As such, running them in
		// parallel should have a significant performance advantage.
		var users = userIDs.map( getUserByID, parallel );

		writeOutput(
			"<p>
				Collected users in #numberFormat( getTickCount() - startedAt )# ms
				( parallel: #parallel# ).
			</p>"
		);

		for ( var user in users ) {

			writeOutput( "- #user.name# <br />" );

		}

	}

	// ------------------------------------------------------------------------------- //
	// ------------------------------------------------------------------------------- //

	runExperiment( [ 1, 2, 3, 4 ], false );
	runExperiment( [ 1, 2, 3, 4 ], true ); // Running in parallel.

</cfscript>

As you can see, for each array of User IDs, we're calling the .map() method on it, passing in our data-access function as the map operator. The real differentiator is whether or not the second argument is set to true (parallel iteration) or false (serial iteration):

userIDs.map( getUserByID, parallel )

If we then run this Lucee CFML code in the browser, we get the following output:

Parallel array iteration in Lucee 5 allows for reduced data access times.

As you can see, when we run the array iteration in serial, the simulated latency is aggregated across the entire traversal, taking over 4-seconds to complete. But, when we run the array iteration in parallel, the entire traversal takes about 1/4 the amount of time (ie, it is about 4x faster!).
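
For anyone without a MySQL datasource handy, the same serial vs. parallel comparison can be sketched without a database by swapping the query for a plain sleep() call. This is a minimal sketch, not the demo above: simulateLookup() is a hypothetical stand-in for a data-access method, and the 250 ms latency is arbitrary. It also passes the optional third argument that Lucee's array.map() accepts, which caps the number of threads used when iterating in parallel:

```cfml
<cfscript>

	/**
	* Hypothetical stand-in for a slow data-access call: I sleep instead of
	* querying, then return a placeholder value for the given ID.
	*/
	public string function simulateLookup( required numeric id ) {

		sleep( 250 ); // Simulated latency (milliseconds).

		return( "User #id#" );

	}

	ids = [ 1, 2, 3, 4 ];

	// Serial iteration: each lookup waits for the previous one to finish.
	startedAt = getTickCount();
	serialResults = ids.map( simulateLookup, false );
	serialDuration = ( getTickCount() - startedAt );

	// Parallel iteration: the third argument limits how many threads are used.
	startedAt = getTickCount();
	parallelResults = ids.map( simulateLookup, true, 4 );
	parallelDuration = ( getTickCount() - startedAt );

	writeOutput( "Serial: #numberFormat( serialDuration )# ms <br />" );
	writeOutput( "Parallel: #numberFormat( parallelDuration )# ms <br />" );

</cfscript>
```

With four items and a thread limit of four, the parallel run should land close to the latency of a single lookup, while the serial run should land close to the sum of all four. Exact timings will vary with thread start-up overhead.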

This is some very exciting stuff! I'll have to play around with it a lot more to get comfortable with how I would need to structure my ColdFusion components in order to provide for good "developer ergonomics" and improved performance. But, I'm already very excited at the thought of being able to spawn asynchronous data access operations with very little ceremony.

Epilogue on Errors During Parallel Iteration

I wanted to see what would happen if a given array iteration operation threw an error. It appears that the first error immediately bubbles up to the calling context, which means that even parallel iteration can be safely wrapped in a try/catch block. This already makes parallel iteration much easier to work with, when compared to the traditional cfthread tag.

Also, if one iteration throws an error, the other parallel iterations continue to execute.
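
The behavior described above can be sketched roughly as follows. The riskyLookup() function and its error type are hypothetical, and I'm catching "any" on the assumption that the error arriving in the calling context may not carry the original type unchanged:

```cfml
<cfscript>

	/**
	* Hypothetical operator that fails for one specific ID.
	*/
	public string function riskyLookup( required numeric id ) {

		sleep( 100 ); // Simulated latency (milliseconds).

		if ( id == 3 ) {

			throw(
				type = "Demo.LookupFailed",
				message = "Could not load user #id#."
			);

		}

		return( "User #id#" );

	}

	ids = [ 1, 2, 3, 4 ];

	try {

		// Parallel iteration: one of the four operations will throw.
		results = ids.map( riskyLookup, true );

		writeOutput( "Loaded #results.len()# users." );

	} catch ( any error ) {

		// The failure surfaces here, in the calling context.
		writeOutput( "Parallel iteration failed: #error.message#" );

	}

</cfscript>
```

Since the other iterations keep running after the first error, any side-effects they perform (writes, remote calls) may still complete even though the calling context has already moved into the catch block.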



Reader Comments

Hey Ben, great stuff from you, as always. Thanks.

FWIW, while you mention this parallel option for arrays as an alternative to cfthread for async operations, it seems worth mentioning (for the sake of readers new to the topic) that both Lucee and CF2018 also offer yet another option in the runasync statement, for use with futures (like promises in JS). FWIW, Lucee had runasync first, and the two implementations are in fact quite different from each other.

Ben has indeed written about the new CF implementation of it in the past here. To find those posts, you can use this google search:

site:bennadel.com runasync

(Ben, am I missing it if you offer your own means to search the blog?)

Also, not to get too off topic, but while I'm mentioning them, a few of those posts talked about problems you experienced with the new feature in CF2018. Have you by any chance had an opportunity to check if they're still a problem with the 4 updates to CF2018 since its release? Since not all your code can run standalone I haven't tried testing them all to find out myself. I'm simply asking if you may have thought to. You may even say you have a comment in one of the many posts that addresses that.

(For example, the code here relies on a datasource called testing that's not provided, so one would need to create some simulated data. That's not a complaint. What you offer is gold, especially the examples. I'm simply explaining to readers who may have wanted to ask me, "why don't you test each of them?")

Hope that's helpful. Thanks as always, Ben.

Reply to this Comment

Ben, good stuff. I've been reading your posts for years. Do you jump between different coding software like CFBuilder depending on the task or do you have one that you use exclusively?
Tim

Reply to this Comment

Ben. Great to see you back in the CF sphere, although, I have been enjoying your Angular posts. Unfortunately, I'm still on Lucee 4.5. Too scared to make the jump! However, I am going to spin up CommandBox, to have a look at these examples.

Anyway, I was quite interested in the use of:

SLEEP(1)

Is this a way of simulating:

<cfthread action="sleep" duration="">

Inside an SQL query.

Never seen this before!

Reply to this Comment

@All,

As a quick follow-up, I wanted to look at how this parallel iteration can be used to create a more generic approach to parallel query access (or other kinds of high-latency tasks):

https://www.bennadel.com/blog/3646-experimenting-with-the-developer-ergonomics-of-data-access-using-parallel-struct-iteration-in-lucee-5-3-2-77.htm

I ended up going with Struct-iteration as it allows for parallel tasks to be clearly named and identified.

Reply to this Comment

@Charlie,

Oh man, I had no idea that Lucee had runAsync()!!! I am so far behind the times on this stuff. This is what happens when you're on ColdFusion 10 for years after it has dropped Support :grimace:. I'll have to try out Lucee's implementation.

As far as Adobe's implementation, I know that I tried some stuff in the first or second Update, and the problems seemed to persist. I don't think I've tried in the last one or two updates, though -- I will certainly have to take a look at that as well (though, after I explore Lucee's version since that is what we use in Production now).

Re: running things with a database, yeah, that's kind of a bummer. In the post I just linked to in my previous comment, I am just creating inline data-structures rather than running queries - makes it much easier to run stand-alone. I'm trying to be better about that stuff.

Re: Search .... yeah, I don't have any built-in search. I actually search my own site using the exact same method you mentioned, using site: in Google:

https://www.bennadel.com/blog/3631-adding-my-blog-as-a-custom-search-engine-in-google-chrome-s-omnibox.htm

^^ I actually just recently wrote about how I am using Google's omnibox search engines to make it easier for me to search my own stuff. I should probably just create a Search feature that more-or-less proxies this.

Reply to this Comment

@Charles,

Yep, that's exactly correct. The SLEEP() method in MySQL is more or less like the sleep() method in ColdFusion. The main difference is that the method in MySQL uses seconds as its unit of measurement while ColdFusion uses milliseconds as its unit of measurement. So, the following are equivalent:

  • ColdFusion: sleep( 1000 )
  • MySQL: SLEEP( 1 )

Note that my use of uppercase / lowercase makes no difference; that's just a personal style I happen to use, upper-casing SQL functions.

Reply to this Comment

@Tim,

Glad you're enjoying this stuff :D As far as IDE, I pretty much just use SublimeText3 for everything. Generally speaking, I'm not great at leveraging IDE features. Mostly, I just use the color-coding / syntax-highlighting :D Then, the rest I just brute-force.

It's not a perfect approach; but, it gets the job done.

Reply to this Comment

@All,

So, for what it's worth, I just sat down and played around with runAsync() in Lucee 5.3.2.77, and it seems to be just as quirky as Adobe's version. But, in slightly different ways. Neither seems to really follow a Promise-oriented asynchronous workflow; but, when I copy/paste my old demos into a Lucee context, they behave slightly differently.

I think my expectations for runAsync() are just not correct. Or, not appropriate; or, whatever. I think that's why I am so excited about the async collection iteration - it's so simple that the expectation matches that implementation.

Reply to this Comment

Thanks for the additional info, Ben. And yep, when the feature was added, there was indeed discussion from those having experience with futures and promises, about what was missing or not the same. I didn't have a horse in the race (and still don't), so I did not pay close attention.

And I don't know if anyone did then or has since filed bug reports (for cf) or github issues (for Lucee), whether on those matters or the others you'd found and blogged about regarding cf2018's initial implementation.

But if you may do so while you're into it again, it may help future users. Of course, before filing CF bugs, it would be best to check against update 4. If they remain, you could just point to your blog posts for more detail, to make it easier for you.

All that said, I'd understand if you're too busy and would instead choose to let your posts stand as the research done. Perhaps someone else who is excited about the feature set would step up to do all that for you.

Again, thanks as always for all you do.

Reply to this Comment

@Charles,

The idea of things being buggy is "funny". In that, if Lucee works mostly the way that Adobe works, I am not sure what is or is not a bug. I will say that Adobe does let you create an "Empty Future", and Lucee does not appear to offer that as an option (the runAsync() function errors if you don't pass it a function-argument). Also, if Lucee was the first engine to implement runAsync(), then it's not even clear where the source of truth is.

That said, I'm happy to file bugs where it seems appropriate as I find issues. I'm never quite clear on how to do that for Lucee. The Adobe bug-base is easy to find; but, the Lucee landscape seems to track bugs in several different places; and, I'm never quite sure where to do the right thing.

Reply to this Comment

Ugh, my browser somehow filled in Charles for my name--confusing when there is indeed Charles R in this thread also. Sorry, folks.

And Ben, as for bugs I didn't mean so much about differences between cf and Lucee, as a) the issues you found and blogged about last year (about CF) and b) the issues about either compared to your expectations coming from JS, as others could have the same concerns.

While CF and Lucee don't NEED to follow any other language, it does seem that for generic features where more than one popular language already works a given way or one has been known for years to work a certain way, it's reasonable to hope they might model that unless for clearly stated reasons. A bit utopian, I realize. :)

Reply to this Comment

@All,

I wanted to share a fun follow-up. After Charlie pointed out that Lucee also has the runAsync() function, I thought that I could use it, in conjunction with the parallel iteration, to build some Promise-inspired functionality:

https://www.bennadel.com/blog/3647-creating-asyncall-asyncsettled-and-asyncrace-functions-using-runasync-and-parallel-iteration-in-lucee-5-3-2-77.htm

What I was able to do was create an asyncAll() function, modeled on Promise.all(), an asyncSettled() function, modeled on Promise.allSettled(), and an asyncRace() function, modeled on Promise.race().

The Future object is not quite as nice or flexible as the Promise object; but, using the aforementioned building blocks, we can start to create some fairly interesting asynchronous control-flow in a traditionally blocking context (ColdFusion).

Reply to this Comment
