Learning ColdFusion 8: CFThread Part IV - Cross-Page Thread References

By Ben Nadel

Published 2007-06-04 in ColdFusion — Comments (34)

Up until now, we have been examining ColdFusion 8's CFThread tag in the context of a single page or in conjunction with a "set it and forget it" scenario. Now, let's take a look at referencing long running threads across page requests. Remember, since the child thread launched by CFThread may outlive the processing time of its parent, we will have the opportunity to reference a thread that was launched by a previous page request.

To play around with this, we are going to modify our previous photo download demo to use some AJAX. Now, instead of just displaying the "photos are downloading" message to the user on the confirmation page, we are going to output the status of each photo thread as it updates. This means that after the parent page (the confirmation page) has finished processing, we are going to be referencing threads launched by a previous page request. Very exciting.

Before I get into the code, I want to take a second to talk about the THREAD scope. In the demo below, we are telling each CFThread-launched thread to store itself in, and then later remove itself from, the APPLICATION.Threads structure. In doing so, you will notice that I am actually duplicating the THREAD scope within each CFThread tag:

<!--- Store thread reference. --->
<cfset APPLICATION.Threads[ THREAD.Name ] = Duplicate(
	THREAD
	) />

Calling ColdFusion's Duplicate() here is confusing and weird, but absolutely essential. It has to do with the nature of the THREAD scope. The THREAD scope is a new form of scope that we are not used to dealing with. If you dump out the Java class of the THREAD scope, you will see that it is:

coldfusion.thread.ThreadScope

Now, I don't actually know anything about this scope, but from my experience, most non-setting references to it result in a NULL value. Therefore, in the example above, if we tried to store the THREAD scope directly into the APPLICATION.Threads struct, we would get this in the APPLICATION.Threads struct dump:

undefined struct element

This is because THREAD will store as a NULL value. To demonstrate this without application-level caching, take a look at a VARIABLES-scoped reference to a thread:

<!--- Run a thread. --->
<cfthread
	action="run"
	name="ThreadOne">

	<!---
		We don't need to do anything in this thread,
		we just need to know that it was launched.
	--->
	<cfset THREAD.X = true />

</cfthread>


<!--- Wait for the thread to finish processing. --->
<cfthread
	action="join"
	name="ThreadOne"
	/>


<!--- Output the thread data. --->
<cfdump
	var="#VARIABLES.ThreadOne#"
	/>

When we try to run that page, we get the ColdFusion error:

Element THREADONE is undefined in VARIABLES.

You might expect the thread, ThreadOne, to be available in the VARIABLES scope (since ThreadOne can be referenced without scoping). But if we CFDump out the VARIABLES scope, all we get is a crazy-looking user defined function that is called:

_CFFUNCCFTHREAD_CFTEST22ECFM5059090411

What the hell is that? I'll tell you what it is - it's a clear demonstration that the THREAD scope is a very special beast.

So, going back to the first example above, when we duplicate the THREAD scope, we are actually converting the THREAD scope into a standard struct representation of its meta data. This will give us an object with a familiar Java type:

coldfusion.runtime.Struct

Doing this, we can now pass around a copy of the thread data that we can actually reference. But this does not mean we have access to the thread itself - just that we have a copy of its meta data.

That being said, let's get back to the demo at hand. It has two parts: the photo download page and then a page that will grab the cached thread data structs and return their status. Here is our modified photo download page:

<!--- Kill extra output. --->
<cfsilent>

	<!---
		Param the FORM variable that will hold our photo urls.
		Remember, each URL is on its own line (separated by
		line returns).
	--->
	<cfparam
		name="FORM.photo_url"
		type="string"
		default=""
		/>


	<!--- Trim the form field. --->
	<cfset FORM.photo_url = FORM.photo_url.Trim() />


	<!---
		Check to see if the form has been submitted. For
		this demo, we will know this if there is a value
		in the form field.
	--->
	<cfif Len( FORM.photo_url )>

		<!---
			Loop over the URLs. We can treat the text area
			as if it were a list of URLs that is using the
			line break, line return as the list delimiter.
		--->
		<cfloop
			index="strURL"
			list="#FORM.photo_url#"
			delimiters="#Chr( 13 )##Chr( 10 )#">

			<!---
				Now that we have our individual URL, let's
				grab the photo binary using CFHttp and store
				it directly into a file on the server.

				We are going to launch this CFHttp call in a
				new thread using CFThread. We are not going
				to wait for this call to finish.
			--->
			<cfthread
				action="run"
				name="photo_#GetFileFromPath( strURL )#"
				photourl="#strURL#">

				<!---
					Let the current thread save itself to the
					threads struct in the application. While
					there is a possible race condition here
					(since other parts of the code will be
					reading from this struct), for this demo it
					is not going to be an issue.

					Also, notice that we are not passing the
					THREAD scope itself. This is crucial as the
					THREAD scope is a very special scope. By
					duplicating it we are turning it into a
					ColdFusion runtime struct (much better for
					our purposes).
				--->
				<cfset APPLICATION.Threads[ THREAD.Name ] = Duplicate(
					THREAD
					) />


				<!---
					Save the photo. We use the URL that was
					passed in as a thread attribute rather than
					the page-level strURL; the loop keeps
					changing strURL while the threads are still
					running, so each thread needs its own copy.
				--->
				<cfhttp
					url="#ATTRIBUTES.PhotoURL#"
					method="GET"
					getasbinary="yes"
					path="#ExpandPath( './data/' )#"
					file="#GetFileFromPath( ATTRIBUTES.PhotoURL )#"
					/>


				<!---
					Now that the thread has finished running,
					have this thread remove itself from the
					application threads struct.
				--->
				<cfset StructDelete(
					APPLICATION.Threads,
					THREAD.Name
					) />

			</cfthread>

		</cfloop>

	</cfif>

</cfsilent>

<cfoutput>

	<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
	<html>
	<head>
		<title>ColdFusion 8 - CFThread Demo</title>

		<!--- Include the jQuery scripts. --->
		<script
			type="text/javascript"
			src="jquery-latest.pack.js">
		</script>
		<script type="text/javascript">

			// Load the thread activity via jQuery's
			// AJAX functionality. This will load the
			// returned value into the innerHTML.
			function UpdateThreadActivity(){

				$( "##threadactivity" ).load(
					"./get_threads.cfm",
					{},
					function(){
						setTimeout(
							UpdateThreadActivity,
							250
							);
					}
					);

			}


			// When the document has loaded, start
			// updating the thread activity.
			$( UpdateThreadActivity );

		</script>
	</head>
	<body>

		<h2>
			Photo Download
		</h2>


		<!---
			Check to see if the form has been submitted. For
			this demo, we will know this if there is a value
			in the form field.
		--->
		<cfif NOT Len( FORM.photo_url )>

			<p>
				Please enter photo URLs that you would like to
				download. Each URL should be on a single line of
				the following text area.
			</p>

			<form
				action="#CGI.script_name#"
				method="post">

				<p>
					<textarea
						name="photo_url"
						cols="70"
						rows="20"
						>#FORM.photo_url#</textarea>
				</p>

				<p>
					<input type="submit" value="Download Now" />
				</p>

			</form>

		<cfelse>

			<p>
				Your photos are being downloaded right now:
			</p>

			<!---
				This part will be updated via jQuery and
				some nice AJAX requests (see HTML head).
			--->
			<p id="threadactivity">
				Gathering thread activity...
			</p>

		</cfif>

	</body>
	</html>

</cfoutput>

Notice that each CFThread body starts out by caching itself (by name) in the APPLICATION.Threads struct. Then, as it finishes processing, it removes itself (by name) from the same struct. Technically, this is a place where we might be concerned about race conditions, but for this demo, it will not matter. Also notice in place of the "photos are downloading" message, we now have a P tag that is being updated using jQuery and some simple innerHTML-oriented AJAX.
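
One thing the page above takes for granted is that APPLICATION.Threads already exists before the first thread tries to store itself. The post does not show where that struct gets created, so the following is only a minimal sketch of one way to do it, assuming an Application.cfc sits alongside the demo pages. The application name and timeout are made-up values for illustration:

<cfcomponent output="false">

	<!--- Hypothetical application settings (not from the demo). --->
	<cfset THIS.Name = "CFThreadDemo" />
	<cfset THIS.ApplicationTimeout = CreateTimeSpan( 0, 1, 0, 0 ) />


	<cffunction
		name="OnApplicationStart"
		access="public"
		returntype="boolean"
		output="false">

		<!---
			Create the struct that the photo threads will use
			to cache copies of their meta data.
		--->
		<cfset APPLICATION.Threads = StructNew() />

		<cfreturn true />
	</cffunction>

</cfcomponent>

With something like this in place, both the download page and the AJAX page can assume the struct is there, even when no threads are running.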

The page that gets called by the AJAX simply iterates over the APPLICATION.Threads meta data structs and outputs the thread data (to be consumed as innerHTML):

<!--- Kill extra output. --->
<cfsilent>

	<!---
		We are going to build up the thread activity HTML.
		While I normally would return JSON data here, in
		order to keep the demo as simple as possible (and
		since AJAX is not the primary goal here), I am just
		going to render the innerHTML.
	--->
	<cfsavecontent variable="strThreadData">
	<cfoutput>


		<!--- Check to see if there are any threads. --->
		<cfif StructCount( APPLICATION.Threads )>

			<!--- Loop over the active threads. --->
			<cfloop
				item="strName"
				collection="#APPLICATION.Threads#">

				<!---
					Get a short hand reference to the thread.
					These threads are going to be removing
					themselves from the application, so in
					order to minimize bad data references, get
					the thread reference.

					Once we have an independent reference to
					the thread, it won't matter if it has been
					removed from the APPLICATION scope.
				--->
				<cfset objThread = APPLICATION.Threads[ strName ] />

				<!--- Output the thread data. --->
				<strong>#objThread.Name#</strong><br />

				.....

				Start:
				#TimeFormat(
					objThread.StartTime,
					"h:mm TT"
					)#<br />

				.....

				Duration:
				#NumberFormat(
					((Now() - objThread.StartTime) * 86400),
					"0"
					)#
				seconds<br />

			</cfloop>

		<cfelse>

			<em>There are no active threads.</em>

		</cfif>

	</cfoutput>
	</cfsavecontent>


	<!--- Output the thread innerHTML. --->
	<cfcontent
		type="text/html"
		variable="#ToBinary( ToBase64( strThreadData ) )#"
		/>

</cfsilent>

Again, this is a place where we would have to consider race conditions (since we are iterating over a structure that is being modified by parallel threads), but for this demo, I am not going to worry about it. In order to minimize the chance of bad references, I get a short-hand pointer to the thread meta data struct (rather than referencing it through the APPLICATION.Threads struct). Therefore, even if the struct does get removed, we will still have a valid pointer to it.
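
If you did want to guard against that race condition, one option would be to wrap every touch of APPLICATION.Threads in a named CFLock. This is just a sketch of that idea, not something the demo does; the lock name "ApplicationThreads", the five-second timeout, and the objSnapshot variable are all assumptions of mine:

<!--- In the CFThread body: exclusive lock while storing the copy. --->
<cflock
	name="ApplicationThreads"
	type="exclusive"
	timeout="5">

	<cfset APPLICATION.Threads[ THREAD.Name ] = Duplicate(
		THREAD
		) />

</cflock>


<!---
	In get_threads.cfm: read-only lock while taking a shallow
	copy of the struct. We then loop over the copy rather than
	over the shared struct.
--->
<cflock
	name="ApplicationThreads"
	type="readonly"
	timeout="5">

	<cfset objSnapshot = StructCopy( APPLICATION.Threads ) />

</cflock>

<cfloop
	item="strName"
	collection="#objSnapshot#">

	<cfset objThread = objSnapshot[ strName ] />

	<!--- ... output the thread data as before ... --->

</cfloop>

The StructDelete() call at the end of each thread would need the same exclusive lock. Since the locks are only held for a quick struct operation, and never across the CFHttp call, the threads should not end up waiting on each other for long.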

Running the above code to download photos, we get the following output:

Your photos are being downloaded right now:

PHOTO_527940862_96BA93B86F_O.JPG
..... Start: 6:30 PM
..... Duration: 1 seconds
PHOTO_508846122_18632F9ACF_O.JPG
..... Start: 6:30 PM
..... Duration: 1 seconds
PHOTO_482937551_7CBB42872E_B.JPG
..... Start: 6:30 PM
..... Duration: 1 seconds

... and then a split second later, after an AJAX update, we get the following update:

Your photos are being downloaded right now:

PHOTO_508846122_18632F9ACF_O.JPG
..... Start: 6:30 PM
..... Duration: 2 seconds

With each successive AJAX call, more of the threads are removing themselves from the APPLICATION.Threads struct. Pretty snazzy, eh? Ok, so we are not technically referencing threads across different page requests, but based on my THREAD-referencing experiments, it seems that this might be the best way to go. Of course, I am just learning here, so it might be that these threads are accessible by name through some other way (but I do not see anything about it in the documentation). At the very least, since we are allowing threads to update their own meta data references, we are tying the meta data copy to the thread across pages.

When it comes to the data in the APPLICATION.Threads struct, remember that it is a duplicate of the thread meta data - it is not actually the thread meta data as it is contained in the running thread. This means that the thread will not update this data as it processes (i.e., the Status attribute will never get updated automatically). But, for our purposes, and I assume most purposes, knowing the Name and the StartTime will be sufficient.
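
If you do need something like a status, the thread can write it into its own cached copy as it goes. The following is only a sketch under a few assumptions: that a thread is allowed to linger in APPLICATION.Threads after it finishes (so something else has to clean it up later), that the made-up values "DOWNLOADING", "COMPLETED", and "FAILED" are all the detail we care about, and that the URL is handed to the thread as a hypothetical PhotoURL attribute:

<cfthread
	action="run"
	name="photo_#GetFileFromPath( strURL )#"
	photourl="#strURL#">

	<!--- Cache a copy of the meta data, then flag our own status. --->
	<cfset APPLICATION.Threads[ THREAD.Name ] = Duplicate( THREAD ) />
	<cfset APPLICATION.Threads[ THREAD.Name ].Status = "DOWNLOADING" />

	<cftry>

		<!---
			Ask CFHttp to throw on a bad response so that the
			CFCatch below actually gets a chance to run.
		--->
		<cfhttp
			url="#ATTRIBUTES.PhotoURL#"
			method="GET"
			getasbinary="yes"
			throwonerror="yes"
			path="#ExpandPath( './data/' )#"
			file="#GetFileFromPath( ATTRIBUTES.PhotoURL )#"
			/>

		<!--- The download went through. --->
		<cfset APPLICATION.Threads[ THREAD.Name ].Status = "COMPLETED" />

		<cfcatch>
			<!--- Something broke; record that rather than vanish. --->
			<cfset APPLICATION.Threads[ THREAD.Name ].Status = "FAILED" />
		</cfcatch>

	</cftry>

</cfthread>

The trade-off is that entries no longer clean themselves up, so the page that reads the struct (or some scheduled task) would have to delete finished entries once it has reported them.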

Short link: https://bennadel.com/752

Reader Comments

Brad Wood Jun 5, 2007 at 11:01 AM

So, since the application struct is only a copy of the actual threads at the time of their creation, I assume you wouldn't be able to view error information, or terminate the threads, etc. Is there no way to reference the thread directly after it has been kicked off and the request which created it has finished?

Raymond Camden Jun 5, 2007 at 11:03 AM

One thing to note - if you want to get the thread data, you CAN get it via Evaluate(). That's how Adobe documents it.

Rupesh Kumar Jun 5, 2007 at 11:52 AM

Ben,
Thread scopes are not kept in the variable scope and hence you don't see it when you dump the variable scope.
I know you must have figured it out but I am just reiterating. We don't really recommend sharing the thread scopes across requests as this can lead to thread safety issues with the data.
Rupesh.

Ben Nadel Jun 5, 2007 at 2:03 PM

@Brad,

Unless the running thread itself updates that info, yes, it will not be available since the cached struct is only a copy.

@Ray,

Are you talking about using Evaluate() on a different thread?

@Rupesh,

I know that when you refer to an unscoped variable, ColdFusion will start searching for it in an orderly fashion through many different scopes (ex. query output, function local, arguments, page variables, etc.). From that I would gather that threads are stored in some scope that is being searched. Is that the case? Or are threads a totally new beast that is a very special implementation?

As for "Best practices", I agree, cross-page references are going to get very hairy, very fast. I would not recommend trying to do it. In this case, however, since I am only ever referencing a copy of the meta data, I feel that it is not as bad as it might sound. Of course, if the thread crashes and cannot remove itself from the app scope, clearly this can become very corrupt, very fast :)

I think it is definitely playing with fire no matter how you look at it, but I think if done correctly / carefully it could have some interesting potential.

Ben Nadel Jun 5, 2007 at 2:05 PM

@Rupesh,

Plus, what is that UDF in the VARIABLES scope that I am seeing? What does it do?

Tom Jordahl Jun 15, 2007 at 2:28 PM

The UDF in the variables scope is the actual function that is run in the thread i.e. the code between the cfthread and /cfthread tags are turned in to a UDF and executed in a thread.

Rupesh Kumar Jun 15, 2007 at 4:01 PM

That's correct. Threads are stored in a special scope which is actually a request level scope and is searched when you access any unscoped variable.

Ben Nadel Jun 15, 2007 at 6:04 PM

That is good to know. I figure I wouldn't use this stuff like this too often. I am comfortable with a thread setting an application-level variable and then destroying it before it finished executing. Of course, if anything in the thread broke then you would have rogue variables that never get deleted. Clearly, not a fabulous idea, but a cool experiment.

Keivn Nov 10, 2009 at 7:33 AM

I am running in a shared environment and need to use the thread in order to update a collection, due to the time out restraints. I would like to be able to access the status of the thread so that I know if it completed. In this, very cool, example I never see the final status. It may be naive but how can I wait that last second to see the final status?

Ben Nadel Nov 15, 2009 at 10:55 PM

@Kevin,

I am sorry, I am not sure what your question is exactly. What are you trying to do with the collection?

Keivn Nov 16, 2009 at 3:03 PM

I am refreshing the collection after adding new files.
The question is about receiving a status of "Completed" from the thread, or why I am not. I included the status as a part of the information retrieved by the get_Threads.cfm page. Everything works fine but I never receive a status of "completed" just that it is "running". I am sure I am missing something here. Perhaps the reference is killed before I get the "completed" status.

Ben Nadel Nov 16, 2009 at 3:12 PM

@Kevin,

In my example, after the thread finishes executing, it deletes its name from the Application-cache; as such, you will never see that it finishes. It is either running OR it no longer exists (which implies that it has finished).

Keivn Nov 16, 2009 at 3:41 PM

Right, I just wanted that warm and fuzzy from it saying it finished normally. I do understand. I will just try to modify it to get the information. Thanks. This site rules by the way. I like all the great information.

Ben Nadel Nov 16, 2009 at 3:42 PM

@Kevin,

Thanks :) That is much appreciated.

JQ Feb 15, 2010 at 7:19 PM

hey Ben,
I just tried this example (CF8.01) and although it works, it only ever downloads the last file requested. All the threads fire off and seemingly "complete" but the only file actually written out is the last one in any list.

Anyone else see this?

Ben Nadel Feb 17, 2010 at 11:00 PM

@JQ,

Threads might be throwing errors (which won't be apparent in the page unless you wait for the threads to re-join). Try checking your logs or JOIN the threads and look at the thread output.

Erwin Mar 26, 2010 at 10:52 AM

@Ben:

since you stated that this COPIES the threads meta data does it mean that you will only get the threads last current status since it is a copy or a snapshot or is it a reference which points to the threads instantaneous current meta data which when the status is updated from WAITING TO COMPLETED will be reflected within the thread status?

Ben Nadel Mar 26, 2010 at 1:54 PM

@Erwin,

The Duplicate() should return a totally unique structure; as such, it is static, or rather, not automatically updated by the thread. So yes, it is the status of the thread at the time it was called.

Erwin Mar 26, 2010 at 2:02 PM

@Ben Nadel,

So how do I ensure that I catch the thread at a status that is either COMPLETED or TERMINATED?

Ben Nadel Mar 26, 2010 at 2:05 PM

@Erwin,

In this example, you don't really. If you look a few lines below the duplicate() call, you'll see that the Thread itself is deleting its own entry from the Application scope. So, the thread both stores and then deletes its own reference. It self-cleans.

Erwin Mar 26, 2010 at 2:43 PM

@Ben Nadel,

So while within a thread that is being spawned, can I check for THREAD.status eq "COMPLETED" or "TERMINATED"? In other words, when does the status get changed from "RUNNING"? Is it before the thread is killed or after? Because I cannot call

<cfset APPLICATION.Threads[ THREAD.Name ] = Duplicate(THREAD) /> after?

Ben Nadel Mar 26, 2010 at 2:54 PM

@Erwin,

The struct in the Application scope won't change. The only thing I am checking here is the existence vs. non-existence (indicating thread started vs. thread ended). I use the existence as the most meaningful property.

What are you trying to do specifically? Perhaps I can come up with a different demo that is more usable for a particular problem space?

Erwin Mar 26, 2010 at 3:11 PM

@Ben Nadel,

I have one thread which we will call Susie and another thread which we will call Lisa. I have an application variable, like in your example, which will be the order basket. Susie processes a lot of phone orders (requests) and places the finished order forms in the order basket, but I don't want Susie to place incomplete orders, or have Lisa start on a specific order for delivery, until Susie is complete with that specific order. But while Lisa is fulfilling the orders and delivering, Susie likes to pile them on top. Oh and sorry Ben none of them are kinky as they are all about business...lol. So what is the best approach?

Erwin Mar 26, 2010 at 4:21 PM

@Ben Nadel,

<cfargument name="wsXmlString" required="Yes" type="string">
<cfargument name="wsfunctionName" required="Yes" type="string">

<cfset variables.wsMemid = 0>
<cfset variables.strlogoutcall = 0>
<cfset variables.wsfunctionName = arguments.wsfunctionName>
<cfset variables.wsXmlString = arguments.wsXmlString>

<cfset variables.strlogoutcall = logonesiteOutcall(variables.wsMemid,variables.wsfunctionName,variables.wsXmlString,"RemoteCall" )>


<cfset variables.strlogoutcall = QueryToStruct(variables.strlogoutcall)>
<cfset variables.strlogoutcall = variables.strlogoutcall[1].onesitecalllogid>


<cfset getstr = ConvertXmlToStruct(trim(arguments.wsXmlString),structnew())>


<cfset processvar = "Process"& variables.wsfunctionName >
<cfset cprocess = evaluate("#processvar#(getstr,variables.strlogoutcall)")>


<cfreturn variables.strlogoutcall>

</cftry>
</cffunction>

This is what i am trying to do.

Step 1. When the function is called remotely, I log the call into the DB.
Step 2. I get the return from the DB and get the identifier that was returned from the DB.
Step 3. I convert the huge XML that was sent to this function to a struct.
Step 4. I pass the XML structure that was returned from step 3 above to a dynamic function to go process the XML, etc. (This process takes between 4 and 8 seconds.)
Step 5. I return the log ID that was returned in step 2 back to the caller.

what i want to achieve is i want the process to run from step 1 - 5 , but i don't want the initial thread
from this to wait for step 4 to finish before return the log id in step 5.

So I was thinking of wrapping a cfthread around step 4, so that the initial thread that was spawned into
this function does not wait on the new cfthread that spawned for step4.

So to the initial thread it looks like the call procession is 1,2,3,5 even though step 4 will be called, but a new thread will be spawned

Do you get my drift?

Erwin Mar 26, 2010 at 4:23 PM

@Ben Nadel,

Sorry I forgot to paste this at the beginning:

Ben Nadel Apr 21, 2010 at 10:00 AM

@ErwinB,

I definitely get your drift. I think that makes total sense - putting CFThread around step 4. Are you having success with this approach? Sorry, I wasn't sure if this was related to the comments before about the order basket.

Gary Stanton May 13, 2010 at 7:07 AM

Thanks for this - it seems like a bit of a hack to store the data in the application scope, but damn if it doesn't work a charm.
I had a massive timing issue with posting data to a payment gateway and this technique solved the problem.

The thing is, I can't see why this functionality isn't built into cfthread... I'd expect there to be an explicit thread scope that can be accessed from any page to check if threads are running. Seems like quite an oversight really.
Ahh well, this'll do for now... thanks again.

Ben Nadel May 13, 2010 at 6:08 PM

@Gary,

It's an interesting idea to have some sort of built-in, centralized thread aggregation. I could easily see something like CF9's cache methods in the thread area. Sorry, that doesn't make any sense - what I mean is that you can get the list of all cache IDs in CF9 - cacheGetAllIDs(). Would be cool to have something like that for threads:

threadGetAllRunning()

... that would return an array of all still-running threads.

Raymond Camden May 13, 2010 at 6:10 PM

You can. Use the cfthread scope. Well, it will return ALL threads, not just the ones running, but you can filter it.

Ben Nadel May 13, 2010 at 6:14 PM

@Raymond,

CFThread is definitely good when you're on the same page. Just doing some brain-storming about a way to reference threads generated on different pages. I think it's a very outlier type use-case; but, it could be useful.

Raymond Camden May 13, 2010 at 9:39 PM

Heh, I have to say - I have no idea what the rest of this blog entry is - just saw the comment come in via email. ;)

Ben Nadel May 13, 2010 at 9:42 PM

@Raymond,

No worries my friend - an old post that got bumped up again.

Phil Mar 9, 2012 at 2:47 AM

Hi Ben great article and yes this is several years late but hey that's the timeless beauty of the internet ;)

I'm also running into a situation currently where it would be great to have some kind of THREAD scope i could use to access all running threads and then reuse them.

I've got a system based heavily on 3rd party XML synchronisations of traveller profiles, and there can be up to 10 concurrent threads running all sending Profiles to the external system at a rate of around 120 per minute (10 x 120 = total throughput of profiles per minute). These syncs can be fired from either a scheduled process that runs every 15 mins, or by database triggers or direct page requests within the CF app. It would be great to leave these 10 Named threads always open and dedicated to the profile syncs to optimise resources, however I can't see anyway to do that currently due to page requests coming from all over the system. Currently i just fire off new threads for each sync no matter where it happens in the system which can cause up to 30-40 threads to be created in peak times.

I might try and play around with your Application struct example Ben and see if I can't manage a set of 10 named threads and try and sleep them upon completion of a given task and reuse them when needed.

Andrea Mar 26, 2014 at 5:19 AM

@JQ (Feb 15, 2010),

That was an old comment of yours, but in case you are still wondering I saw this, too, also with CF8.01 (yes, I'm still stuck with that version!). I commented on the same issue in Ben's previous post of the series (Learning Coldfusion 8: CFThread Part III - Set It And Forget It).

The problem is that all threads use the same strURL variable from the parent code, but this gets updated during the loop that creates the threads themselves, so that in fact by the time the threads actually run they all see the last value of it, download the picture from the same URL and give it the same filename.

The solution is to pass the url as a cfthread attribute, like in:

cfthread (...) myUrl="#strURL#"

and use this attribute inside the thread:

cfhttp url="#attributes.myURL#" (...) file="#GetFileFromPath( attributes.myURL )#"

This way you are saving and using the value strURL had at the moment the thread was created, and not the one it has when it runs.