Ben Nadel

Learning ColdFusion 8: CFThread Part IV - Cross-Page Thread References

By Ben Nadel on
Tags: ColdFusion

Up until now, we have been examining ColdFusion 8's CFThread tag in the context of a single page or in conjunction with a "set it and forget it" scenario. Now, let's take a look at referencing long running threads across page requests. Remember, since the child thread launched by CFThread may outlive the processing time of its parent, we will have the opportunity to reference a thread that was launched by a previous page request.

To play around with this, we are going to modify our previous Flickr.com photo download demo to use some AJAX. Now, instead of just displaying the "photos are downloading" message to the user on the confirmation page, we are going to output the status of each photo thread as it updates. This means that after the parent page (the confirmation page) has finished processing, we are going to be referencing threads launched by a previous page request. Very exciting.

Before I get into the code, I want to take a second to talk about the THREAD scope. In the demo below, we are telling each CFThread-launched thread to both store itself and then remove itself from the APPLICATION.Threads structure. In doing so, you will notice that I am actually duplicating the THREAD scope within each CFThread tag:

  • <!--- Store thread reference. --->
  • <cfset APPLICATION.Threads[ THREAD.Name ] = Duplicate(
  • THREAD
  • ) />

Calling ColdFusion's Duplicate() here is confusing and weird, but absolutely essential. It has to do with the nature of the THREAD scope. The THREAD scope is a new form of scope that we are not used to dealing with. If you dump out the Java class of the THREAD scope, you will see that it is:

coldfusion.thread.ThreadScope

Now, I don't actually know anything about this scope, but from my experience, most references that read from it (rather than set a value on it) result in a NULL value. Therefore, in the example above, if we tried to store the THREAD scope directly into the APPLICATION.Threads struct, we would get this in the APPLICATION.Threads struct dump:

undefined struct element

This is because THREAD will store as a NULL value. To demonstrate this without application-level caching, take a look at a VARIABLES-scoped reference to a thread:

  • <!--- Run a thread. --->
  • <cfthread
  • action="run"
  • name="ThreadOne">
  •  
  • <!---
  • We don't need to do anything in this thread,
  • we just need to know that it was launched.
  • --->
  • <cfset THREAD.X = true />
  •  
  • </cfthread>
  •  
  •  
  • <!--- Wait for the thread to finish processing. --->
  • <cfthread
  • action="join"
  • name="ThreadOne"
  • />
  •  
  •  
  • <!--- Output the thread data. --->
  • <cfdump
  • var="#VARIABLES.ThreadOne#"
  • />

When we try to run that page, we get the ColdFusion error:

Element THREADONE is undefined in VARIABLES.

I expected the thread, ThreadOne, to be available in the VARIABLES scope (since ThreadOne can be referenced without scoping). But if we CFDump out the VARIABLES scope, all we get is a crazy-looking user defined function that is called:

_CFFUNCCFTHREAD_CFTEST22ECFM5059090411

What the hell is that? I'll tell you what it is - it's a clear demonstration that the THREAD scope is a very special beast.

So, going back to the first example above, when we duplicate the THREAD scope, we are actually converting the THREAD scope into a standard struct representation of its meta data. This will give us an object with a familiar Java type:

coldfusion.runtime.Struct

Doing this, we can now pass around a copy of the thread data that we can actually reference. But this does not mean we have access to the thread itself - just that we have a copy of its meta data.

That being said, let's get back to the demo at hand. It has two parts: the photo download page and then a page that will grab the cached thread data structs and return their status. Here is our modified photo download page:

  • <!--- Kill extra output. --->
  • <cfsilent>
  •  
  • <!---
  • Param the FORM variable that will hold our photo urls.
  • Remember, each URL is on its own line (separated by
  • line returns).
  • --->
  • <cfparam
  • name="FORM.photo_url"
  • type="string"
  • default=""
  • />
  •  
  •  
  • <!--- Trim the form field. --->
  • <cfset FORM.photo_url = FORM.photo_url.Trim() />
  •  
  •  
  • <!---
  • Check to see if the form has been submitted. For
  • this demo, we will know this if there is a value
  • in the form field.
  • --->
  • <cfif Len( FORM.photo_url )>
  •  
  • <!---
  • Loop over the URLs. We can treat the text area
  • as if it were a list of URLs that is using the
  • line break, line return as the list delimiter.
  • --->
  • <cfloop
  • index="strURL"
  • list="#FORM.photo_url#"
  • delimiters="#Chr( 13 )##Chr( 10 )#">
  •  
  • <!---
  • Now that we have our individual URL, let's
  • grab the photo binary using CFHttp and store
  • it directly into a file on the server.
  •  
  • We are going to launch this CFHttp call in a
  • new thread using CFThread. We are not going
  • to wait for this call to finish.
  • --->
  • <cfthread
  • action="run"
  • name="photo_#GetFileFromPath( strURL )#"
  • photoUrl="#strURL#">
  •  
  • <!---
  • Let the current thread save itself to the
  • threads struct in the application. While
  • there is a possible race condition here
  • (since other parts of the code will be
  • reading from this struct), for this demo it
  • is not going to be an issue.
  •  
  • Also, notice that we are not passing the
  • THREAD scope itself. This is crucial as the
  • THREAD scope is a very special scope. By
  • duplicating it we are turning it into a
  • ColdFusion runtime struct (much better for
  • our purposes).
  • --->
  • <cfset APPLICATION.Threads[ THREAD.Name ] = Duplicate(
  • THREAD
  • ) />
  •  
  •  
  • <!---
  • Save the photo. The URL is passed in as a thread
  • attribute (photoUrl) because the parent loop keeps
  • changing strURL while the threads are running.
  • --->
  • <cfhttp
  • url="#ATTRIBUTES.photoUrl#"
  • method="GET"
  • getasbinary="yes"
  • path="#ExpandPath( './data/' )#"
  • file="#GetFileFromPath( ATTRIBUTES.photoUrl )#"
  • />
  •  
  •  
  • <!---
  • Now that the thread has finished running,
  • have this thread remove itself from the
  • application threads struct.
  • --->
  • <cfset StructDelete(
  • APPLICATION.Threads,
  • THREAD.Name
  • ) />
  •  
  • </cfthread>
  •  
  • </cfloop>
  •  
  • </cfif>
  •  
  • </cfsilent>
  •  
  • <cfoutput>
  •  
  • <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
  • <html>
  • <head>
  • <title>ColdFusion 8 - CFThread Demo</title>
  •  
  • <!--- Include the jQuery scripts. --->
  • <script
  • type="text/javascript"
  • src="jquery-latest.pack.js">
  • </script>
  • <script type="text/javascript">
  •  
  • // Load the thread activity via jQuery's
  • // AJAX functionality. This will load the
  • // returned value into the innerHTML.
  • function UpdateThreadActivity(){
  •  
  • $( "##threadactivity" ).load(
  • "./get_threads.cfm",
  • {},
  • function(){
  • setTimeout(
  • UpdateThreadActivity,
  • 250
  • );
  • }
  • );
  •  
  • }
  •  
  •  
  • // When the document has loaded, start
  • // updating the thread activity.
  • $( UpdateThreadActivity );
  •  
  • </script>
  • </head>
  • <body>
  •  
  • <h2>
  • Flickr.com Photo Download
  • </h2>
  •  
  •  
  • <!---
  • Check to see if the form has been submitted. For
  • this demo, we will know this if there is a value
  • in the form field.
  • --->
  • <cfif NOT Len( FORM.photo_url )>
  •  
  • <p>
  • Please enter photo URLs that you would like to
  • download. Each URL should be on a single line of
  • the following text area.
  • </p>
  •  
  • <form
  • action="#CGI.script_name#"
  • method="post">
  •  
  • <p>
  • <textarea
  • name="photo_url"
  • cols="70"
  • rows="20"
  • >#FORM.photo_url#</textarea>
  • </p>
  •  
  • <p>
  • <input type="submit" value="Download Now" />
  • </p>
  •  
  • </form>
  •  
  • <cfelse>
  •  
  • <p>
  • Your photos are being downloaded right now:
  • </p>
  •  
  • <!---
  • This part will be updated via jQuery and
  • some nice AJAX requests (see HTML head).
  • --->
  • <p id="threadactivity">
  • Gathering thread activity...
  • </p>
  •  
  • </cfif>
  •  
  • </body>
  • </html>
  •  
  • </cfoutput>

Notice that each CFThread body starts out by caching itself (by name) in the APPLICATION.Threads struct. Then, as it finishes processing, it removes itself (by name) from the same struct. Technically, this is a place where we might be concerned about race conditions, but for this demo, it will not matter. Also notice in place of the "photos are downloading" message, we now have a P tag that is being updated using jQuery and some simple innerHTML-oriented AJAX.
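If the race condition mentioned above did matter, one option would be to wrap the struct writes in a named lock. The following is only a sketch: the lock name "photo-threads" and the 5-second timeout are assumptions made for illustration and are not part of the demo.

```cfml
<!--- Inside the CFThread body: register the thread under an exclusive lock. --->
<cflock name="photo-threads" type="exclusive" timeout="5">
	<cfset APPLICATION.Threads[ THREAD.Name ] = Duplicate( THREAD ) />
</cflock>

<!--- ... the CFHttp download goes here, outside of any lock ... --->

<!--- Remove the entry under the same (hypothetical) named lock. --->
<cflock name="photo-threads" type="exclusive" timeout="5">
	<cfset StructDelete( APPLICATION.Threads, THREAD.Name ) />
</cflock>
```

Keeping the download itself outside of the lock means the threads only serialize on the two quick struct operations, not on the slow CFHttp call.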

The page that gets called by the AJAX simply iterates over the APPLICATION.Threads meta data structs and outputs the thread data (to be consumed as innerHTML):

  • <!--- Kill extra output. --->
  • <cfsilent>
  •  
  • <!---
  • We are going to build up the thread activity HTML.
  • While I normally would return JSON data here, in
  • order to keep the demo as simple as possible (and
  • since AJAX is not the primary goal here), I am just
  • going to render the innerHTML.
  • --->
  • <cfsavecontent variable="strThreadData">
  • <cfoutput>
  •  
  •  
  • <!--- Check to see if there are any threads. --->
  • <cfif StructCount( APPLICATION.Threads )>
  •  
  • <!--- Loop over the active threads. --->
  • <cfloop
  • item="strName"
  • collection="#APPLICATION.Threads#">
  •  
  • <!---
  • Get a short hand reference to the thread.
  • These threads are going to be removing
  • themselves from the application, so in
  • order to minimize bad data references, get
  • the thread reference.
  •  
  • Once we have an independent reference to
  • the thread, it won't matter if it has been
  • removed from the APPLICATION scope.
  • --->
  • <cfset objThread = APPLICATION.Threads[ strName ] />
  •  
  • <!--- Output the thread data. --->
  • <strong>#objThread.Name#</strong><br />
  •  
  • .....
  •  
  • Start:
  • #TimeFormat(
  • objThread.StartTime,
  • "h:mm TT"
  • )#<br />
  •  
  • .....
  •  
  • Duration:
  • #NumberFormat(
  • ((Now() - objThread.StartTime) * 86400),
  • "0"
  • )#
  • seconds<br />
  •  
  • </cfloop>
  •  
  • <cfelse>
  •  
  • <em>There are no active threads.</em>
  •  
  • </cfif>
  •  
  • </cfoutput>
  • </cfsavecontent>
  •  
  •  
  • <!--- Output the thread innerHTML. --->
  • <cfcontent
  • type="text/html"
  • variable="#ToBinary( ToBase64( strThreadData ) )#"
  • />
  •  
  • </cfsilent>

Again, this is a place where we would have to consider race conditions (since we are iterating over a structure that is being modified by parallel threads), but for this demo, I am not going to worry about it. In order to minimize the chance of bad references, I get a short-hand pointer to the thread meta data struct (rather than referencing it through the APPLICATION.Threads struct). Therefore, even if the struct does get removed, we will still have a valid pointer to it.
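If you did want to guard the read side, one possible approach is to copy the struct under a read-only lock and then loop over the copy. This is a sketch only. It assumes that the threads take an exclusive lock with the same name around their own writes and deletes, and the lock name "photo-threads" and the variable objThreads are hypothetical.

```cfml
<!--- Take a shallow snapshot of the struct under a read-only lock. --->
<cflock name="photo-threads" type="readonly" timeout="5">
	<cfset objThreads = StructCopy( APPLICATION.Threads ) />
</cflock>

<!--- Loop over the snapshot; threads removing themselves cannot change it. --->
<cfloop
	item="strName"
	collection="#objThreads#">

	<cfset objThread = objThreads[ strName ] />

	<!--- ... output objThread.Name and objThread.StartTime as before ... --->

</cfloop>
```

StructCopy() only copies the top level of the struct, which is enough here because the nested meta data structs are never modified after they are stored.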

Running the above code to download the following photos:

  • http://farm2.static.flickr.com/1173/527940862_96ba93b86f_o.jpg
  • http://farm2.static.flickr.com/1130/527500500_3c23ffa31f_o.jpg
  • http://farm2.static.flickr.com/1139/525342690_64bfca4370_o.jpg
  • http://farm1.static.flickr.com/102/301096227_7dccd6ab2d_b.jpg
  • http://farm1.static.flickr.com/186/463620314_01395d9ac3_b.jpg
  • http://farm1.static.flickr.com/198/508846122_18632f9acf_o.jpg
  • http://farm1.static.flickr.com/192/482937551_7cbb42872e_b.jpg

... we get the following output:

Your photos are being downloaded right now:

PHOTO_527940862_96BA93B86F_O.JPG
..... Start: 6:30 PM
..... Duration: 1 seconds
PHOTO_508846122_18632F9ACF_O.JPG
..... Start: 6:30 PM
..... Duration: 1 seconds
PHOTO_482937551_7CBB42872E_B.JPG
..... Start: 6:30 PM
..... Duration: 1 seconds

... and then a split second later, after an AJAX update, we get the following update:

Your photos are being downloaded right now:

PHOTO_508846122_18632F9ACF_O.JPG
..... Start: 6:30 PM
..... Duration: 2 seconds

With each successive AJAX call, more of the threads are removing themselves from the APPLICATION.Threads struct. Pretty snazzy, eh? Ok, so we are not technically referencing threads across different page requests, but based on my THREAD-referencing experiments, it seems that this might be the best way to go. Of course, I am just learning here, so it might be that these threads are accessible by name through some other way (but I do not see anything about it in the documentation). At the very least, since we are allowing threads to update their own meta data references, we are tying the meta data copy to the thread across pages.

When it comes to the data in the APPLICATION.Threads struct, remember that it is a duplicate of the thread meta data - it is not actually the thread meta data as it is contained in the running thread. This means that the thread will not update this data as it processes (i.e., the Status attribute will never get updated automatically). But, for our purposes, and I assume most purposes, knowing the name and the StartTime will be sufficient.
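If you do need to see how a thread ended, the thread can write its outcome into its own copy by hand instead of deleting the entry. The following is a rough sketch under several assumptions: the photoUrl attribute, the "FAILED" label and the ErrorMessage key are all made up for illustration, and something else would then have to clean old entries out of APPLICATION.Threads.

```cfml
<cfthread
	action="run"
	name="photo_#GetFileFromPath( strURL )#"
	photoUrl="#strURL#">

	<!--- Store a plain-struct copy of the thread meta data. --->
	<cfset APPLICATION.Threads[ THREAD.Name ] = Duplicate( THREAD ) />

	<cftry>

		<cfhttp
			url="#ATTRIBUTES.photoUrl#"
			method="GET"
			getasbinary="yes"
			path="#ExpandPath( './data/' )#"
			file="#GetFileFromPath( ATTRIBUTES.photoUrl )#"
			/>

		<!--- The copy never updates itself, so record the outcome manually. --->
		<cfset APPLICATION.Threads[ THREAD.Name ].Status = "COMPLETED" />

		<cfcatch>
			<!--- "FAILED" and ErrorMessage are hypothetical, not built-in values. --->
			<cfset APPLICATION.Threads[ THREAD.Name ].Status = "FAILED" />
			<cfset APPLICATION.Threads[ THREAD.Name ].ErrorMessage = CFCATCH.Message />
		</cfcatch>

	</cftry>

</cfthread>
```

With this variation, the status page would show each thread as RUNNING (the value captured by the copy) until the thread overwrites it, rather than having the entry simply disappear.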




Reader Comments

So, since the application struct is only a copy of the actual threads at the time of their creation, I assume you wouldn't be able to view error information, or terminate the threads, etc. Is there no way to reference the thread directly after it has been kicked off and the request which created it has finished?

Reply to this Comment

Ben,
Thread scopes are not kept in the variable scope and hence you don't see it when you dump the variable scope.
I know you must have figured it out but I am just re-iterating. We don't really recommend sharing the thread scopes across requests, as this can lead to thread-safety problems with the data.
Rupesh.

Reply to this Comment

@Brad,

Unless the running thread itself updates that info, yes, it will not be available since the cached struct is only a copy.

@Ray,

Are you talking about using Evaluate() on a different thread?

@Rupesh,

I know that when you refer to an unscoped variable, ColdFusion will start searching for it in an orderly fashion through many different scopes (ex. query output, function local, arguments, page variables, etc.). From that I would gather that threads are stored in some scope that is being searched. Is that the case? Or are threads a totally new beast that is a very special implementation?

As for "Best practices", I agree, cross-page references are going to get very hairy, very fast. I would not recommend trying to do it. In this case, however, since I am only ever referencing a copy of the meta data, I feel that it is not as bad as it might sound. Of course, if the thread crashes and cannot remove itself from the app scope, clearly this can become very corrupt, very fast :)

I think it is definitely playing with fire no matter how you look at it, but I think if done correctly / carefully it could have some interesting potential.

Reply to this Comment

The UDF in the variables scope is the actual function that is run in the thread i.e. the code between the cfthread and /cfthread tags are turned in to a UDF and executed in a thread.

Reply to this Comment

That's correct. Threads are stored in a special scope which is actually a request-level scope and is searched when you access any unscoped variable.

Reply to this Comment

That is good to know. I figure I wouldn't use this stuff like this too often. I am comfortable with a thread setting an application-level variable and then destroying it before it finished executing. Of course, if anything in the thread broke then you would have rogue variables that never get deleted. Clearly, not a fabulous idea, but a cool experiment.

Reply to this Comment

I am running in a shared environment and need to use the thread in order to update a collection, due to the time out restraints. I would like to be able to access the status of the thread so that I know if it completed. In this, very cool, example I never see the final status. It may be naive but how can I wait that last second to see the final status?

Reply to this Comment

@Kevin,

I am sorry, I am not sure what your question is, exactly. What are you trying to do with the collection?

Reply to this Comment

I am refreshing the collection after adding new files.
The question is about receiving a status of "Completed" from the thread, or why I am not. I included the status as a part of the information retrieved by the get_Threads.cfm page. Everything works fine but I never receive a status of "completed" just that it is "running". I am sure I am missing something here. Perhaps the reference is killed before I get the "completed" status.

Reply to this Comment

@Kevin,

In my example, after the thread finishes executing, it deletes its name from the Application-cache; as such, you will never see that it finishes. It is either running OR it no longer exists (which implies that it has finished).

Reply to this Comment

Right, I just wanted that warm and fuzzy from it saying it finished normally. I do understand. I will just try to modify it to get the information. Thanks. This site rules by the way. I like all the great information.

Reply to this Comment

hey Ben,
I just tried this example (CF8.01) and although it works, it only ever downloads the last file requested. All the threads fire off and seemingly "complete" but the only file actually written out is the last one in any list.

Anyone else see this?

Reply to this Comment

@JQ,

Threads might be throwing errors (which won't be apparent in the page unless you wait for the threads to re-join). Try checking your logs or JOIN the threads and look at the thread output.

Reply to this Comment

@Ben:

<cfset APPLICATION.Threads[ THREAD.Name ] = Duplicate(
THREAD
) />

Since you stated that this COPIES the thread's meta data, does that mean the struct only holds a snapshot of the thread's status at the time of the copy? Or is it a reference that points to the thread's live meta data, so that a status change from WAITING to COMPLETED would be reflected in it?

Reply to this Comment

@Erwin,

The Duplicate() should return a totally unique structure; as such, it is static, or rather, not automatically updated by the thread. So yes, it is the status of the thread at the time it was called.

Reply to this Comment

@Ben Nadel,

So how do I ensure that I catch the thread in a status that is either COMPLETED or TERMINATED?

Reply to this Comment

@Erwin,

In this example, you don't really. If you look a few lines below the duplicate() call, you'll see that the Thread itself is deleting its own entry from the Application scope. So, the thread both stores and then deletes its own reference. It self-cleans.

Reply to this Comment

@Ben Nadel,

So, within a thread that is being spawned, can I check for THREAD.status eq "COMPLETED" or "TERMINATED"? In other words, when does the status get changed from "RUNNING"? Is it before the thread is killed or after? Because I cannot call

<cfset APPLICATION.Threads[ THREAD.Name ] = Duplicate(THREAD) /> after?

Reply to this Comment

@Erwin,

The struct in the Application scope won't change. The only thing I am checking here is the existence vs. non-existence (indicating thread started vs. thread ended). I use the existence as the most meaningful property.

What are you trying to do specifically? Perhaps I can come up with a different demo that is more usable for a particular problem space?

Reply to this Comment

@Ben Nadel,

I have one thread which we will call Susie and another thread which we will call Lisa. I have an application variable, like in your example, which will be the order basket. Susie processes a lot of phone orders (requests) and places the finished order forms in the order basket, but I don't want Susie to place incomplete orders, or have Lisa start on a specific order for delivery, until Susie is complete with that order. But while Lisa is fulfilling and delivering the orders, Susie likes to pile them on top. Oh and sorry Ben none of them are kinky as they are all about business...lol. So what is the best approach?

Reply to this Comment

@Ben Nadel,

<cfargument name="wsXmlString" required="Yes" type="string">
<cfargument name="wsfunctionName" required="Yes" type="string">

<cfset variables.wsMemid = 0>
<cfset variables.strlogoutcall = 0>
<cfset variables.wsfunctionName = arguments.wsfunctionName>
<cfset variables.wsXmlString = arguments.wsXmlString>
<!--- 1. Log xml string in DB--->
<cfset variables.strlogoutcall = logonesiteOutcall(variables.wsMemid,variables.wsfunctionName,variables.wsXmlString,"RemoteCall" )>

<!--- 2. This DB returns a Query object, and i used a QuerytoStruct function to get the returned value --->
<cfset variables.strlogoutcall = QueryToStruct(variables.strlogoutcall)>
<cfset variables.strlogoutcall = variables.strlogoutcall[1].onesitecalllogid>

<!--- 3. Now I convert the huge XML to a structure so I can easily work with it --->
<cfset getstr = ConvertXmlToStruct(trim(arguments.wsXmlString),structnew())>

<!--- 4. Now I dynamically call another function that processes data based on the function name that is called --->
<cfset processvar = "Process"& variables.wsfunctionName >
<cfset cprocess = evaluate("#processvar#(getstr,variables.strlogoutcall)")>

<!---step 5. Return a logid from step 2 back to the caller --->
<cfreturn variables.strlogoutcall>

</cftry>
</cffunction>


This is what i am trying to do.

step 1 . When the function is called remotely, I log the call into the DB
Steps2, I get the return from the DB and get the identifier that was returned from the DB.
Step 3, I convert the huge XML that was sent to this function to a struct
Step 4. I pass the xml structure that was returned from 3 above to a dynamic function to go process the XMl etc . (This process takes btw 4-8 seconds)
step 5. I return the logid that was returned in step 2 back to the caller.


what i want to achieve is i want the process to run from step 1 - 5 , but i don't want the initial thread
from this to wait for step 4 to finish before return the log id in step 5.

So i was thinking of wraping a cfthread around step 4, so that the initial thread that was spawned into
this function does not wait on the new cfthread that spawned for step4.

So to the initial thread it looks like the call procession is 1,2,3,5 even though step 4 will be called, but a new thread will be spawned

Do you get my drift?

Reply to this Comment

@Ben Nadel,

Sorry I forgot to paste this at the beginning:

<cffunction name="ProcessAll" returntype="numeric" access="remote" output = "Yes" hint="This is used to ProcessUpdate Executive Specific Profile">

Reply to this Comment

@ErwinB,

I definitely get your drift. I think that makes total sense - putting CFThread around step 4. Are you having success with this approach? Sorry, I wasn't sure if this was related to the comments before about the order basket.

Reply to this Comment

Thanks for this - it seems like a bit of a hack to store the data in the application scope, but damn if it doesn't work a charm.
I had a massive timing issue with posting data to a payment gateway and this technique solved the problem.

The thing is, I can't see why this functionality isn't built into cfthread... I'd expect there to be an explicit thread scope that can be accessed from any page to check if threads are running. Seems like quite an oversight really.
Ahh well, this'll do for now... thanks again.

Reply to this Comment

@Gary,

It's an interesting idea to have some sort of built-in, centralized thread aggregation. I could easily see something like CF9's cache methods in the thread area. Sorry, that doesn't make any sense - what I mean is that you can get the list of all cache IDs in CF9 - cacheGetAllIDs(). Would be cool to have something like that for threads:

threadGetAllRunning()

... that would return an array of all still-running threads.

Reply to this Comment

@Raymond,

CFThread is definitely good when you're on the same page. Just doing some brain-storming about a way to reference threads generated on different pages. I think it's a very outlier type use-case; but, it could be useful.

Reply to this Comment

Hi Ben great article and yes this is several years late but hey that's the timeless beauty of the internet ;)

I'm also running into a situation currently where it would be great to have some kind of THREAD scope i could use to access all running threads and then reuse them.

I've got a system based heavily on 3rd party XML synchronisations of traveller profiles, and there can be up to 10 concurrent threads running all sending Profiles to the external system at a rate of around 120 per minute (10 x 120 = total throughput of profiles per minute). These syncs can be fired from either a scheduled process that runs every 15 mins, or by database triggers or direct page requests within the CF app. It would be great to leave these 10 Named threads always open and dedicated to the profile syncs to optimise resources, however I can't see anyway to do that currently due to page requests coming from all over the system. Currently i just fire off new threads for each sync no matter where it happens in the system which can cause up to 30-40 threads to be created in peak times.

I might try and play around with your Application struct example Ben and see if I can't manage a set of 10 named threads and try and sleep them upon completion of a given task and reuse them when needed.

Reply to this Comment

@JQ (Feb 15, 2010),

That was an old comment of yours, but in case you are still wondering I saw this, too, also with CF8.01 (yes, I'm still stuck with that version!). I commented on the same issue in Ben's previous post of the series (Learning Coldfusion 8: CFThread Part III - Set It And Forget It).

The problem is that all threads use the same strURL variable from the parent code, but this gets updated during the loop that creates the threads themselves, so that in fact by the time the threads actually run they all see the last value of it, download the picture from the same URL and give it the same filename.

The solution is to pass the url as a cfthread attribute, like in:

cfthread (...) myUrl="#strURL#"

and use this attribute inside the thread:

cfhttp url="#attributes.myURL#" (...) file="#GetFileFromPath( attributes.myURL )#"

This way you are saving and using the value strURL had at the moment the thread was created, and not the one it has when it runs.

Reply to this Comment
