Learning ColdFusion 8: CFThread Part II - Parallel Threads

Posted June 3, 2007 at 8:32 PM

Tags: ColdFusion

Now that we have covered the basics of sending data into and getting data out of ColdFusion 8's new CFThread-launched processing threads, let's examine some places where they can be used. For this post, we are going to concentrate on using CFThread to speed up page processing even when we need all threads to finish within the same request.

Oftentimes, while there is a single, overall goal for a page request, that page request is divided up into chunks of code that may be run independently. Take, for example, grabbing search results from Google.com. Imagine we wanted to grab the first 1,000 search results for his buffness, Vin Diesel, using ColdFusion's CFHttp tag. Google only allows you to grab a maximum of 100 results in a single request, so in order to get 1,000 results, we have to make 10 separate CFHttp requests, each grabbing the next 100 results.

Now, each of those 100 results relates to the overall goal of the page, but does one set of 100 really have anything to do with the next set of 100? Sure, they have to be in some sort of order, but would it even matter which order we made our requests in, so long as the final results were the same?

Absolutely not. But, using traditional ColdFusion code, we have no other option but to make one CFHttp request and then wait for it to finish before making our next CFHttp request. By the very nature of our single-threaded request (well, technically multi-threaded request since CFHttp fires a new process), each CFHttp call is directly tied to the next via processing availability.

This synchronous processing is not as fast as it could be. Take a look at this traditional CFHttp code:


    <!---
        Build the base URL for the results. This will
        include everything but the start index. We are
        going to be screen-scraping Google for some
        search results.
    --->
    <cfset strBaseURL = (
        "http://www.google.com/search?" &
        "q=Vin+Diesel" &
        "&num=100" &
        "&start="
        ) />


    <!---
        Method One: Traditional Synchronous CFHttp calls.
        This methodology requires that ColdFusion run
        each CFHttp on its own and then wait for it to
        finish before firing off the next one.
    --->


    <!--- Get the starting time. --->
    <cfset intStartTime = GetTickCount() />

    <!---
        Let's get the first 1000 results for Vin Diesel.
        In order to do this, we are going to grab 10 sets
        of 100 results.
    --->
    <cfloop
        index="intGet"
        from="1"
        to="10"
        step="1">

        <cfhttp
            method="GET"
            url="#strBaseURL##((intGet - 1) * 100)#"
            useragent="#CGI.http_user_agent#"
            result="objGet#intGet#"
            />

    </cfloop>


    <!---
        Output retrieval times. The CFOutput tag is needed
        so that the NumberFormat() expression is evaluated.
    --->
    <cfoutput>
        <p>
            We Got 1000 Results in
            #NumberFormat(
                ((GetTickCount() - intStartTime) / 1000),
                ",.00"
                )#
            seconds using standard CFHttp
        </p>
    </cfoutput>

If we run the above code a few times, we get the output:

We Got 1000 Results in 4.83 seconds using standard CFHttp
We Got 1000 Results in 4.92 seconds using standard CFHttp
We Got 1000 Results in 12.97 seconds using standard CFHttp
We Got 1000 Results in 4.11 seconds using standard CFHttp
We Got 1000 Results in 7.11 seconds using standard CFHttp

As you can see, the total page processing time ranged from 4.11 seconds to almost 13 seconds.

Now that ColdFusion 8 has introduced the new CFThread tag, we can break free of our single-threaded mind-set. No longer does independent code have to wait for other parts of the code to finish processing (as in our above example). In this next example, we are going to wrap each CFHttp call inside of its own CFThread tag. This will allow ColdFusion to launch a new, asynchronous thread for each set of 100 results from Google.com:


    <!---
        Build the base URL for the results. This will
        include everything but the start index. We are
        going to be screen-scraping Google for some
        search results.
    --->
    <cfset strBaseURL = (
        "http://www.google.com/search?" &
        "q=Vin+Diesel" &
        "&num=100" &
        "&start="
        ) />


    <!---
        Method Two: Asynchronous parallel thread CFHttp
        calls. This methodology leverages ColdFusion 8's
        new CFThread tag to fire parallel CFHttp calls.
    --->


    <!--- Get the starting time. --->
    <cfset intStartTime = GetTickCount() />

    <!---
        Let's get the first 1000 results for Vin Diesel.
        In order to do this, we are going to grab 10 sets
        of 100 results, but this time each grab is going
        to be done in its own thread.
    --->
    <cfloop
        index="intGet"
        from="1"
        to="10"
        step="1">

        <!---
            Start a new thread for this CFHttp call. The loop
            index is passed in as a thread attribute so that
            each thread works with its own copy, even if the
            loop has moved on by the time the thread runs.
        --->
        <cfthread
            action="run"
            name="objGet#intGet#"
            intIndex="#intGet#">

            <cfhttp
                method="GET"
                url="#strBaseURL##((ATTRIBUTES.intIndex - 1) * 100)#"
                useragent="#CGI.http_user_agent#"
                result="THREAD.Get#ATTRIBUTES.intIndex#"
                />

        </cfthread>

    </cfloop>


    <!---
        Now, we have to wait for all of the concurrent
        threads to be joined before we can use the
        CFHttp results.
    --->
    <cfloop
        index="intGet"
        from="1"
        to="10"
        step="1">

        <cfthread
            action="join"
            name="objGet#intGet#"
            />

    </cfloop>


    <!---
        Output retrieval times. The CFOutput tag is needed
        so that the NumberFormat() expression is evaluated.
    --->
    <cfoutput>
        <p>
            We Got 1000 Results in
            #NumberFormat(
                ((GetTickCount() - intStartTime) / 1000),
                ",.00"
                )#
            seconds using CFHttp and CFThread
        </p>
    </cfoutput>

Running the above code a few times, we get the output:

We Got 1000 Results in 0.79 seconds using CFHttp and CFThread
We Got 1000 Results in 0.69 seconds using CFHttp and CFThread
We Got 1000 Results in 0.72 seconds using CFHttp and CFThread
We Got 1000 Results in 0.63 seconds using CFHttp and CFThread
We Got 1000 Results in 3.44 seconds using CFHttp and CFThread

As you can see, the page processing time decreased dramatically, usually to less than a second in total. So what's with the 3.44 second entry? ColdFusion 8's new CFThread tag requests that a new thread be launched to handle this code; however, the ColdFusion application server does not have an unlimited number of threads at its disposal. Each CFThread tag requests a new thread, and that request is queued for processing. When a processing thread becomes available, it is handed the CFThread code to run asynchronously (also, in our case, processing time is directly tied to how quickly Google.com returns results).

This is very important to understand. Running parallel threads will only make your page run faster if parallel threads are available to be launched. If you have a server that is maxed out on page requests, wrapping code in CFThread might not have any effect at all (in that case, it might actually have a negative effect since the current page now has to wait for threads... but that is purely an uneducated hypothesis). However, since computers spend like 90% of their time waiting for user requests (at least that is what I hear about Personal Desktop Computers; probably not the same for web servers), it's more likely than not that running parallel threads using ColdFusion 8's CFThread will lead to dramatic page performance increases.

Also notice that after our CFHttp requests, we are explicitly requesting that the parent page wait for all the parallel threads to finish processing (and to join the page). Since these threads are all running in parallel, there is no guarantee that any one thread will have finished processing by the time the parent page reaches a certain line unless you explicitly wait for a named thread to finish. Mental note!
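To make the join step a bit more concrete, here is a minimal sketch of waiting on a set of threads and then checking each one before using its data. The thread names (objDemo1 to objDemo3), the intIndex attribute, and the intStart value are made up for this illustration, and the small worker threads are stand-ins for the CFHttp calls. It assumes the CFTHREAD scope, the Status and Error thread metadata keys, and the timeout attribute on the join action behave as documented for ColdFusion 8:

```cfml
<!--- Launch three small worker threads (stand-ins for the CFHttp calls). --->
<cfset lstThreads = "" />

<cfloop index="intGet" from="1" to="3" step="1">

    <cfthread
        action="run"
        name="objDemo#intGet#"
        intIndex="#intGet#">

        <!--- Store a value in the THREAD scope so the page can read it later. --->
        <cfset THREAD.intStart = ((ATTRIBUTES.intIndex - 1) * 100) />

    </cfthread>

    <!--- Remember the thread name so we can join on the whole list. --->
    <cfset lstThreads = ListAppend( lstThreads, "objDemo#intGet#" ) />

</cfloop>

<!--- Wait up to 10 seconds (timeout is in milliseconds) for all of them. --->
<cfthread
    action="join"
    name="#lstThreads#"
    timeout="10000"
    />

<!--- Only trust a thread's data if that thread actually completed. --->
<cfoutput>
    <cfloop list="#lstThreads#" index="strName">

        <cfset objThread = CFTHREAD[ strName ] />

        <p>
            #strName#: #objThread.Status#
            <cfif (objThread.Status EQ "COMPLETED")>
                (start index #objThread.intStart#)
            <cfelseif StructKeyExists( objThread, "Error" )>
                (failed: #objThread.Error.Message#)
            </cfif>
        </p>

    </cfloop>
</cfoutput>
```

Checking the status before reading thread data matters most when a join timeout is used, since the page may continue while a slow thread is still running.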

Just remember that since these threads are running in parallel (probably), you must be very careful about making cross-thread references. Unless you explicitly wait for one thread to finish, there is no guarantee that a value set in one thread will be available in another at any given time. And, as always, if you think a negative race condition might apply, please wrap variable access and modification code inside of CFLock tags.
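As a rough illustration of the CFLock advice, the sketch below has several threads update one shared page variable. The variable name (intCompleted), the thread names (objCount1 to objCount5), and the lock name (completedCounterLock) are all hypothetical; the point is only that the read-modify-write on shared data sits inside an exclusive named lock:

```cfml
<!--- A shared counter that every thread will update. --->
<cfset VARIABLES.intCompleted = 0 />
<cfset lstThreads = "" />

<cfloop index="intGet" from="1" to="5" step="1">

    <cfthread
        action="run"
        name="objCount#intGet#">

        <!---
            Incrementing is a read followed by a write, so two
            threads could otherwise overwrite each other's
            update. The exclusive lock lets one thread in at a time.
        --->
        <cflock
            name="completedCounterLock"
            type="exclusive"
            timeout="5">

            <cfset VARIABLES.intCompleted = (VARIABLES.intCompleted + 1) />

        </cflock>

    </cfthread>

    <cfset lstThreads = ListAppend( lstThreads, "objCount#intGet#" ) />

</cfloop>

<!--- Wait for every thread before reading the shared value. --->
<cfthread
    action="join"
    name="#lstThreads#"
    />

<cfoutput>
    <p>
        Threads that reported in: #VARIABLES.intCompleted#
    </p>
</cfoutput>
```

The page reads the counter only after the join, for the same reason discussed above: until then there is no guarantee that any given thread has done its work.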


Reader Comments

Jun 4, 2007 at 2:27 PM

@Ben:

Don't forget that the "name" attribute can take a list of threads to join. So in your loop code where you're creating the threads, you could append the name of the thread to a list and just do something like:

<cfthread
action="join"
name="#lstThreads#"
/>


Jun 4, 2007 at 2:28 PM

PS - I've also filed Enhanced #69430 with Adobe to allow the name attribute to be optional, so w/out a "name" attribute, it would just join all threads in the current page template together. :)


Jun 4, 2007 at 2:52 PM

@Dan,

Good catch. I forgot to mention that in my run-down. As for your enhancement, that would be awesome! I figure this is a use-case that will be used quite often and would help tremendously.


Jun 15, 2007 at 3:51 PM

Just in case you have not noticed, you can set the maximum number of cfthreads that can run in parallel on the server from the administrator. That is the max size of the thread pool dedicated to running cfthreads. By default it is set to 10.
Another point to note is that if you specify a large number for it, let's say 50, it does not mean all 50 threads will be running all the time. The pool is dynamic and adjusts according to the load. So at peak load, it will go up to 50, and when there is no load, it can drop down to 1.

Rupesh
Adobe CF Team.


Jun 15, 2007 at 5:53 PM

@Rupesh,

Thanks for pointing that out. Right now, I am doing most of my testing on HostMySite.com, so I don't have Admin access. I installed the Beta on my desktop at home, but something went screwy with the install, and neither the CFIDE nor the CFDOCS folder seems to have installed in the Coldfusion8 folder. I was running CF7 at the time, so I don't know if that messed it up. I will probably uninstall and re-install or just try to install again.

I can't wait to turn on my per-application-setting so that I can play with the app-specific mappings :)


Jun 25, 2009 at 11:09 AM

A quick note for anyone using cfthread on servers with high load running code which creates a large number of threads on multi-core processors...

Even if you increase the 'Maximum number of threads available for cfthread' in CF Administrator, your app may be throttled by JRun. If you're getting threads waiting for no apparent reason, bringing application performance to a crawl, you can try modifying your jrun.xml file.

Open it up and a little way down you'll see this line:
<service class="jrunx.scheduler.SchedulerService" name="SchedulerService">
Increase the numbers for activeHandlerThreads and minHandlerThreads (defaults are 25 & 20 respectively) to a much higher number, save the file, restart CF and try again.

If you get the same results from this change that we did, your application performance will increase massively.

Hope this saves someone all the trouble we've had! :)

George.


Jun 25, 2009 at 2:10 PM

@George,

Nice tip. I know nothing about messing with the JRUN, but let's just say that a massively popular site is a problem I'd like to have someday :)


Aug 24, 2009 at 8:42 AM

@George,

When you say "a much higher number", what kind of ballpark are we talking? And having made that change in jrun.xml, what did you then set the "Maximum number of threads available for cfthread" option to? Just match them up?

Peter


Aug 24, 2009 at 8:49 AM

@Peter,

I can't remember what we used for activeHandlerThreads and minHandlerThreads; the sysadmins deal with all this sort of stuff but I figured as it gave us such a performance gain I'd post it here. I know we started at around 1000 for each but it's no doubt been refined since we first discovered this setting.

The setting in CF admin is something you'll need to tweak yourself as it's going to depend on your hardware and a load of other factors. Try 50 and see how you get on, increasing it as necessary. Just watch the CPU monitor and memory usage so you don't completely kill your machines!

George.


Aug 24, 2009 at 9:23 AM

@George,

50 is what I'm trying now, so we'll see if it helps. I haven't touched the jrun.xml file at this point, but it is indeed set to the defaults of 25 & 20, so could be a limiting factor.

Will definitely keep an eye on it and tweak more if necessary :)

Thanks for the pointers.

Peter


Aug 24, 2009 at 9:29 AM

There is some good reading on this on Steven Erat's blog too I just found: http://www.talkingtree.com/blog/index.cfm?mode=entry&entry=942B6F54-45A6-2844-77AD4D08D7523481

Points out how *too* high could have negative effects.


Nov 24, 2009 at 9:10 PM

Ben, as always, thanks for the info. I also appreciate the content added by George.

I have a quick question on using the cfthread feature if you do not need to join the threads.

If you just fire off say 25 threads that are performing an action and inserting a record in the database, and the original file is a scheduled task that you have running daily, do you need to join the threads to close the connection?

I have a cfm page that is setup as a scheduled task that checks some monitoring counters and stores them in a transaction table. Using the threads really sped up the processing time, but I was wondering what the impact would be if I just let the threads do their thing. Any gains or losses from not joining the threads?

Thanks in advance!!


Jan 9, 2010 at 11:01 PM

@Richard,

You only ever need to join the threads IF you want to check their generated output or thread scopes (or if you are waiting to trigger subsequent work flows). If you don't care what the thread is doing, you can just let it run without joining.

