Creating A CFThread-Based Process In ColdFusion In Order To Learn About Concurrency

By Ben Nadel on December 23, 2010

As I have been going through my Seven Languages in Seven Weeks book by Bruce Tate, one of the things that keeps tripping me up is the concept of concurrency with processes. Scala day 3 was my first real taste of processes and it completely baffled me. By Erlang day 3, things started to come together a bit; but, all in all, the idea of processes, actors, and message passing is still a good deal beyond my grasp. As a thought experiment - something to help me wrap my head around the concept of a process - I thought I might try to build a fake process using ColdFusion's CFThread tag.

NOTE: This is just a personal exploration, not an instructional blog entry.

Before we look at this, it might be useful to understand the difference between a thread and a process. Since I don't fully understand the difference myself, I have copied an explanation by Alan Zeichick:

Both threads and processes are methods of parallelizing an application. However, processes are independent execution units that contain their own state information, use their own address spaces, and only interact with each other via interprocess communication mechanisms (generally managed by the operating system). Applications are typically divided into processes during the design phase, and a master process explicitly spawns sub-processes when it makes sense to logically separate significant application functionality. Processes, in other words, are an architectural construct.

By contrast, a thread is a coding construct that doesn't affect the architecture of an application. A single process might contain multiple threads; all threads within a process share the same state and same memory space, and can communicate with each other directly, because they share the same variables.

Threads typically are spawned for a short-term benefit that is usually visualized as a serial task, but which doesn't have to be performed in a linear manner (such as performing a complex mathematical computation using parallelism, or initializing a large matrix), and then are absorbed when no longer required. The scope of a thread is within a specific code module -- which is why we can bolt-on threading without affecting the broader application.

Because I am going to try to build a Process powered by CFThread, I am going to do a few things in order to help mimic the memory characteristics of a Process. For one, the CFThread tag will be encapsulated within a CFComponent; this way, its Variables scope will not overlap with the Variables scope of the parent page (i.e., the primary page). As a secondary effort, I'm also going to call duplicate() on any data that gets passed along with a message. This will perform a deep copy of the incoming data in order to ensure that it does not share any references with the calling context. The part of the process that responds to incoming messages can still reference shared, global memory spaces (e.g., Request, Application, Session); however, that will have to be done explicitly by the programmer.
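To make the duplicate() point a bit more concrete, here is a minimal sketch that contrasts a shared struct reference with a deep copy. The variable names are made up purely for illustration; they are not part of Process.cfc.

```cfml
<!--- Hypothetical data; these names are for illustration only. --->
<cfset original = {} />
<cfset original.tags = [ "a", "b" ] />

<!--- Structs are assigned by reference, so this is the SAME struct. --->
<cfset sharedReference = original />

<!--- duplicate() performs a deep copy; nothing is shared. --->
<cfset deepCopy = duplicate( original ) />

<!--- Mutate the original AFTER both "copies" were taken. --->
<cfset arrayAppend( original.tags, "c" ) />

<cfoutput>
	Shared reference sees #arrayLen( sharedReference.tags )# tags.<br />
	Deep copy still has #arrayLen( deepCopy.tags )# tags.<br />
</cfoutput>
```

The shared reference should report three tags while the deep copy keeps its original two, which is the kind of isolation I am after when message data enters the mailbox.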

To model the Process, I created a Process.cfc ColdFusion component. This ColdFusion component can be used on its own or it can be extended. If it is used on its own, a receive() method must be passed into the Process.init() constructor as a behavior. If it is extended, on the other hand, the base Process.receive() method must be overridden by the extending class. The receive() method is the "handler" that gets called for each message that needs to be processed:

function receive( message ) :: any

In addition to some meta data, the message argument contains three crucial pieces of information:

from - This is an optional reference to the calling context. I have included this as a thought, but have not made use of it in my demo. This property would allow the receive() method to send messages back to the caller (From) as part of the processing.

NOTE: Since ColdFusion does not allow threads to spawn other threads, the concept of bidirectional message passing is really not possible to mimic. I have included this property mostly because this was done in the Erlang homework that I covered the other day.

type - The type of message being sent. Since ColdFusion doesn't perform complex pattern matching like Prolog, Scala, and Erlang, we need to send the type of message as a string (an event type if you will).

data - This is the optional data that the calling context passes along with the message. This is copied by value and does not share any references with the calling context.
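As a rough sketch of the "extend and override" route, a sub-component might branch on the message type like this. Calculator.cfc, along with its "add" and "echo" message types, is hypothetical; I am assuming it lives in the same directory as Process.cfc.

```cfml
<!--- Calculator.cfc (hypothetical) - assumed to sit next to Process.cfc. --->
<cfcomponent extends="Process" output="false">

	<cffunction name="receive" access="public" returntype="any" output="false">

		<cfargument name="message" type="struct" required="true" />

		<!--- Branch on the message type, which is just a string. --->
		<cfif (arguments.message.type eq "add")>

			<!--- Assumes data is a struct with numeric keys "a" and "b". --->
			<cfreturn (arguments.message.data.a + arguments.message.data.b) />

		<cfelseif (arguments.message.type eq "echo")>

			<cfreturn arguments.message.data />

		</cfif>

		<!--- Unknown message type. --->
		<cfreturn false />
	</cffunction>

</cfcomponent>
```

Since init() is inherited and no receive argument is passed in, the overridden receive() is the one the internal thread ends up calling. Something like createObject( "component", "Calculator" ).init().sendAndWait( type = "add", data = operands ), where operands is a struct holding a and b, should then return the sum.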

Other than the receive() method, the Process.cfc ColdFusion component provides three public methods:

clearMailbox()
send( from, type, data )
sendAndWait( from, type, data ) :: response

The send() method is asynchronous; that is, calling send() will not block the primary thread or wait for a response. sendAndWait(), on the other hand, will block the primary thread and return the response of the receive() method. clearMailbox() will block the primary thread until the internal thread has processed each message.
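Here is a small sketch of how those three methods might be used together. It assumes the Process.cfc shown later in this post; the echoMessage() handler and the "log" and "ping" message types are made up for the example.

```cfml
<!--- Hypothetical handler: returns a string so sendAndWait() has a response. --->
<cffunction name="echoMessage" returntype="any" output="false">
	<cfargument name="message" type="struct" required="true" />

	<cfreturn "Handled: #arguments.message.type#" />
</cffunction>

<!--- Create the process, passing the handler in as the receive() behavior. --->
<cfset process = createObject( "component", "Process" ).init(
	receive = echoMessage
	) />

<!--- Asynchronous: the page carries on without waiting for a response. --->
<cfset process.send( type = "log", data = "first message" ) />

<!--- Synchronous: the page blocks until receive() has handled the message. --->
<cfset response = process.sendAndWait( type = "ping" ) />

<!--- Block until the internal thread has finished with the mailbox. --->
<cfset process.clearMailbox() />

<cfoutput>#response#</cfoutput>
```

Because the mailbox is a first-in, first-out queue handled by a single internal thread, the "log" message should be dealt with before the "ping" message comes back with its response.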

Ok, now that we see the basic interface for a Process, let's take a look at how it might be used. In the following demo, I am using an instance of Process.cfc to handle the parallel downloading and thumbnailing of images. In order to keep the demo as simple as possible, my receive() method stores the thumbnailed images in the Request scope.

NOTE: Remember, since one CFThread cannot spawn another CFThread, there are serious limitations with being able to demonstrate bidirectional message passing between concurrent processes.

<!--- Define an array of images to download. --->
<cfset imageUrls = [
	"http://farm1.static.flickr.com/4/8969523_6796b498b4.jpg",
	"http://farm4.static.flickr.com/3099/2403962317_b4aed469f8.jpg",
	"http://farm3.static.flickr.com/2069/2278240587_fe4a0655d9_z.jpg",
	"http://farm4.static.flickr.com/3214/2358502569_fd0f079edc.jpg",
	"http://farm3.static.flickr.com/2325/2279029454_7e68e02253_z.jpg",
	"http://farm3.static.flickr.com/2060/2278238407_9781da72cc_z.jpg",
	"http://farm2.static.flickr.com/1015/1135595520_1aced85265.jpg"
	] />

<!---
	Define our queue of images to store the binary data. I know that
	this is a shared resource and we should be sending messages, but
	for the ease of demo, I'm just gonna go this route.
--->
<cfset request.images = [] />


<!--- ----------------------------------------------------- --->
<!--- ----------------------------------------------------- --->


<!---
	Define the function that will act as our process responder. We
	can pass this in as a behavior, OR we can extend Process.cfc and
	override the receive() method.
--->
<cffunction name="handleMessage">

	<!--- Define the arguments. --->
	<cfargument
		name="message"
		type="struct"
		required="true"
		/>

	<!--- Define the local scope. --->
	<cfset var local = {} />

	<!---
		Check to see what kind of message this is. Since ColdFusion
		can't do "pattern matching" at the deconstruction level,
		we just use string values.
	--->
	<cfif (arguments.message.type eq "download")>

		<!--- Download the image. --->
		<cfhttp
			result="local.get"
			method="get"
			url="#arguments.message.data#"
			getasbinary="yes"
			/>

		<!--- Create a thumbnail of the image. --->
		<cfimage
			name="local.thumbnail"
			action="resize"
			source="#local.get.fileContent#"
			width=""
			height="100"
			/>

		<!--- Add the image to the global queue. --->
		<cfset arrayAppend(
			request.images,
			local.thumbnail
			) />

	</cfif>

	<cfreturn />
</cffunction>


<!---
	Create a new process and pass in our handler as a behavior
	(we could also have extended Process instead of passing a
	receive() behavior in).
--->
<cfset process = createObject( "component", "Process" ).init(
	receive = handleMessage
	) />


<!--- ----------------------------------------------------- --->
<!--- ----------------------------------------------------- --->


<!--- Loop over all the images and send the download message. --->
<cfloop
	index="imageUrl"
	array="#imageUrls#">

	<cfset process.send(
		type = "download",
		data = imageUrl
		) />

</cfloop>

<!--- Now, keep looping until the images have been downloaded. --->
<cfloop condition="true">

	<!--- Check to see if the images have been downloaded. --->
	<cfif (arrayLen( request.images ) eq arrayLen( imageUrls ))>

		<cfbreak />

	</cfif>

	<!--- Output a dot to denote waiting. --->
	.

	<!---
		Sleep the top-level thread - the Process should be running
		concurrently and will NOT be affected by this.
	--->
	<cfthread
		action="sleep"
		duration="25"
		/>

	<!--- Flush our "waiting" indicator. --->
	<cfflush />

</cfloop>

<br />
<br />

<!--- Output our images. --->
<cfloop
	index="thumbnail"
	array="#request.images#">

	<!--- Write a PNG IMG tag. --->
	<cfimage
		action="writeToBrowser"
		source="#thumbnail#"
		/>

</cfloop>

As you can see in this code, I create the user defined function (UDF), handleMessage(). This UDF will act as our receive() behavior, which I am passing into the Process.cfc constructor during instantiation. Once I have my Process.cfc instance, I then send it a number of "download" messages, each with an accompanying image URL.

As the Process is churning away, addressing messages, I am outputting a "." character to the screen in order to demonstrate that this Process is executing in parallel to the primary thread. And, when we run the above code, we get the following output (Video):

As you can see, the downloading and the thumbnailing of the remote images took place in parallel with the parent page.

Now that we see how the Process can be used, let's take a look at the Process.cfc ColdFusion component. What you'll notice in the code below is that the Process is basically powered by a composed CFThread tag. The CFThread tag keeps looping, addressing one message at a time. If a new message is sent to the Process while the CFThread tag is running, it simply gets pushed onto the mailbox. If, however, a new message is sent to the Process and no internal thread is running, the Process spawns a new CFThread tag and continues to monitor the mailbox queue.
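The "is a thread still running?" check relies on the metadata that ColdFusion exposes for each named thread. As a minimal, standalone sketch (the thread name "worker" and the result key are made up for this example), the status can be inspected from the page like this:

```cfml
<!--- Spawn a short-lived thread; "worker" is a hypothetical name. --->
<cfthread name="worker" action="run">

	<!--- Values stored in the Thread scope are readable from the page. --->
	<cfset thread.result = "done" />

</cfthread>

<!--- Block until the thread has finished. --->
<cfthread name="worker" action="join" />

<cfoutput>
	Status: #cfthread[ "worker" ].status#<br />
	Result: #cfthread[ "worker" ].result#<br />
</cfoutput>
```

After the join, the status should read COMPLETED. This is the same status property that the Process component looks at when deciding whether or not a new internal thread needs to be spawned.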

Process.cfc

<cfcomponent
	output="false"
	hint="I am a base process. I can either be extended or a receive function can be passed in during initialization.">


	<cffunction
		name="init"
		access="public"
		returntype="any"
		output="false"
		hint="I return an initialized component.">

		<!--- Define arguments. --->
		<cfargument
			name="receive"
			type="any"
			required="false"
			hint="I am the receive method. You can pass this in as a behavior if you don't want to extend the Process component."
			/>

		<!--- Define the local scope. --->
		<cfset var local = {} />

		<!--- Set up the unique ID of this process. --->
		<cfset variables.id = "process_#hash( createUUID() )#" />

		<!---
			Set up an index for the thread that powers the process;
			since each thread in a given page needs to be uniquely
			named, we will need to increment the ID every time a new
			thread is launched.
		--->
		<cfset variables.index = 0 />

		<!--- Create a name for the internal thread. --->
		<cfset variables.name = "#variables.id#_#variables.index#" />

		<!--- Set up the mailbox. This holds the message queue. --->
		<cfset variables.mailbox = [] />

		<!---
			Check to see if a receive method was passed-in. If so,
			then override the existing version.
		--->
		<cfif structKeyExists( arguments, "receive" )>

			<!---
				Override the internal version - this is what the
				thread will use when processing the messages.
			--->
			<cfset this.receive = arguments.receive />

		</cfif>

		<!--- Return this object reference. --->
		<cfreturn this />
	</cffunction>


	<cffunction
		name="addMessage"
		access="public"
		returntype="struct"
		output="false"
		hint="I add the given message to the internal queue and return the message item.">

		<!--- Define arguments. --->
		<cfargument
			name="from"
			type="any"
			required="false"
			default=""
			hint="I am the object sending the message."
			/>

		<cfargument
			name="type"
			type="string"
			required="true"
			hint="I am the type of message being sent."
			/>

		<cfargument
			name="data"
			type="any"
			required="false"
			default=""
			hint="I am the data being sent with the message."
			/>

		<!--- Define the local scope. --->
		<cfset var local = {} />

		<!---
			Create a new message item. Notice that when we add the
			message, we are duplicating the Data key. This performs
			a deep copy and helps to ensure that this process
			doesn't share any data references with a parallel process
			or thread.
		--->
		<cfset local.message = {
			id = createUUID(),
			from = arguments.from,
			type = arguments.type,
			data = duplicate( arguments.data ),
			complete = false,
			error = "",
			response = ""
			} />

		<!---
			Add the message to the queue. Since the mailbox is a
			shared resource, lock exclusive access to it.
		--->
		<cflock
			name="#variables.id#-mailbox"
			type="exclusive"
			timeout="50">

			<!--- Add the message to the queue. --->
			<cfset arrayAppend( variables.mailbox, local.message ) />

		</cflock>

		<!--- Make sure the mailbox is being monitored. --->
		<cfset this.monitorMailbox() />

		<!--- Return the message. --->
		<cfreturn local.message />
	</cffunction>


	<cffunction
		name="clearMailbox"
		access="public"
		returntype="void"
		output="false"
		hint="I wait for the internal thread to finish processing the mailbox.">

		<!---
			Because we are checking on the thread which is a shared
			resource, we have to lock exclusive access around its
			status.
		--->
		<cflock
			name="#variables.id#-thread"
			type="exclusive"
			timeout="100">

			<!--- Check a thread object. --->
			<cfif structKeyExists( cfthread, variables.name )>

				<!--- Join the thread. --->
				<cfthread
					name="#variables.name#"
					action="join"
					/>

			</cfif>

		</cflock>

		<!--- Return out. --->
		<cfreturn />
	</cffunction>


	<cffunction
		name="monitorMailbox"
		access="public"
		returntype="void"
		output="false"
		hint="I monitor the mailbox asynchronously, receiving one message at a time.">

		<!--- Define the local scope. --->
		<cfset var local = {} />

		<!---
			Check to see if a thread is running and monitoring
			the mailbox. Since the thread is a shared resource,
			lock access to the spawning of it.
		--->
		<cflock
			name="#variables.id#-thread"
			type="exclusive"
			timeout="100">

			<!---
				Check for a running thread. Since there is a race
				condition between when the internal loop ends and
				when the thread actually ends, we have to get a
				little tricky here to make sure we don't miss a
				new message.
			--->
			<cfif !structKeyExists( cfthread, variables.name )>

				<!--- There is no thread at all, so start one. --->
				<cfset this.spawnThread() />

			<cfelseif listFindNoCase( "completed,terminated", cfthread[ variables.name ].status )>

				<!---
					There is an existing thread, but it appears to
					have completed running (or has errored out). In
					such a case, we want to spawn a new one.
				--->
				<cfset this.spawnThread() />

			<cfelse>

				<!---
					There is an existing thread AND it appears to be
					running. Therefore, we need to check to see if it
					is addressing a message. However, in order to
					make sure that we don't hit a race condition on
					this check, let's lock the mailbox access.

					NOTE: Since the thread is also locking on this
					when popping messages from the mailbox, we should
					not have to worry about dirty reads.
				--->
				<cflock
					name="#variables.id#-mailbox"
					type="exclusive"
					timeout="50">

					<!---
						Check to see if the internal thread has a
						message that it is going to address. If it
						doesn't AND it has a previousMessage, then
						that means that we are in that weird timing
						where the thread IS running, but it is
						winding down and will not have the chance to
						address any more messages before it is
						terminated.

						NOTE: If there is no previousMessage, it
						means that the thread has been started, but
						it has not had time to grab its first
						message out of the mailbox.
					--->
					<cfif (
						structKeyExists( cfthread[ variables.name ], "previousMessage" ) &&
						!structKeyExists( cfthread[ variables.name ], "message" )
						)>

						<!--- Spawn a new thread. --->
						<cfset this.spawnThread() />

					</cfif>

				</cflock>

			</cfif>

		</cflock>

		<!--- Return out. --->
		<cfreturn />
	</cffunction>


	<cffunction
		name="receive"
		access="public"
		returntype="any"
		output="false"
		hint="I respond to messages from the mailbox - the queue of messages sent to this process.">

		<!--- Define arguments. --->
		<cfargument
			name="message"
			type="struct"
			required="true"
			hint="I am a message that needs to be responded to."
			/>

		<!---
			NOTE: This is the function that needs to be overridden if
			this Process component is extended. The incoming message
			will have, among other values:

			- From (sender if provided)
			- Type (type of message being sent)
			- Data (data being sent along with event)
		--->

		<!--- Return out. --->
		<cfreturn false />
	</cffunction>


	<cffunction
		name="send"
		access="public"
		returntype="any"
		output="false"
		hint="I send a message to the internal process. This is an ASYNCHRONOUS send. If you want to use a blocking send, use sendAndWait.">

		<!--- Define arguments. --->
		<cfargument
			name="from"
			type="any"
			required="false"
			default=""
			hint="I am the object sending the message (this can be used when responding to messages)."
			/>

		<cfargument
			name="type"
			type="string"
			required="true"
			hint="I am the type of message being sent. This will be used in the receive branching logic."
			/>

		<cfargument
			name="data"
			type="any"
			required="false"
			default=""
			hint="I am the data being sent with the message."
			/>

		<!---
			Since we are not waiting around for a message, we can
			simply add the message and carry on.
		--->
		<cfset this.addMessage( argumentCollection = arguments ) />

		<!--- Return this object reference for method chaining. --->
		<cfreturn this />
	</cffunction>


	<cffunction
		name="sendAndWait"
		access="public"
		returntype="any"
		output="false"
		hint="I send a message to the internal process and return the result. I block the primary thread until the process returns with the relevant result.">

		<!--- Define arguments. --->
		<cfargument
			name="from"
			type="any"
			required="false"
			default=""
			hint="I am the object sending the message (this can be used when responding to messages)."
			/>

		<cfargument
			name="type"
			type="string"
			required="true"
			hint="I am the type of message being sent. This will be used in the receive branching logic."
			/>

		<cfargument
			name="data"
			type="any"
			required="false"
			default=""
			hint="I am the data being sent with the message."
			/>

		<!--- Define the local scope. --->
		<cfset var local = {} />

		<!---
			Add the message to the queue and hold onto the message
			item that gets created. This will allow us to wait for
			the appropriate message.
		--->
		<cfset local.message = this.addMessage(
			argumentCollection = arguments
			) />

		<!--- Keep looping until the message is complete. --->
		<cfloop condition="( !local.message.complete )">

			<!---
				Sleep the parent thread momentarily just to give
				the processor a break.
			--->
			<cfthread
				action="sleep"
				duration="10"
				/>

		</cfloop>

		<!--- Return the message response. --->
		<cfreturn local.message.response />
	</cffunction>


	<cffunction
		name="spawnThread"
		access="public"
		returntype="void"
		output="false"
		hint="I spawn a new internal thread for processing the mailbox. NOTE: This will be called from WITHIN an exclusive lock.">

		<!---
			There are items in the mailbox that need to be
			addressed. Spawn a thread to start monitoring them.
			To do this, we need to increment the thread index.

			NOTE: Since this is called within a CFLock, we don't
			have to worry about race conditions around these
			shared items.
		--->
		<cfset variables.index++ />

		<!--- Create a new name. --->
		<cfset variables.name = "#variables.id#_#variables.index#" />

		<!--- Spawn the thread. --->
		<cfthread
			name="#variables.name#"
			action="run">

			<!---
				Now that we are in the process' internal thread,
				we are going to keep looping until we have
				addressed all of the messages in the mailbox.
			--->
			<cfloop condition="true">

				<!---
					Get the next item from the mailbox. However,
					since the mailbox is a shared resource, lock
					exclusive access to it.
				--->
				<cflock
					name="#variables.id#-mailbox"
					type="exclusive"
					timeout="50">

					<!---
						If there is an existing message (if this
						thread is in a secondary iteration), add
						it to the previous message property. This
						is taken into account when determining if
						a new thread needs to be spawned.
					--->
					<cfif structKeyExists( thread, "message" )>

						<!--- Store previous message. --->
						<cfset thread.previousMessage = thread.message />

					</cfif>

					<!---
						Check to make sure there is a message to
						address.
					--->
					<cfif arrayLen( variables.mailbox )>

						<!--- Get the first message. --->
						<cfset thread.message = variables.mailbox[ 1 ] />

						<!--- Pop it off the queue. --->
						<cfset arrayDeleteAt( variables.mailbox, 1 ) />

					<cfelse>

						<!---
							Since there is no message, delete any
							memory of it from the previous loop
							iteration.
						--->
						<cfset structDelete( thread, "message" ) />

					</cfif>

				</cflock>

				<!---
					Check to see if a message was popped out of the
					mailbox (if there were no messages, we'll have
					nothing left to do).
				--->
				<cfif structKeyExists( thread, "message" )>

					<!---
						Pass the message off to the receive method.
						Remember, this might be overridden; it might
						also have been passed-in as a behavior.
					--->
					<cftry>

						<!---
							Invoke receive and store the response.
							If receive() returns Void, then the
							sendAndWait() function will fail.
						--->
						<cfset thread.message.response = this.receive( thread.message ) />

						<!--- Catch any error. --->
						<cfcatch>

							<!--- Woops! Store the error. --->
							<cfset thread.message.error = cfcatch />

						</cfcatch>

					</cftry>

					<!---
						Now that the message has been dealt with,
						flag it as complete.
					--->
					<cfset thread.message.complete = true />

				<cfelse>

					<!---
						No message to address so stop looping. If
						another message is added, the thread can
						always be re-spawned.
					--->
					<cfbreak />

				</cfif>

			</cfloop>

		</cfthread>

		<!--- Return out. --->
		<cfreturn />
	</cffunction>

</cfcomponent>

Because our internal CFThread tag is making references to a shared resource - the mailbox - a good amount of CFLock'ing needs to be done in order to make sure that no funny race conditions exist. The hardest race condition to deal with was the outlier case in which a new message gets added to the mailbox just as the internal CFThread tag is ending its execution. I'm able to get around that condition through various locks and property checking; but, getting this all to work was not so trivial. And, to be honest, I am not sure if I actually did cover all the various race conditions.

As a ColdFusion developer, you might look at this and ask yourself:

Why go through all the trouble? Why not just create a CFThread tag in the parent page, do all the downloading and thumbnailing, and then Join the thread back to the parent page?

I would say that if I were only trying to get the job done, you would be absolutely correct. Using a ColdFusion CFThread tag directly provides all of the same benefits of asynchronous, parallel processing without any of the complexity and overhead that comes with message queuing. But, I wasn't trying to "just get the job done"; rather, I was trying to think in terms of Processes and actors and concurrent message passing.
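For comparison, a rough sketch of that direct CFThread approach might look something like the following. It reuses two of the image URLs from the demo above; the thread name "thumbnailer" and the "urls" attribute are my own choices for the example, and I have left out error handling entirely.

```cfml
<!--- Two of the image URLs from the demo above. --->
<cfset imageUrls = [
	"http://farm1.static.flickr.com/4/8969523_6796b498b4.jpg",
	"http://farm4.static.flickr.com/3099/2403962317_b4aed469f8.jpg"
	] />

<!--- Shared queue for the thumbnails (same shortcut as in the demo). --->
<cfset request.images = [] />

<!---
	Do all of the work in a single thread. The custom "urls"
	attribute is available inside the thread via the Attributes scope.
--->
<cfthread name="thumbnailer" action="run" urls="#imageUrls#">

	<cfloop index="imageUrl" array="#attributes.urls#">

		<!--- Download the image. --->
		<cfhttp result="get" method="get" url="#imageUrl#" getasbinary="yes" />

		<!--- Create a thumbnail of the image. --->
		<cfimage
			name="thumbnail"
			action="resize"
			source="#get.fileContent#"
			width=""
			height="100"
			/>

		<cfset arrayAppend( request.images, thumbnail ) />

	</cfloop>

</cfthread>

<!--- Block the page until the thread has finished. --->
<cfthread name="thumbnailer" action="join" />

<!--- Output the thumbnails. --->
<cfloop index="thumbnail" array="#request.images#">
	<cfimage action="writeToBrowser" source="#thumbnail#" />
</cfloop>
```

There is no mailbox, no locking, and no re-spawning logic here, which is exactly why it is the more practical choice when the goal is only to get the work done.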

I've been struggling to wrap my head around the Process-based concurrency used in languages like Scala and Erlang. As such, I thought it might be helpful to try and mimic process-like behavior in a language in which I am very comfortable (ColdFusion). I am not putting this out there as an approach that should be taken in ColdFusion; in fact, ColdFusion's CFThread tag does not make true bidirectional message passing a possibility. Rather, I am simply exploring process-mimicry in an attempt to gain a level of comfort that can then be applied in other languages that do make use of Processes.

Short link: https://bennadel.com/go/2084

Reader Comments

WebManWalking Dec 23, 2010 at 12:31 PM

@Ben,

My big eye-opener about modern multicore computers and concurrency was Java Concurrency in Practice:

http://www.amazon.com/Java-Concurrency-Practice-Brian-Goetz/dp/0321349601/ref=sr_1_1?ie=UTF8&qid=1293123640&sr=8-1

It's written by the folks who wrote the java.util.concurrent package, so they know what they're talking about. It's essentially a tutorial for Java developers who want to use the package. As such, it's not really directly useful to us ColdFusion developers, but one hopes that Adobe's developers of ColdFusion are reading the heck out of it.

Java's commonly regarded as a thread-safe language, but the book really goes into detail about how easy it is to mess that up. You mentioned (quoted) about how threads share the same variables. Well, not really, not if they end up on separate cores/CPUs. The only way to guarantee that you're getting the current value of a variable is to lock it. Part of the locking mechanism is to flush the value out to memory if it's cached, say, in a register. Until then, your changes to the variable aren't visible in other cores.

Visibility is the hidden mind-blower of concurrency.

Tim Leach Dec 23, 2010 at 1:31 PM

This reminds me of a utility I had created once. It was not nearly as complex, but it was trying to accomplish the same thing: I would render a main page and use ajax requests to spawn multiple processes. Each process would update its status in a session var, and I had a ping ajax request that would tell me and the user how each one was doing and when they were all done.

It was my way of getting around giving the user feedback, when FB3 doesn't allow one to use cfflush.

Looking at your code though, wouldn't it be easier to set up the image resizer as a local webservice? Then you could post actions to it and monitor its progress via a shared scope, but not have to do all the funky locking?

Ben Nadel Dec 23, 2010 at 2:13 PM

@WebManWalking,

I believe Marc Esher was also telling me about that book (I think - sorry if I am forgetting). I think he even said he and Barney Boisvert were trying to use the Java concurrency library in ColdFusion but were having some problems with the JavaLoader or what not.

I'll see how I feel after Seven Languages in Seven Weeks. Perhaps I'll add it to my queue.

@Tim,

I think there are a number of ways to go about making the image processing easier. But, the image processing wasn't really something I was trying to do "well." Mostly, I just wanted something that took a relatively long time to process so that I could actually *see* threads processing in parallel (rather than both appearing to finish instantly).

Adam Dec 24, 2010 at 11:27 AM

Thanks Ben, I think I'll have to try some experiments of my own to get it into my head though.

Ben Nadel Dec 24, 2010 at 12:17 PM

@Adam,

CFThread is definitely an amazing feature of ColdFusion. It lowers the bar to parallel processing in a way that I don't think a lot of languages have accomplished. As such, I figured it would be a good starting place to learn about some other approaches to concurrency.

Brad Dec 25, 2010 at 2:04 PM

Lowering the bar is an understatement!

<cfthread>
//Do stuff

Well, I came from Java... It's what I learned first. All that initial hype won me over ;).

Writing threads from scratch in Java? Little bit complicated and it implies you have a knowledge of design patterns. The way you create threads requires a factory and a main thread to maintain state and feed the threads if you needed this.

This is also the method of proper GUI programming. Whenever your main GUI thread does something, you'd freeze up. My experiments into Winamp show it uses over 200 threads at any time.

My first attempt at a 'server' TCP/IP program was mind boggling. You had to spin off a thread for EVERY connection! If you want a great way to learn, aka a headache, look at their server-client tutorials..

Anyway. I used CFThread just today to read 1000s of files, regex and insert into the DB. 1000 inserts a second vs 3-4 running on a single thread.

(redid some datafeed history for the last few years)

Brad Dec 25, 2010 at 2:30 PM

Hey, I was also just thinking about your original intent for CFThread using CFHTTP.

You said you'd refresh the page in order to avoid running out of resources? Here's a tidbit you might want to turn into a blog if you haven't yet..

This calls garbage cleanup. I run this on every page that I have to do:

Basically, If I extend the default time out? I manually do garbage clean up. People say this causes a significant delay in code execution, but I haven't run any benchmarks and I haven't felt any drop in performance that would cause me to rethink my decision on using it.

My JRun usage never passes 300 megs. Prior to this, calling CFHTTP about 16k times would leave me at 1000 megs usage. Add this in after every 1000 calls? Back down to 300 megs.

It must be that when they open the java, the connection isn't called with .close() until GC runs. Same with File IO ? Because The script I was talking about before this seems to increase at an even greater rate when working with flat datafeed files.

And, not to miss out on bashing php: Where's PHP's threading? ;) hehe.

Randall Dec 28, 2010 at 10:43 AM

@ Everyone - loving this thread ....about threads. Only wish I had CF9.

@Ben - not sure if it's only my computer, but the video wasn't showing.

Ben Nadel Dec 28, 2010 at 10:56 AM

@Brad,

Actually, it's interesting to see how many threads run in an application. Now that you mention TCP/IP, I popped open my Mac's "Activity Monitor" and JRUN has like 79 threads running. And I'm not even using it at the moment :) Yeah, I have to believe that creating these beasts is quite a handful.

So, as you say, "lowering the bar" with CFThread is perhaps a bit of an understatement. It really does make things so very easy and very fast for many types of activities (as you outlined).

Going back to your garbage collection point, I one time got a tip from Gert Franz (Railo); he said that if you give the parent page a small sleep (ie. CFThread/sleep=10), it's a minimal pause, but it's enough to allow the native garbage collection to run automatically.

After he told me that, I did my hardest to try to create a situation where I could actually run out of system memory within ColdFusion. Unfortunately (or fortunately, depending on how you look at it), I was not able to do it! I couldn't get to a point where I could even test the sleep-approach to GC.

I suppose calling it manually would do the same thing. But again, I'm having trouble getting to a point where I can even test that.

@Randall,

The site may have been down? I host the JING videos on my server. It seems to be working for me. As far as CF8, I think this post was coded in CF8 (I have both CF8 and CF9 locally, but I only keep one running at a time).

Brad Dec 29, 2010 at 10:10 AM

@Ben

You have a problem running out of resources??? Perhaps I can attribute that to CF9 then? ;) Or do you just have an inexhaustible supply of RAM?

A great deal of my work still resides in CF8 and it's fairly easy to run out of resources. This is of course the 32 bit version of CF8, and I get about 1400 megs per server.

I have also noticed that a simple <cffile /> (note: the closing />) will make a difference from <cffile > ... (all of this being in CF8). So if this is automatically how you program, you might not experience it.

And yeah, it's hard to resist threading almost anything you possibly can. It's like when you first experienced Ajax... everything had to be in ajax! haha.

Ben Nadel Dec 29, 2010 at 10:22 AM

@Brad,

Even when I run my code through CF8, I am still not able to max out my RAM :) And I only have like 760MB for JRUN. I think the difference might be that I am now running on Mac OSX locally. Before, when I was having this problem, I was on Windows XP. I think my Mac is 64 bit and my Windows was 32 bit (I think... I know very little about computers themselves).

Perhaps there's something about my Mac or the CF that runs on it that is more efficient??

Brad Dec 29, 2010 at 2:03 PM

Well, OSX and Windows are different animals. Some code needs to be rewritten to be compatible across both platforms. Not developer high level CF code mind you, but the compiled java version of ColdFusion needs to be rewritten.

Many things are different across the two operating systems such as: file system access... threading... socket access... to name a few. Perhaps when they wrote the library for Windows, they wrote it less efficiently than they did in Mac OS? Perhaps this problem may even be down to the inefficient versions of the JVM itself? Each JVM needs to be maintained on EACH operating system... and, to be honest, perhaps Apple's maintained version is just better? I don't know, that would be the territory of a systems analyst perhaps.

Ben Nadel Jan 6, 2011 at 10:16 AM

@Brad,

Very interesting stuff. I don't think it really ever occurred to me that so much would differ at the JVM level. I just figured that stuff was all consistent.

Perhaps I'll keep trying to eat my RAM. I would love love love to see if the CFThread/sleep hack actually works for garbage collection and it's driving me crazy that I cannot seem to put myself in a situation where it can be tested! I *want* it to break :D
