Ben Nadel
On User Experience (UX) Design, JavaScript, ColdFusion, Node.js, Life, and Love.
I am the chief technical officer at InVision App, Inc - a prototyping and collaboration platform for designers, built by designers. I also rock out in JavaScript and ColdFusion 24x7.
POST Streaming Upload Data From ColdFusion Using Java And Node.js

By Ben Nadel on

I can't find the email, but a while back, someone asked me about POSTing very large files with ColdFusion's CFHTTP and CFHTTPParam tags. This individual was running out of memory because ColdFusion apparently needed to load the entire file into the local RAM before posting it up to the target server. To get around this issue, I started poking into the Java layer (beneath the ColdFusion surface) and found the java.net.HttpURLConnection class. This Java class allows a URL connection to be held open with HTTP-like behavior, including chunked data streaming; and, this chunked data streaming allows us to post data a byte at a time, without having to know about the size of the local file.

[ Video demo of the streaming upload in action. ]

I don't want to go into too much explanation since I only just discovered this Java class and started to play with it. But, from what I can gather, the connection to the target URL has both an output stream and an input stream. The output stream represents the Upload and the input stream represents the Download (ie. the response).

As long as the connection has "chunking" turned on, we can write to the output (upload) stream and have the stream flush data without having to buffer the data entirely within the local memory. In this way, we can read the local file in and write it, a byte at a time, to the output (upload) stream.

NOTE: For my demo, I am reading in a byte at a time, which is probably horribly inefficient. In reality, you'd probably want a Buffered input stream; but, to keep it simple, I'm using byte-wise streaming as it allows granular control.

In order to see this chunked upload in progress, I am actually going to be posting to a local Node.js server. This Node.js server will then pipe the incoming POST data out to a different GET response. Furthermore, we'll have the ColdFusion POST pause after every 100 bytes. In this way, we can get a solid visual confirmation that the POST is, in fact, being sent in chunks, without having to be completely buffered in the server's memory.

Before we look at the ColdFusion code, which is extremely verbose, let's look at the Node.js code so we can see how the requests will be handled. In the following server configuration, we need to make a standard GET request to the Node.js server before we make our POST. The Node.js server will hold the GET response open until the POST request is registered. At that point, the data chunks from the POST will be written to the GET response.

server.js (Node.js Server Configuration)

// Include the necessary modules.
var sys = require( "sys" );
var http = require( "http" );


// ---------------------------------------------------------- //
// ---------------------------------------------------------- //


// For this demo, we are going to pipe the form upload POST into the
// response of a browser-based GET request.
//
// NOTE: You have to make the GET request *before* the POST.
var getResponse = null;


// ---------------------------------------------------------- //
// ---------------------------------------------------------- //


// Create an instance of the HTTP server.
var server = http.createServer(
	function( request, response ){

		// Check to see if the incoming request is a GET. If so,
		// we're going to hold it open and pump the POST data through
		// (once the post is made).
		if (request.method === "GET"){

			// Store the output stream for later.
			getResponse = response;

			// Set the 200-OK header.
			response.writeHead(
				200,
				{ "content-type": "text/plain" }
			);

			// Write some data.
			getResponse.write( "Waiting for POST...\n\n" );

			// Log the hold-open.
			console.log( "Holding GET request open for POST." );

			// NOTE: We are not explicitly ending the response. This
			// will hold it open until it times-out.


		// Check to see if the request is a POST.
		} else if (request.method === "POST"){

			// Make sure that we have a pending GET response.
			if (getResponse === null){

				// Log the issue.
				console.log( "POST being denied." );

				// We have no response to pipe the data to. Return an
				// error response to the post.
				response.writeHead(
					500,
					{ "content-type": "text/plain" }
				);

				// End the response.
				return(
					response.end( "No pending GET response!!" )
				);

			}

			// If we made it this far, then we have a GET request we
			// are holding open and can pipe the POST data through
			// without problem. Set the 200-OK header.
			response.writeHead(
				200,
				{ "content-type": "text/plain" }
			);

			// Listen for data chunks to come through on the post.
			// This will be the data that gets periodically flushed
			// during our streaming POST.
			request.on(
				"data",
				function( buffer ){

					// Log the length of the buffer.
					console.log( "Chunk:", buffer.length );

					// Pipe the incoming data chunk into the response
					// of our GET output stream.
					getResponse.write( buffer.toString() );

				}
			);

			// Listen for the completion of the POST. Once the POST
			// is done, we will close both the POST and the GET
			// response streams.
			request.on(
				"end",
				function(){

					// Close the POST stream.
					response.end(
						"\n\nEnded Node.js response. " +
						(new Date()).toString()
					);

					// Close the GET stream.
					getResponse.end(
						"\n\nEnded Node.js response. " +
						(new Date()).toString()
					);

					// Clear the GET response reference.
					getResponse = null;

				}
			);

		}

	}
);

// Point the server to listen to the given port for incoming
// requests.
server.listen( 8080 );


// ---------------------------------------------------------- //
// ---------------------------------------------------------- //


// Write debugging information to the console to indicate that
// the server has been configured and is up and running.
sys.puts( "Server is running on 8080" );

As you can see, the incoming GET response is cached in the getResponse variable. Then, when the POST request comes in, a "data"-event handler writes the buffered chunks to the cached GET response stream. Finally, once the POST request is done, both the POST and GET responses are closed.

As an aside... I'm sorry, but Node.js is pretty badass!

Ok, now that we see the Node.js logic, let's take a look at the ColdFusion code that opens the connection to the target Node.js server and then begins to pipe the local file data, in chunks, to the output stream. I won't cover the code too much since 1) I don't know it in much depth and 2) the code is quite heavily commented.

<!---
	Create an instance of our target URL - the one to which we
	are going to post binary form data. In our case, this will be
	a local NODE.JS server because it will allow us to examine the
	post in the Node.js console as it comes through.
--->
<cfset targetUrl = createObject( "java", "java.net.URL" ).init(
	javaCast( "string", "http://localhost:8080" )
	) />

<!---
	Now that we have our URL, let's open a connection to it. This
	will give us access to the input (download) and output (upload)
	streams for the target end point.

	NOTE: This gives us an instance of java.net.URLConnection (or
	one of its sub-classes).
--->
<cfset connection = targetUrl.openConnection() />

<!---
	By default, the connection is only set to gather target content,
	not to POST it. As such, we have to make sure that we turn on
	output (upload) before we access the data streams.
--->
<cfset connection.setDoOutput( javaCast( "boolean", true ) ) />

<!--- Since we are uploading, we have to set the method to POST. --->
<cfset connection.setRequestMethod( javaCast( "string", "POST" ) ) />

<!---
	By default, the connection will locally buffer the data until it
	is ready to be posted in its entirety. We don't want to hold it
	all in memory, however; as such, we need to explicitly turn data
	chunking on. This will allow the connection to flush data to the
	target url without having to load it all in memory (this is
	perfect for when the size of the data is not known ahead of time).

	NOTE: In our case, we're gonna set it small so we can see some
	activity over the stream in realtime.
--->
<cfset connection.setChunkedStreamingMode( javaCast( "int", 50 ) ) />

<!---
	When posting data, the content-type will determine how the
	target server parses the incoming request. If the target server
	is ColdFusion, this is especially critical as it will throw an
	error if it tries to parse this POST as a collection of
	name-value pairs.
--->
<cfset connection.setRequestProperty(
	javaCast( "string", "content-type" ),
	javaCast( "string", "text/plain" )
	) />


<!---
	Now that we have prepared the connection to the target URL, let's
	get the output stream - this is the UPLOAD stream to which we can
	write data to be posted to the target server.
--->
<cfset uploadStream = connection.getOutputStream() />


<!---
	Let's open a connection to a local file that we will stream to
	the output a byte at a time.

	NOTE: There are more efficient, buffered ways to read a file
	into memory; however, this is just trying to keep it simple.
--->
<cfset fileInputStream = createObject( "java", "java.io.FileInputStream" ).init(
	javaCast( "string", expandPath( "./data2.txt" ) )
	) />


<!---
	Before we start posting, we want to keep track of the number
	of bytes that gets sent; this way, we can pause the stream
	occasionally to give us time to watch the activity in the
	NODE.JS console.
--->
<cfset byteCount = 0 />

<!--- Read the first byte from the file. --->
<cfset nextByte = fileInputStream.read() />

<!---
	Keep reading from the file, one byte at a time, until we hit
	(-1) - the End of File marker for the input stream.
--->
<cfloop condition="(nextByte neq -1)">

	<!--- Increment the byte count. --->
	<cfset byteCount++ />

	<!--- Write this byte to the output (UPLOAD) stream. --->
	<cfset uploadStream.write( javaCast( "int", nextByte ) ) />


	<!---
		Check to see if we are at a multiple of 100 bytes. We want
		to pause the upload every 100 bytes in order to view the
		activity.
	--->
	<cfif !(byteCount % 100)>

		<!--- Flush the upload stream. --->
		<cfset uploadStream.flush() />

		<!--- Pause the upload. --->
		<cfset sleep( 2000 ) />

	</cfif>


	<!--- Read the next byte from the file. --->
	<cfset nextByte = fileInputStream.read() />

</cfloop>

<!--- Now that we're done streaming the file, close the stream. --->
<cfset uploadStream.close() />


<!--- ----------------------------------------------------- --->
<!--- ----------------------------------------------------- --->
<!--- ----------------------------------------------------- --->
<!--- ----------------------------------------------------- --->


<!---
	At this point, we have completed the UPLOAD portion of the
	request. We could be done; or we could look at the input
	(download) portion of the request in order to view the response
	or the error.
--->
<cfoutput>

	Response:
	#connection.getResponseCode()# -
	#connection.getResponseMessage()#<br />
	<br />

</cfoutput>

<!---
	The input stream is mutually exclusive with the error stream,
	although both can return data. As such, let's try to access
	the input stream... and then use the error stream if there is
	a problem.
--->
<cftry>

	<!--- Try for the input stream. --->
	<cfset downloadStream = connection.getInputStream() />

	<!---
		If the input stream is not available (ie. the server returned
		an error response), then we'll have to use the error output
		as the response stream.
	--->
	<cfcatch>

		<!--- Use the error stream as the download. --->
		<cfset downloadStream = connection.getErrorStream() />

	</cfcatch>

</cftry>


<!---
	At this point, we have either the natural download or the error
	download. In either case, we can start reading the output in
	the same manner.
--->
<cfset responseBuffer = [] />

<!--- Get the first byte. --->
<cfset nextByte = downloadStream.read() />

<!---
	Keep reading from the response stream until we run out of bytes
	(-1). We'll be building up the response buffer a byte at a time
	and then outputting it as a single value.
--->
<cfloop condition="(nextByte neq -1)">

	<!--- Add the byte AS CHAR to the response buffer. --->
	<cfset arrayAppend( responseBuffer, chr( nextByte ) ) />

	<!--- Get the next byte. --->
	<cfset nextByte = downloadStream.read() />

</cfloop>

<!--- Close the response stream. --->
<cfset downloadStream.close() />

<!--- Output the response. --->
<cfoutput>

	Response: #arrayToList( responseBuffer, "" )#

</cfoutput>

To see this in action, take a look at the video above. You'll be able to see that the local file (data2.txt) is loaded using an input stream and then posted, in small chunks, to the target server. This allows the file to be posted without it ever being fully loaded into the server's local memory.

NOTE: This approach does not use name-value pairs in its form data. That would make the form content much more complicated (way beyond the scope of this exploration). By using a "text/plain" content-type, I only have to worry about posting a single value.

Most of the time, form-POST size is not an issue; however, if you have to post very large files, you can (apparently) find yourself running out of RAM. In order to handle large posts, it seems that you can dip down into the Java layer to stream data over a URL connection in smaller, bite-sized chunks (no pun intended).




Reader Comments

Very timely post for me. While I've not run into a memory issue as yet, I was concerned I might on a project I'm working on. Good stuff.


@Jim,

Cool man. Keep us posted with anything you do. I assume this gets much more complicated if you need to post a file as *part* of a form post with other name-value pairs; as, then, you have to build the delimiters and what not.

I'll try to play around with that concept as well. I did that once to play with the File API in HTML5 JavaScript; I think the concept is exactly the same.


Did you participate in the node.js knockout (hackathon)? There was a team here in Charlotte that stayed up Friday - Sunday.


@Phillip,

I didn't :( I only have one hackathon under my belt so far.

@All,

I tried this kind of approach with a multi-part form data post:

http://www.bennadel.com/blog/2252-Apparently-ColdFusion-Cannot-Handle-Chunked-Multi-Part-Form-Data.htm

Unfortunately, ColdFusion as a target server doesn't seem to handle chunking and multi-part form data. At least, that's what I'm seeing in my small, mostly ignorant testing (I'm very new to this concept).

