
Chunking File Uploads With Plupload And ColdFusion

By Ben Nadel on

I've been a long-time super-fan and consumer of Plupload - the multi-runtime JavaScript file uploader. But lately, I've been having some issues with users uploading very large files. In my own testing, if I select a file larger than 250MB, it never leaves the browser (confirmed by all server-side monitoring). As such, I wanted to take a quick look at Plupload's "chunking" feature, which breaks large files up into smaller chunks that are then POSTed to the server independently.

View the Plupload Chunk Project on GitHub.

When you chunk a file with Plupload, you split the file up into binary blobs of predefined size (ex, "1024kb"). Plupload then uploads each one of these blobs with additional metadata about where it resides within the master file. It is then the responsibility of the server to take all of these binary chunks and rebuild the master file on the server.
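
For reference, each chunk arrives in its own multipart POST that carries "chunk" (the zero-based index of the current piece), "chunks" (the total number of pieces), and "name" fields alongside the binary "file" field. Here's a minimal sketch of peeking at that metadata on the ColdFusion side (the log file name is purely illustrative):

<cfscript>

	// If the "chunks" field is present, this request holds one piece of a larger file.
	if ( structKeyExists( form, "chunks" ) ) {

		writeLog(
			file = "plupload-demo",
			text = "Received chunk #( form.chunk + 1 )# of #form.chunks# for [#form.name#]."
		);

	}

</cfscript>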

When chunking, Plupload provides an additional event hook - "ChunkUploaded" - which fires every time one of the chunks is uploaded successfully to the server. However, the original "FileUploaded" event still fires after all of the chunks in a given file have been uploaded.

In my exploration, I'm keeping the chunks small - "100kb" - so that I can see the chunking behavior in action. Then, on the server side, I simply append each chunk to a transient file as it arrives. From the client-side / Plupload perspective, almost no code needs to change; all I have to do is define a non-zero chunk size (ex, "100kb") and Plupload will automatically start posting files across multiple requests (if necessary).

<!doctype html>
<html>
<head>
	<meta charset="utf-8" />

	<title>
		Chunking File Uploads With Plupload And ColdFusion
	</title>

	<link rel="stylesheet" type="text/css" href="./assets/css/styles.css" />
</head>
<body>

	<h1>
		Chunking File Uploads With Plupload And ColdFusion
	</h1>

	<div id="uploader" class="uploader">

		<a id="selectFiles" href="##">

			<span class="label">
				Select Files
			</span>

			<span class="standby">
				Waiting for files...
			</span>

			<span class="progress">
				Uploading - <span class="percent"></span>%
			</span>

		</a>

	</div>


	<!-- Load and initialize scripts. -->
	<script type="text/javascript" src="./assets/jquery/jquery-2.1.0.min.js"></script>
	<script type="text/javascript" src="./assets/plupload/js/plupload.full.min.js"></script>
	<script type="text/javascript">

		(function( $, plupload ) {

			// Find and cache the DOM elements we'll be using.
			var dom = {
				uploader: $( "#uploader" ),
				percent: $( "#uploader span.percent" )
			};


			// Instantiate the Plupload uploader.
			var uploader = new plupload.Uploader({

				// Try to load the HTML5 engine and then, if that's not supported, the Flash
				// fallback engine.
				// --
				// NOTE: I had read that chunking was not available in the Flash runtime; however,
				// after forcing the Flash runtime, it seems that chunking IS available, but is
				// significantly slower than it is in the HTML5 runtime.
				runtimes: "html5,flash",

				// The upload URL - this is where chunks OR full files will go.
				url: "./upload.cfm",

				// The ID of the drop-zone element.
				drop_element: "uploader",

				// To enable click-to-select-files, you can provide a browse button. We can
				// use the same one as the drop zone.
				browse_button: "selectFiles",

				// For the Flash engine, we have to define the ID of the node into which
				// Plupload will inject the <OBJECT> tag for the flash movie.
				container: "uploader",

				// The URL for the SWF file for the Flash upload runtime for browsers that
				// don't support HTML5.
				flash_swf_url: "./assets/plupload/js/Moxie.swf",

				// Needed for the Flash environment to work.
				urlstream_upload: true,

				// The name of the form-field that will hold the upload data.
				file_data_name: "file",

				// Send any additional params (ie, multipart_params) in multipart message
				// format.
				multipart: true,

				// This defines the maximum size that each file chunk can be.
				// --
				// NOTE: I'm setting it particularly low for the demo. In general, you don't
				// want it to be too small because the chunking has a performance hit. Chunking
				// is meant for fault-tolerance and browser limitations.
				chunk_size: "100kb",

				// If the upload of a chunk fails, this is the number of times the chunk
				// should be re-uploaded before the upload (overall) is considered a failure.
				max_retries: 3

			});


			// Set up the event handlers for the uploader.
			uploader.bind( "Init", handlePluploadInit );
			uploader.bind( "Error", handlePluploadError );
			uploader.bind( "FilesAdded", handlePluploadFilesAdded );
			uploader.bind( "QueueChanged", handlePluploadQueueChanged );
			uploader.bind( "BeforeUpload", handlePluploadBeforeUpload );
			uploader.bind( "UploadProgress", handlePluploadUploadProgress );
			uploader.bind( "ChunkUploaded", handlePluploadChunkUploaded );
			uploader.bind( "FileUploaded", handlePluploadFileUploaded );
			uploader.bind( "StateChanged", handlePluploadStateChanged );

			// Initialize the uploader (it is only after the initialization is complete that
			// we will know which runtime loaded: HTML5 vs. Flash).
			uploader.init();


			// ------------------------------------------ //
			// ------------------------------------------ //


			// I provide access to the uploader and the file right before the upload is about
			// to begin. This allows for just-in-time altering of the settings.
			function handlePluploadBeforeUpload( uploader, file ) {

				console.log( "Upload about to start.", file.name );

			}


			// I handle the successful upload of one of the chunks (of a larger file).
			function handlePluploadChunkUploaded( uploader, file, info ) {

				console.log( "Chunk uploaded.", info.offset, "of", info.total, "bytes." );

			}


			// I handle any errors raised during uploads.
			function handlePluploadError() {

				console.warn( "Error during upload." );

			}


			// I handle the files-added event. This is different than the queue-changed event.
			// At this point, we have an opportunity to reject files from the queue.
			function handlePluploadFilesAdded( uploader, files ) {

				console.log( "Files selected." );
				// Example: files.splice( 0, 1 ).

			}


			// I handle the successful upload of a whole file. Even if a file is chunked,
			// this handler will be called with the same response provided to the last
			// chunk success handler.
			function handlePluploadFileUploaded( uploader, file, response ) {

				console.log( "Entire file uploaded.", response );

			}


			// I handle the init event. At this point, we will know which runtime has loaded,
			// and whether or not drag-drop functionality is supported.
			function handlePluploadInit( uploader, params ) {

				console.log( "Initialization complete." );
				console.info( "Drag-drop supported:", !! uploader.features.dragdrop );

			}


			// I handle the queue-changed event.
			function handlePluploadQueueChanged( uploader ) {

				console.log( "Files added to queue." );

				if ( uploader.files.length && isNotUploading() ) {

					uploader.start();

				}

			}


			// I handle the change in state of the uploader.
			function handlePluploadStateChanged( uploader ) {

				if ( isUploading() ) {

					dom.uploader.addClass( "uploading" );

				} else {

					dom.uploader.removeClass( "uploading" );

				}

			}


			// I handle the upload-progress event. This gives us the progress of the given
			// file, NOT of the entire upload queue.
			function handlePluploadUploadProgress( uploader, file ) {

				console.info( "Upload progress:", file.percent );

				dom.percent.text( file.percent );

			}


			// I determine if the uploader is currently inactive.
			function isNotUploading() {

				var currentState = uploader.state;

				return( currentState === plupload.STOPPED );

			}


			// I determine if the uploader is currently uploading a file.
			function isUploading() {

				var currentState = uploader.state;

				return( currentState === plupload.STARTED );

			}

		})( jQuery, plupload );

	</script>

</body>
</html>

On the server-side, I've tried to keep things as simple as possible. As the chunks arrive (always in serial-order), I'm concatenating them on disk. Once the last chunk has arrived, I simply move the transient file - now completely formed - into the "uploads" directory:

<cfscript>

	// If the file is being uploaded as a single upload, the FORM post will contain the
	// fields "name" and "file", where "file" contains the contents of the upload.
	// --
	// If the file is being uploaded in chunks, then each FORM post will contain the
	// fields "chunks", "chunk", "name", and "file". In this case, the "file" contains
	// the content of the current chunk.


	// We are executing a normal upload (ie, the entire file at once).
	if ( isNull( form.chunks ) ) {

		fileExtension = listLast( form.name, "." );

		fileMove(
			form.file,
			expandPath( "/uploads/#createUUID()#.#fileExtension#" )
		);

	// We are executing a chunked upload.
	} else {

		// Since we are dealing with chunks, instead of a full file, we'll be appending
		// each chunk to the known file. However, for the demo, let's keep the transient
		// file out of the uploads folder until the chunking has been completed.
		upload = fileOpen( expandPath( "/chunks/#form.name#" ), "append" );

		// Append the current chunk to the end of the transient file.
		fileWrite( upload, fileReadBinary( form.file ) );
		fileClose( upload );

		// If this is the last of the chunks, then we can move the transient file to the
		// completed uploads folder (with a unique name).
		if ( form.chunk == ( form.chunks - 1 ) ) {

			fileExtension = listLast( form.name, "." );

			fileMove(
				expandPath( "/chunks/#form.name#" ),
				expandPath( "/uploads/#createUUID()#.#fileExtension#" )
			);

		}

	}

</cfscript>

<!--- Reset the content buffer. --->
<cfcontent
	type="text/plain"
	variable="#charsetDecode( 'success', 'utf-8' )#"
	/>

This approach requires a little bit of extra work on the server, but you can see that it's not terribly complicated. And, in my experiment here, it has solved the problem of large uploads. The uploads that were hanging previously now upload fine when chunking is enabled.

Note on Flash: The Flash runtime does allow chunking; however, it somewhat defeats its own purpose. According to the documentation, Flash cannot access the raw file source or perform streaming uploads. As such, Flash has to load the entire file into memory when it chunks. That said, loading the entire file into memory is no worse than having chunking turned off. But, in my brief experience, there does seem to be a performance penalty for chunking in Flash.




Reader Comments

Hi Ben,
I like the idea of chunking the upload but I have a question.

At the start of the article you wrote:

Plupload then uploads each one of these blobs with additional metadata about where it resides within the master file. It is then the responsibility of the server to take all of these binary chunks and rebuild the master file on the server.

But about your CF code you write:

As the chunks arrive (always in serial-order), I'm concatenating them on disk.

This seems to be a contradiction, as the fact that the chunks include location information implies to me that chunks could arrive out of sequence.

This begs the question: how would you (or can you) access the chunk-ordering metadata so that the process of rebuilding the file could handle chunks arriving out of order?


@Ben,

Less than an hour ago, you tweeted that there's more coming on this topic, so the following is offered in the spirit of helping, not nit-picking:

At line 20 of your index.cfm page, I see the typical doubling of pound signs, but at lines 48 and 49, I don't. I surmise that it was once an HTML file, not yet fully converted to CFML for the blog, or vice versa.

One of my coworkers is writing chunked file upload pages right now. The topic couldn't be any more timely for our site. Thanks.


@Ben,

My second guess is that, to reduce the complexity of index.cfm for the blog audience, you replaced sections of its listing with View Source.

Did I guess it? Did I guess it? Did I guess it? :-)


@Steven,

That's an excellent question. In my experience, Plupload uploads everything in order - meaning that it doesn't start one upload until the previous one has finished. The chunking seems to follow the same approach; which is why I was concatenating them.

When I first approached the problem, I was actually writing them to disk individually like:

* my_image.jpg.0
* my_image.jpg.1
* my_image.jpg.2
* my_image.jpg.3

... and then when all the chunks were accounted for (based on the chunk == (chunks - 1) assumption), I would open a new file in "append" mode and just append a fileReadBinary() of each saved chunk.

That said, I was _still_ making the assumption that the last chunk is the last one uploaded. If they were to come out of order, you'd have to build more logic to keep track of the various parts, which would be irksome :)
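
For what it's worth, here's a rough sketch of that chunk-file approach, reusing the same "/chunks" and "/uploads" folders from the demo (the index-based assembly loop is the only real addition):

<cfscript>

	// Save each incoming chunk as its own file, keyed by its zero-based index. Since
	// every piece is named by index, the order of arrival no longer matters for storage.
	fileWrite(
		expandPath( "/chunks/#form.name#.#form.chunk#" ),
		fileReadBinary( form.file )
	);

	// NOTE: This still assumes that the last chunk is the last one to arrive. A more
	// robust version would count the saved chunk files instead of trusting the index.
	if ( form.chunk == ( form.chunks - 1 ) ) {

		masterFile = fileOpen(
			expandPath( "/uploads/#createUUID()#.#listLast( form.name, '.' )#" ),
			"append"
		);

		// Stitch the pieces back together in index order.
		for ( i = 0 ; i < form.chunks ; i++ ) {

			fileWrite( masterFile, fileReadBinary( expandPath( "/chunks/#form.name#.#i#" ) ) );

		}

		fileClose( masterFile );

	}

</cfscript>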


@WebManWalking,

Ugg, good catch! These more complex demos get posted on GitHub. And, when I can, I try to keep them all as HTM files so they can be viewed under the special "gh-pages" branch. So, I've had to convert back and forth between HTM and CFM for a few of the Plupload demos. I try to keep them clean :) But, it's been a looong week.


Hi Ben,

Great example: it's awesome, thank you very much.

Now: how can I drop an entire folder (with sub-folders) and keep all of the information on the server?


I have Plupload implemented on my website and am facing an issue while uploading files larger than 200MB.

I did try the solution above, i.e. using chunks, but somehow it does not seem to be working.

The error I get is "Element CHUNKS is undefined in FORM."

Please let me know if there is any specific setting that needs to be configured.
Settings used:
$("#uploader").pluploadQueue({
	runtimes: 'html5,flash,silverlight,html4',
	url: addFileURL,
	chunk_size: '200kb',
	max_retries: 3,
	rename: true,
	dragdrop: true,
	container: 'iteattachments',
	multipart: true,
	filters: [ { title: "All file types", extensions: "*" } ],
	flash_swf_url: '/includes/javascript/plupload/Moxie.swf',
	silverlight_xap_url: '/includes/javascript/plupload/Moxie.xap'
});

Thanks!


Hello Ben, great post (not a formality, but for real!). Now, I was wondering how to set the upload directory dynamically, based on a variable such as a customerID.

I'm trying to figure out where this would be set up.

Thank you!


@Dani,

There are two places you can do it: on the client-side or on the server-side. On the client, Plupload allows you to hook into the "BeforeUpload" event. In this event, you can completely change the settings that Plupload will use for the upload that is *about to happen*. If you look at my other post, about chunking with S3:

http://www.bennadel.com/blog/2586-chunking-amazon-s3-file-uploads-with-plupload-and-coldfusion.htm

... you'll see that I can completely change the settings for each file. In the Amazon S3 context, I'm defining a resource that is based on the UUID that Plupload provides.

THAT said, I *wouldn't advocate* doing that for your situation. You should probably remove as much of the configuration from the client-side as possible. If your client-side code lets you define upload locations, it *may* open you up to a security vulnerability. Imagine if someone were to tamper with that value before the upload occurred. In such a scenario, a user might be able to upload a malicious file to a "known" location (because they altered your client-side code) and then invoke that code with an additional HTTP request.

The safer approach would be to upload *every* file to a safe, non-web-accessible location first (such as a temp directory). Then, on the server side, check the user's preferences / account / database record /etc., to see where the uploaded file should be *moved* to after the upload is complete.

This way, you can let the final storage be based on the particular user without opening yourself up to malicious behavior. Of course, this also requires that you have a server-side repository for user settings / preferences AND are able to identify the user in each request (such as through session cookies).
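
Something along these lines (the folder names and the session-based customer lookup are purely illustrative, not part of the demo):

<cfscript>

	// Every upload lands in the same non-web-accessible staging folder first, with a
	// name the client cannot influence.
	stagingPath = expandPath( "/staging/#createUUID()#.#listLast( form.name, '.' )#" );

	fileMove( form.file, stagingPath );

	// The final destination is derived from server-side state (session / database),
	// never from anything the client posted.
	customerDirectory = expandPath( "/storage/customers/#session.customerID#/" );

	if ( ! directoryExists( customerDirectory ) ) {

		directoryCreate( customerDirectory );

	}

	fileMove( stagingPath, ( customerDirectory & getFileFromPath( stagingPath ) ) );

</cfscript>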

I hope that helps a bit.


Does this work if 2 browsers simultaneously upload the same file? Will the 2 files get saved separately and correctly?


Hello, Ben
Great tutorial!
I'm using the Flash runtime in IE9 with chunked uploads and am facing a problem when I try to upload a big file (300MB, with a 5MB chunk size): Flash first loads the 300MB into memory, but then, with every chunk, memory continues to grow until it reaches 1GB. And after the file has been uploaded, it is never garbage collected until I manually call uploader.destroy(). Is there any solution for this?

