Ben Nadel

Chunking Amazon S3 File Uploads With Plupload And ColdFusion

By Ben Nadel on

Earlier this week, I took a look at using Plupload to chunk file uploads to your ColdFusion server. When I saw that working, I got excited; I wondered if I could use the same approach to chunk uploads directly to Amazon S3. Several evenings of R&D later, I have it working (mostly)! The principle is the same, but the implementation is necessarily more complex due to the communication with the Amazon S3 server.

View Plupload S3 Chunk Project on my GitHub account.

Last year, I demonstrated that you could POST files directly to Amazon S3 using a regular form POST. Then, I demonstrated that you could use Plupload to upload files directly to Amazon S3. But, both of those approaches required that the entire file be uploaded in one shot. Chunking requires us to break down a single file into multiple chunks (blobs), post them individually, and then rebuild the master file on the target server (in this case, Amazon S3).
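
To make the "break down, post, rebuild" idea concrete, here is a minimal sketch in plain Node.js. It is not part of the demo: the function names are made up for illustration, and a Buffer stands in for the file that Plupload would slice in the browser.

```javascript
// Split a buffer into fixed-size chunks; the last chunk may be smaller.
function splitIntoChunks( buffer, chunkSize ) {

	if ( chunkSize <= 0 ) {

		throw new RangeError( "chunkSize must be positive." );

	}

	var chunks = [];

	for ( var offset = 0 ; offset < buffer.length ; offset += chunkSize ) {

		// subarray() clamps the end index to the length of the buffer.
		chunks.push( buffer.subarray( offset, offset + chunkSize ) );

	}

	return( chunks );

}

// Rebuild the original bytes by concatenating the chunks in order.
function rebuildFromChunks( chunks ) {

	return( Buffer.concat( chunks ) );

}

var original = Buffer.from( "abcdefghij" );
var chunks = splitIntoChunks( original, 4 );
var rebuilt = rebuildFromChunks( chunks );

console.log( chunks.map( function( chunk ) { return( chunk.toString() ); } ) );
console.log( "Rebuilt matches original:", rebuilt.equals( original ) );
```

The order of the chunks is the only thing that makes the rebuild work, which is why each chunk needs a predictable, index-based name on the target server.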

Chunking files up to Amazon S3 has a few limitations. To start, every chunk except the last one has to be at least 5MB in size (for some reason). If you attempt to combine chunks smaller than 5MB, Amazon will reject the request. This means that our Plupload instance will have to conditionally apply chunking to each file, in turn, depending on its size.
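
As a rough illustration of that size rule, here is a small sketch in plain Node.js. The helper names are hypothetical and not part of the demo; it simply checks that every chunk except the last meets the 5MB minimum.

```javascript
var MB = ( 1024 * 1024 );
var MIN_PART_SIZE = ( 5 * MB );

// Return the chunk sizes (in bytes) that a file would be split into.
function getChunkSizes( fileSize, chunkSize ) {

	var sizes = [];

	for ( var remaining = fileSize ; remaining > 0 ; remaining -= chunkSize ) {

		sizes.push( Math.min( remaining, chunkSize ) );

	}

	return( sizes );

}

// Every chunk except the last one must meet the minimum size.
function canBeCombinedOnS3( sizes ) {

	var leadingSizes = sizes.slice( 0, -1 );

	return( leadingSizes.every( function( size ) {

		return( size >= MIN_PART_SIZE );

	}));

}

// A 12MB file in 5MB chunks: 5MB + 5MB + 2MB.
var largeFile = getChunkSizes( ( 12 * MB ), ( 5 * MB ) );

// A 3MB file in 1MB chunks: the leading chunks are too small.
var smallFile = getChunkSizes( ( 3 * MB ), ( 1 * MB ) );

console.log( largeFile, canBeCombinedOnS3( largeFile ) );
console.log( smallFile, canBeCombinedOnS3( smallFile ) );
```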

We also have to deal with the fact that Plupload can't initiate a Multipart Upload action on Amazon S3 (it would require exposing our secret key). As such, we have to treat each chunk like it's a "real file" and upload it according to our POST "Policy." This doesn't really complicate things all that much since Amazon provides a way to copy-merge files when configuring a multipart upload action. But, initiating, configuring, and then finalizing the multipart upload action requires a lot of grunt work that we have to perform on our ColdFusion server.
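
The reason the browser can upload under a POST "Policy" without ever seeing the secret key is that the server signs the policy ahead of time. Here is a minimal sketch of that signing step in Node.js rather than ColdFusion. The bucket name, secret key, and function name are placeholders; the Base64 policy and HMAC-SHA1 signature follow the same approach the demo uses.

```javascript
var crypto = require( "crypto" );

// Build the Base64-encoded policy and its HMAC-SHA1 signature. Only these two
// values are sent to the browser - the secret key stays on the server.
function signPolicy( policy, secretKey ) {

	var encodedPolicy = Buffer
		.from( JSON.stringify( policy ), "utf8" )
		.toString( "base64" )
	;

	var signature = crypto
		.createHmac( "sha1", secretKey )
		.update( encodedPolicy )
		.digest( "base64" )
	;

	return({
		policy: encodedPolicy,
		signature: signature
	});

}

// NOTE: The bucket and secret key below are placeholder values.
var signed = signPolicy(
	{
		expiration: "2030-01-01T00:00:00Z",
		conditions: [
			{ bucket: "example-bucket" },
			{ acl: "private" },
			{ success_action_status: "201" },
			[ "starts-with", "$key", "pluploads/" ]
		]
	},
	"EXAMPLE-SECRET-KEY"
);

console.log( signed.policy );
console.log( signed.signature );
```

Since every chunk is posted as its own "file," the same policy and signature pair gets reused for each chunk, which is why the policy has to allow a key prefix rather than one exact key.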

Anyway, here's the main page for the demo. The bulk of the logic is in the BeforeUpload handler where we have to configure Plupload on a per-file basis. Part of what makes Plupload so full of sexy awesome-sauce is the fact that you can configure it on a per-file basis.

  • <cfscript>
  •  
  • // Include the Amazon Web Service (AWS) S3 credentials.
  • include "aws-credentials.cfm";
  •  
  • // Include some utility methods to make the AWS interactions easier.
  • include "udf.cfm";
  •  
  • // The expiration must be defined in UTC time. Since the Plupload widget may be on the
  • // screen for a good amount of time, especially if this is a single-page app, we
  • // probably need to put the expiration date into the future a good amount.
  • expiration = dateConvert( "local2utc", dateAdd( "d", 1, now() ) );
  •  
  • // NOTE: When formatting the UTC time, the hours must be in 24-hour time; therefore,
  • // make sure to use "HH", not "hh" so that your policy doesn't expire prematurely.
  • // ---
  • // NOTE: We are providing a success_action_status INSTEAD of a success_action_redirect
  • // since we don't want the browser to try and redirect (won't be supported across all
  • // Plupload environments). Instead, we'll get Amazon S3 to return the XML document
  • // for the successful upload. Then, we can parse the response locally.
  • policy = {
  • "expiration" = (
  • dateFormat( expiration, "yyyy-mm-dd" ) & "T" &
  • timeFormat( expiration, "HH:mm:ss" ) & "Z"
  • ),
  • "conditions" = [
  • {
  • "bucket" = aws.bucket
  • },
  • {
  • "acl" = "private"
  • },
  • {
  • "success_action_status" = "2xx"
  • },
  • [ "starts-with", "$key", "pluploads/" ],
  • [ "starts-with", "$Content-Type", "image/" ],
  • [ "content-length-range", 0, 10485760 ], // 10mb
  •  
  • // The following keys are ones that Plupload will inject into the form-post
  • // across the various environments.
  • // --
  • // NOTE: If we do NOT chunk the file, we have to manually inject the "chunk"
  • // and "chunks" keys in order to conform to the policy.
  • [ "starts-with", "$Filename", "pluploads/" ],
  • [ "starts-with", "$name", "" ],
  • [ "starts-with", "$chunk", "" ],
  • [ "starts-with", "$chunks", "" ]
  • ]
  • };
  •  
  •  
  • // ------------------------------------------------------ //
  • // ------------------------------------------------------ //
  •  
  •  
  • // The policy will be posted along with the FORM post as a hidden form field.
  • // Serialize it as JavaScript Object notation.
  • serializedPolicy = serializeJson( policy );
  •  
  • // When the policy is being serialized, ColdFusion will try to turn "201" into the
  • // number 201. However, we NEED this value to be a STRING. As such, we'll give the
  • // policy a non-numeric value and then convert it to the appropriate 201 after
  • // serialization.
  • serializedPolicy = replace( serializedPolicy, "2xx", "201" );
  •  
  • // Remove the line breaks.
  • serializedPolicy = reReplace( serializedPolicy, "[\r\n]+", "", "all" );
  •  
  • // Encode the policy as Base64 so that it doesn't mess up the form post data at all.
  • encodedPolicy = binaryEncode(
  • charsetDecode( serializedPolicy, "utf-8" ) ,
  • "base64"
  • );
  •  
  •  
  • // ------------------------------------------------------ //
  • // ------------------------------------------------------ //
  •  
  •  
  • // To make sure that no one tampers with the FORM POST, create hashed message
  • // authentication code of the policy content.
  • encodedSignature = hmacSha1( encodedPolicy, aws.secretKey, "base64" );
  •  
  • </cfscript>
  •  
  • <!--- Reset the output buffer. --->
  • <cfcontent type="text/html; charset=utf-8" />
  •  
  • <!doctype html>
  • <html>
  • <head>
  • <meta charset="utf-8" />
  •  
  • <title>
  • Chunking Amazon S3 File Uploads With Plupload And ColdFusion
  • </title>
  •  
  • <link rel="stylesheet" type="text/css" href="./assets/css/styles.css"></link>
  • </head>
  • <body>
  •  
  • <h1>
  • Chunking Amazon S3 File Uploads With Plupload And ColdFusion
  • </h1>
  •  
  • <div id="uploader" class="uploader">
  •  
  • <a id="selectFiles" href="##">
  •  
  • <span class="label">
  • Select Files
  • </span>
  •  
  • <span class="standby">
  • Waiting for files...
  • </span>
  •  
  • <span class="progress">
  • Uploading - <span class="percent"></span>%
  • </span>
  •  
  • </a>
  •  
  • </div>
  •  
  • <div class="uploads">
  • <!-- To be populated with uploads via JavaScript. -->
  • </div>
  •  
  •  
  • <!-- Load and initialize scripts. -->
  • <script type="text/javascript" src="./assets/jquery/jquery-2.1.0.min.js"></script>
  • <script type="text/javascript" src="./assets/plupload/js/plupload.full.min.js"></script>
  • <script type="text/javascript">
  •  
  • (function( $, plupload ) {
  •  
  • // Find and cache the DOM elements we'll be using.
  • var dom = {
  • uploader: $( "#uploader" ),
  • percent: $( "#uploader span.percent" ),
  • uploads: $( "div.uploads" )
  • };
  •  
  •  
  • // Instantiate the Plupload uploader. When we do this, we have to pass in
  • // all of the data that the Amazon S3 policy is going to be expecting.
  • // Also, we have to pass in the policy :)
  • var uploader = new plupload.Uploader({
  •  
  • // Try to load the HTML5 engine and then, if that's not supported, the
  • // Flash fallback engine.
  • // --
  • // NOTE: For Flash to work, you will have to upload the crossdomain.xml
  • // file to the root of your Amazon S3 bucket. Furthermore, chunking is
  • // sort of available in Flash, but it's not that great.
  • runtimes: "html5,flash",
  •  
  • // The upload URL - our Amazon S3 bucket.
  • url: <cfoutput>"http://#aws.bucket#.s3.amazonaws.com/"</cfoutput>,
  •  
  • // The ID of the drop-zone element.
  • drop_element: "uploader",
  •  
  • // For the Flash engine, we have to define the ID of the node into which
  • // Plupload will inject the <OBJECT> tag for the flash movie.
  • container: "uploader",
  •  
  • // To enable click-to-select-files, you can provide a browse button. We
  • // can use the same one as the drop zone.
  • browse_button: "selectFiles",
  •  
  • // The URL for the SWF file for the Flash upload engine for browsers that
  • // don't support HTML5.
  • flash_swf_url: "./assets/plupload/js/Moxie.swf",
  •  
  • // Needed for the Flash environment to work.
  • urlstream_upload: true,
  •  
  • // NOTE: Unique names doesn't work with Amazon S3 and Plupload - see the
  • // BeforeUpload event to see how we can generate unique file names.
  • // --
  • // unique_names: true,
  •  
  • // The name of the form-field that will hold the upload data. Amazon S3
  • // will expect this form field to be called, "file".
  • file_data_name: "file",
  •  
  • // This defines the maximum size that each file chunk can be. However,
  • // since Amazon S3 cannot handle multipart uploads smaller than 5MB, we'll
  • // actually defer the setting of this value to the BeforeUpload at which
  • // point we'll have more information.
  • // --
  • // chunk_size: "5mb", // 5242880 bytes.
  •  
  • // If the upload of a chunk fails, this is the number of times the chunk
  • // should be re-uploaded before the upload (overall) is considered a
  • // failure.
  • max_retries: 3,
  •  
  • // Send any additional params (ie, multipart_params) in multipart message
  • // format.
  • multipart: true,
  •  
  • // Pass through all the values needed by the Policy and the authentication
  • // of the request.
  • // --
  • // NOTE: We are using the special value, ${filename} in our param
  • // definitions; but, we are actually overriding these in the BeforeUpload
  • // event. This notation is used when you do NOT know the name of the file
  • // that is about to be uploaded (and therefore cannot define it explicitly).
  • multipart_params: {
  • "acl": "private",
  • "success_action_status": "201",
  • "key": "pluploads/${filename}",
  • "Filename": "pluploads/${filename}",
  • "Content-Type": "image/*",
  • "AWSAccessKeyId" : <cfoutput>"#aws.accessID#"</cfoutput>,
  • "policy": <cfoutput>"#encodedPolicy#"</cfoutput>,
  • "signature": <cfoutput>"#encodedSignature#"</cfoutput>
  • }
  •  
  • });
  •  
  • // Set up the event handlers for the uploader.
  • uploader.bind( "Init", handlePluploadInit );
  • uploader.bind( "Error", handlePluploadError );
  • uploader.bind( "FilesAdded", handlePluploadFilesAdded );
  • uploader.bind( "QueueChanged", handlePluploadQueueChanged );
  • uploader.bind( "BeforeUpload", handlePluploadBeforeUpload );
  • uploader.bind( "UploadProgress", handlePluploadUploadProgress );
  • uploader.bind( "ChunkUploaded", handlePluploadChunkUploaded );
  • uploader.bind( "FileUploaded", handlePluploadFileUploaded );
  • uploader.bind( "StateChanged", handlePluploadStateChanged );
  •  
  • // Initialize the uploader (it is only after the initialization is complete that
  • // we will know which runtime loaded: html5 vs. Flash).
  • uploader.init();
  •  
  •  
  • // ------------------------------------------ //
  • // ------------------------------------------ //
  •  
  •  
  • // I handle the before upload event where the settings and the meta data can
  • // be edited right before the upload of a specific file, allowing for per-
  • // file settings. In this case, this allows us to determine if the given file
  • // needs to be (or can be) chunk-uploaded up to Amazon S3.
  • function handlePluploadBeforeUpload( uploader, file ) {
  •  
  • console.log( "File upload about to start.", file.name );
  •  
  • // Track the chunking status of the file (for the success handler). With
  • // Amazon S3, we can only chunk files if the leading chunks are at least
  • // 5MB in size.
  • file.isChunked = isFileSizeChunkableOnS3( file.size );
  •  
  • // Generate the "unique" key for the Amazon S3 bucket based on the
  • // non-colliding Plupload ID. If we need to chunk this file, we'll create
  • // an additional key below. Note that this is the file we want to create
  • // eventually, NOT the chunk keys.
  • file.s3Key = ( "pluploads/" + file.id + "/" + file.name );
  •  
  • // This file can be chunked on S3 - at least 5MB in size.
  • if ( file.isChunked ) {
  •  
  • // Since this file is going to be chunked, we'll need to update the
  • // chunk index every time a chunk is uploaded. We'll start it at zero
  • // and then increment it on each successful chunk upload.
  • file.chunkIndex = 0;
  •  
  • // Create the chunk-based S3 resource by appending the chunk index.
  • file.chunkKey = ( file.s3Key + "." + file.chunkIndex );
  •  
  • // Define the chunk size - this is what tells Plupload that the file
  • // should be chunked. In this case, we are using 5MB because anything
  • // smaller will be rejected by S3 later when we try to combine them.
  • // --
  • // NOTE: Once the Plupload settings are defined, we can't just use the
  • // specialized size values - we actually have to pass in the parsed
  • // value (which is just the byte-size of the chunk).
  • uploader.settings.chunk_size = plupload.parseSize( "5mb" );
  •  
  • // Since we're chunking the file, Plupload will take care of the
  • // chunking. As such, delete any artifacts from our non-chunked
  • // uploads (see ELSE statement).
  • delete( uploader.settings.multipart_params.chunks );
  • delete( uploader.settings.multipart_params.chunk );
  •  
  • // Update the Key and Filename so that Amazon S3 will store the
  • // CHUNK resource at the correct location.
  • uploader.settings.multipart_params.key = file.chunkKey;
  • uploader.settings.multipart_params.Filename = file.chunkKey;
  •  
  • // This file CANNOT be chunked on S3 - it's not large enough for S3's
  • // multi-upload resource constraints
  • } else {
  •  
  • // Remove the chunk size from the settings - this is what tells
  • // Plupload that this file should NOT be chunked (ie, that it should
  • // be uploaded as a single POST).
  • uploader.settings.chunk_size = 0;
  •  
  • // That said, in order to keep with the generated S3 policy, we still
  • // need to have the chunk "keys" in the POST. As such, we'll append
  • // them as additional multi-part parameters.
  • uploader.settings.multipart_params.chunks = 0;
  • uploader.settings.multipart_params.chunk = 0;
  •  
  • // Update the Key and Filename so that Amazon S3 will store the
  • // base resource at the correct location.
  • uploader.settings.multipart_params.key = file.s3Key;
  • uploader.settings.multipart_params.Filename = file.s3Key;
  •  
  • }
  •  
  • }
  •  
  •  
  • // I handle the successful upload of one of the chunks (of a larger file).
  • function handlePluploadChunkUploaded( uploader, file, info ) {
  •  
  • console.log( "Chunk uploaded.", info.offset, "of", info.total, "bytes." );
  •  
  • // As the chunks are uploaded, we need to change the target location of
  • // the next chunk on Amazon S3. As such, we'll pre-increment the chunk
  • // index and then update the storage keys.
  • file.chunkKey = ( file.s3Key + "." + ++file.chunkIndex );
  •  
  • // Update the Amazon S3 chunk keys. By changing them here, Plupload will
  • // automatically pick up the changes and apply them to the next chunk that
  • // it uploads.
  • uploader.settings.multipart_params.key = file.chunkKey;
  • uploader.settings.multipart_params.Filename = file.chunkKey;
  •  
  • }
  •  
  •  
  • // I handle any errors raised during uploads.
  • function handlePluploadError() {
  •  
  • console.warn( "Error during upload." );
  •  
  • }
  •  
  •  
  • // I handle the files-added event. This is different than the queue-
  • // changed event. At this point, we have an opportunity to reject files
  • // from the queue.
  • function handlePluploadFilesAdded( uploader, files ) {
  •  
  • console.log( "Files selected." );
  •  
  • // NOTE: The demo calls for images; however, I'm NOT regulating that in
  • // code - trying to keep things smaller.
  • // --
  • // Example: files.splice( 0, 1 ).
  •  
  • }
  •  
  •  
  • // I handle the successful upload of a whole file. Even if a file is chunked,
  • // this handler will be called with the same response provided to the last
  • // chunk success handler.
  • function handlePluploadFileUploaded( uploader, file, response ) {
  •  
  • console.log( "Entire file uploaded.", response );
  •  
  • var img = $( "<img>" )
  • .prependTo( dom.uploads )
  • ;
  •  
  • // If the file was chunked, target the CHUNKED success file in order to
  • // initiate the rebuilding of the master file on Amazon S3.
  • if ( file.isChunked ) {
  •  
  • img.prop( "src", "./success_chunked.cfm?baseKey=" + encodeURIComponent( file.s3Key ) + "&chunks=" + file.chunkIndex );
  •  
  • } else {
  •  
  • img.prop( "src", "./success_not_chunked.cfm?baseKey=" + encodeURIComponent( file.s3Key ) );
  •  
  • }
  •  
  • }
  •  
  •  
  • // I handle the init event. At this point, we will know which runtime has loaded,
  • // and whether or not drag-drop functionality is supported.
  • function handlePluploadInit( uploader, params ) {
  •  
  • console.log( "Initialization complete." );
  • console.info( "Drag-drop supported:", !! uploader.features.dragdrop );
  •  
  • }
  •  
  •  
  • // I handle the queue changed event.
  • function handlePluploadQueueChanged( uploader ) {
  •  
  • console.log( "Files added to queue." );
  •  
  • if ( uploader.files.length && isNotUploading() ){
  •  
  • uploader.start();
  •  
  • }
  •  
  • }
  •  
  •  
  • // I handle the change in state of the uploader.
  • function handlePluploadStateChanged( uploader ) {
  •  
  • if ( isUploading() ) {
  •  
  • dom.uploader.addClass( "uploading" );
  •  
  • } else {
  •  
  • dom.uploader.removeClass( "uploading" );
  •  
  • }
  •  
  • }
  •  
  •  
  • // I handle the upload progress event. This gives us the progress of the given
  • // file, NOT of the entire upload queue.
  • function handlePluploadUploadProgress( uploader, file ) {
  •  
  • console.info( "Upload progress:", file.percent );
  •  
  • dom.percent.text( file.percent );
  •  
  • }
  •  
  •  
  • // I determine if the given file size (in bytes) is large enough to allow
  • // for chunking on Amazon S3 (which requires each chunk but the last to be a
  • // minimum of 5MB in size).
  • function isFileSizeChunkableOnS3( fileSize ) {
  •  
  • var KB = 1024;
  • var MB = ( KB * 1024 );
  • var minSize = ( MB * 5 );
  •  
  • return( fileSize > minSize );
  •  
  • }
  •  
  •  
  • // I determine if the upload is currently inactive.
  • function isNotUploading() {
  •  
  • var currentState = uploader.state;
  •  
  • return( currentState === plupload.STOPPED );
  •  
  • }
  •  
  •  
  • // I determine if the uploader is currently uploading a file (or if it is inactive).
  • function isUploading() {
  •  
  • var currentState = uploader.state;
  •  
  • return( currentState === plupload.STARTED );
  •  
  • }
  •  
  • })( jQuery, plupload );
  •  
  • </script>
  •  
  • </body>
  • </html>

Depending on the size of the file (under or over 5MB), the file is either uploaded in its entirety or uploaded in chunks. If the file is uploaded in its entirety, there's nothing for us to do, so we just send the user to a simple "success" page. If the file is chunked, however, we have to communicate with Amazon S3 (using ColdFusion) in order to rebuild the master file from the chunks. That's the complicated success page:

  • <cfscript>
  •  
  • // Include the Amazon Web Service (AWS) S3 credentials.
  • include "aws-credentials.cfm";
  •  
  • // Include some utility methods to make the AWS interactions easier.
  • include "udf.cfm";
  •  
  •  
  • // ------------------------------------------------------ //
  • // ------------------------------------------------------ //
  •  
  •  
  • // When the chunks have all been uploaded to the Amazon S3 bucket, we need to know
  • // the base resource URL (ie, the parts before the .0, .1, .2, etc) and the number
  • // of chunks that were uploaded. All of the chunks are going to be merged together
  • // to re-create the master file on S3.
  • // --
  • // NOTE: This value will NOT start with a leading slash.
  • param name="url.baseKey" type="string";
  •  
  • // I am the number of chunks used to POST the file.
  • param name="url.chunks" type="numeric";
  •  
  • // Since the key may have characters that require url-encoding, we have to re-encode
  • // the key or our signature may not match.
  • urlEncodedKey = urlEncodedFormat( url.baseKey );
  •  
  • // When we rebuild the master file from the chunks, we'll have to make a number of
  • // requests to the file we want to create, PLUS some stuff. Let's create the base
  • // resource on which we can build.
  • baseResource = ( "/" & aws.bucket & "/" & urlEncodedKey );
  •  
  •  
  • // ------------------------------------------------------ //
  • // ------------------------------------------------------ //
  •  
  •  
  • // Now, we need to initiate the re-creation of the master file. This, unfortunately,
  • // is going to require a number of steps. We are going to leverage the "Multipart
  • // Upload" S3 feature which allows a single file to be uploaded across several
  • // different HTTP requests; however, since our "chunks" are already on S3, we're
  • // going to use the "Upload Part - Copy" action to consume the chunks that we already
  • // uploaded.
  •  
  • // First, we need to initiate the multipart upload and get our unique upload ID. To
  • // do this, we have to append "?uploads" to the base resource (ie, the resource we
  • // want to create).
  • resource = ( baseResource & "?uploads" );
  •  
  • // A timestamp is required for all authenticated requests.
  • currentTime = getHttpTimeString( now() );
  •  
  • signature = generateSignature(
  • secretKey = aws.secretKey,
  • method = "POST",
  • createdAt = currentTime,
  • resource = resource
  • );
  •  
  • // Send request to Amazon S3.
  • initiateUpload = new Http(
  • method = "post",
  • url = "https://s3.amazonaws.com#resource#"
  • );
  •  
  • initiateUpload.addParam(
  • type = "header",
  • name = "authorization",
  • value = "AWS #aws.accessID#:#signature#"
  • );
  •  
  • initiateUpload.addParam(
  • type = "header",
  • name = "date",
  • value = currentTime
  • );
  •  
  • // The response comes back as XML.
  • response = xmlParse( initiateUpload.send().getPrefix().fileContent );
  •  
  • // ... we need to extract the "uploadId" value.
  • uploadID = xmlSearch( response, "string( //*[ local-name() = 'UploadId' ] )" );
  •  
  •  
  • // ------------------------------------------------------ //
  • // ------------------------------------------------------ //
  •  
  •  
  • // Now that we have our upload ID, we can begin to build up the master file, one
  • // chunk at a time. When we do this, we have to make sure to keep track of the ETag
  • // returned from Amazon - we'll need it to finalize the upload in the next step.
  • etags = [];
  •  
  • // As we are looping over the chunks, note that the Plupload / JavaScript chunks are
  • // ZERO-based; but, the Amazon parts are ONE-based.
  • // --
  • // NOTE:
  • for ( i = 0 ; i < url.chunks ; i++ ) {
  •  
  • // Indicate which chunk we're dealing with.
  • resource = ( baseResource & "?partNumber=#( i + 1 )#&uploadId=#uploadID#" );
  •  
  • // A timestamp is required for all authenticated requests.
  • currentTime = getHttpTimeString( now() );
  •  
  • // Notice that we're using the "x-amz-copy-source" header. This tells Amazon S3
  • // that the incoming chunk ALREADY resides on S3. And, in fact, we're providing
  • // the key to the Plupload chunk.
  • signature = generateSignature(
  • secretKey = aws.secretKey,
  • method = "PUT",
  • createdAt = currentTime,
  • amazonHeaders = [
  • "x-amz-copy-source:#baseResource#.#i#"
  • ],
  • resource = resource
  • );
  •  
  • // Send request to Amazon S3.
  • initiateUpload = new Http(
  • method = "put",
  • url = "https://s3.amazonaws.com#resource#"
  • );
  •  
  • initiateUpload.addParam(
  • type = "header",
  • name = "authorization",
  • value = "AWS #aws.accessID#:#signature#"
  • );
  •  
  • initiateUpload.addParam(
  • type = "header",
  • name = "date",
  • value = currentTime
  • );
  •  
  • initiateUpload.addParam(
  • type = "header",
  • name = "x-amz-copy-source",
  • value = "#baseResource#.#i#"
  • );
  •  
  • // The response comes back as XML.
  • response = xmlParse( initiateUpload.send().getPrefix().fileContent );
  •  
  • // ... we need to extract the ETag value.
  • etag = xmlSearch( response, "string( //*[ local-name() = 'ETag' ] )" );
  •  
  • arrayAppend( etags, etag );
  •  
  • }
  •  
  •  
  • // ------------------------------------------------------ //
  • // ------------------------------------------------------ //
  •  
  •  
  • // Now that we have told Amazon S3 about the location of the chunks that we uploaded,
  • // we can finalize the multi-part upload. When doing this, Amazon S3 will concatenate
  • // all of the chunks back into the master file.
  •  
  • // NOTE: Parts are one-based (not zero-based).
  • xml = [ "<CompleteMultipartUpload>" ];
  •  
  • for ( i = 0 ; i < url.chunks ; i++ ) {
  •  
  • arrayAppend(
  • xml,
  • "<Part>" &
  • "<PartNumber>#( i + 1 )#</PartNumber>" &
  • "<ETag>#etags[ i + 1 ]#</ETag>" &
  • "</Part>"
  • );
  •  
  • }
  •  
  • arrayAppend( xml, "</CompleteMultipartUpload>" );
  •  
  • body = arrayToList( xml, chr( 10 ) );
  •  
  • // Define the resource that Amazon S3 should construct with the given upload ID.
  • resource = ( baseResource & "?uploadId=#uploadID#" );
  •  
  • // A timestamp is required for all authenticated requests.
  • currentTime = getHttpTimeString( now() );
  •  
  • signature = generateSignature(
  • secretKey = aws.secretKey,
  • method = "POST",
  • createdAt = currentTime,
  • resource = resource
  • );
  •  
  • // Send request to Amazon S3.
  • finalizeUpload = new Http(
  • method = "post",
  • url = "https://s3.amazonaws.com#resource#"
  • );
  •  
  • finalizeUpload.addParam(
  • type = "header",
  • name = "authorization",
  • value = "AWS #aws.accessID#:#signature#"
  • );
  •  
  • finalizeUpload.addParam(
  • type = "header",
  • name = "date",
  • value = currentTime
  • );
  •  
  • finalizeUpload.addParam(
  • type = "header",
  • name = "content-length",
  • value = len( body )
  • );
  •  
  • finalizeUpload.addParam(
  • type = "body",
  • value = body
  • );
  •  
  • response = finalizeUpload.send().getPrefix();
  •  
  • // Make sure that if the build failed, then we don't delete the chunk files. IE,
  • // make sure we don't get to the next step.
  • if ( ! reFind( "^2\d\d", response.statusCode ) ) {
  •  
  • throw( type = "BuildFailed" );
  •  
  • }
  •  
  •  
  • // ------------------------------------------------------ //
  • // ------------------------------------------------------ //
  •  
  •  
  • // At this point, we have collected the chunk files and merged them back into a
  • // master file on Amazon S3. Now, we can safely delete the chunk files. For this,
  • // I'm using Amazon S3's multi-object delete.
  • // --
  • // NOTE: I CANNOT ACTUALLY GET THIS TO WORK. THE RESPONSE SEEMS TO INDICATE THAT
  • // THE OBJECTS WERE DELETED; HOWEVER, WHEN I BROWSE MY BUCKET, THE OBJECTS STILL
  • // SEEM TO EXIST AND CAN BE ACCESSED (BY ME). NO MATTER WHAT COMBINATION OF <Key>
  • // VALUES I TRIED, NOTHING SEEMED TO WORK. AHHHHHHHH!!!!!!!
  • // --
  • xml = [ "<Delete>" ];
  •  
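  • // Add each object key to the delete request. S3 object keys are relative to the
  • // bucket, so they carry neither the bucket name nor a leading slash.
  • for ( i = 0 ; i < url.chunks ; i++ ) {
  •  
  • arrayAppend(
  • xml,
  • "<Object>" &
  • "<Key>#xmlFormat( url.baseKey )#.#i#</Key>" &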
  • "</Object>"
  • );
  •  
  • }
  •  
  • arrayAppend( xml, "</Delete>" );
  •  
  • body = arrayToList( xml, chr( 10 ) );
  •  
  • // Generate the MD5 hash to ensure the integrity of the request.
  • md5Hash = generateMd5Hash( body );
  •  
  • // A timestamp is required for all authenticated requests.
  • currentTime = getHttpTimeString( now() );
  •  
  • // For multi-object delete, the resource simply points to our bucket.
  • resource = ( "/" & aws.bucket & "/?delete" );
  •  
  • signature = generateSignature(
  • secretKey = aws.secretKey,
  • method = "POST",
  • md5Hash = md5Hash,
  • contentType = "application/xml",
  • createdAt = currentTime,
  • resource = resource
  • );
  •  
  • // Send request to Amazon S3.
  • deleteChunks = new Http(
  • method = "post",
  • url = "https://s3.amazonaws.com#resource#"
  • );
  •  
  • deleteChunks.addParam(
  • type = "header",
  • name = "authorization",
  • value = "AWS #aws.accessID#:#signature#"
  • );
  •  
  • deleteChunks.addParam(
  • type = "header",
  • name = "date",
  • value = currentTime
  • );
  •  
  • deleteChunks.addParam(
  • type = "header",
  • name = "content-type",
  • value = "application/xml"
  • );
  •  
  • deleteChunks.addParam(
  • type = "header",
  • name = "content-length",
  • value = len( body )
  • );
  •  
  • deleteChunks.addParam(
  • type = "header",
  • name = "content-md5",
  • value = md5Hash
  • );
  •  
  • deleteChunks.addParam(
  • type = "header",
  • name = "accept",
  • value = "*/*"
  • );
  •  
  • deleteChunks.addParam(
  • type = "body",
  • value = body
  • );
  •  
  • response = deleteChunks.send().getPrefix();
  •  
  •  
  • // ------------------------------------------------------ //
  • // ------------------------------------------------------ //
  •  
  •  
  • // Now that we've re-created the master file and cleaned up (supposedly) after
  • // ourselves, we can forward the user to a pre-signed URL of the file on S3.
  •  
  • // The URL will only be valid for a short amount of time.
  • nowInSeconds = fix( getTickCount() / 1000 );
  •  
  • // Add 20 seconds.
  • expirationInSeconds = ( nowInSeconds + 20 );
  •  
  • // Sign the request.
  • signature = generateSignature(
  • secretKey = aws.secretKey,
  • method = "GET",
  • expiresAt = expirationInSeconds,
  • resource = baseResource
  • );
  •  
  • // Prepare the signature for use in a URL (to make sure none of the characters get
  • // transported improperly).
  • urlEncodedSignature = urlEncodedFormat( signature );
  •  
  •  
  • // ------------------------------------------------------ //
  • // ------------------------------------------------------ //
  •  
  •  
  • // Direct to the pre-signed URL.
  • location(
  • url = "https://s3.amazonaws.com#baseResource#?AWSAccessKeyId=#aws.accessID#&Expires=#expirationInSeconds#&Signature=#urlEncodedSignature#",
  • addToken = false
  • );
  •  
  • </cfscript>
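
The signing helper used throughout that page lives in udf.cfm, which isn't shown here. As a rough idea of what such a helper has to produce, here is a sketch in Node.js of the "string to sign" for S3's older REST authentication (Signature Version 2): the method, MD5, content type, date (or expiration), any x-amz headers, and the resource, joined by newlines and then run through HMAC-SHA1. The function names and option names are made up for this sketch, and it assumes the sub-resources in the resource string are already in sorted order.

```javascript
var crypto = require( "crypto" );

// Assemble the newline-delimited string that gets signed.
function buildStringToSign( options ) {

	var amazonHeaders = ( options.amazonHeaders || [] )
		.map( function( header ) {

			// Normalize "Name: value" into "name:value".
			var index = header.indexOf( ":" );
			var name = header.slice( 0, index ).trim().toLowerCase();
			var value = header.slice( index + 1 ).trim();

			return( name + ":" + value );

		})
		.sort()
	;

	var parts = [
		options.method.toUpperCase(),
		( options.md5Hash || "" ),
		( options.contentType || "" ),
		// Pre-signed URLs use the expiration (in seconds) in place of the date.
		String( options.expiresAt || options.createdAt || "" )
	];

	return( parts.concat( amazonHeaders, [ options.resource ] ).join( "\n" ) );

}

// Sign the string with HMAC-SHA1 and encode the result as Base64.
function signRequest( secretKey, options ) {

	return(
		crypto
			.createHmac( "sha1", secretKey )
			.update( buildStringToSign( options ), "utf8" )
			.digest( "base64" )
	);

}

// NOTE: The bucket, key, upload ID, and secret key are placeholder values.
var copyPartRequest = {
	method: "put",
	createdAt: "Tue, 27 Mar 2007 21:15:45 +0000",
	amazonHeaders: [
		"X-Amz-Copy-Source: /example-bucket/pluploads/abc/photo.jpg.0"
	],
	resource: "/example-bucket/pluploads/abc/photo.jpg?partNumber=1&uploadId=EXAMPLE"
};

var stringToSign = buildStringToSign( copyPartRequest );
var signature = signRequest( "EXAMPLE-SECRET-KEY", copyPartRequest );

console.log( stringToSign );
console.log( signature );
```

The date in the signed string has to match the date header sent with the request, which is why the page regenerates the timestamp before each call.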

I won't explain the code much, since I don't have a lot of time. But, I will say that I couldn't actually get the chunk files to be removed with the multi-object delete action. I am probably not providing the right keys (though no combination seemed to work). As such, the master file was rebuilt alongside the chunk files.
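
For anyone who wants to poke at the delete step, here is a hedged sketch (Node.js, untested against this demo) of building the multi-object delete body and its Content-MD5 value. Two things may be worth checking: S3 object keys are relative to the bucket, with no bucket name and no leading slash; and S3 reports a key that doesn't exist as deleted, so a "successful" response doesn't prove that anything was removed. The function names and the example key are made up for illustration.

```javascript
var crypto = require( "crypto" );

// Escape the characters that would break the XML payload.
function escapeXml( value ) {

	return(
		value
			.replace( /&/g, "&amp;" )
			.replace( /</g, "&lt;" )
			.replace( />/g, "&gt;" )
	);

}

// Build the delete payload for chunks ".0" through ".(chunkCount - 1)".
function buildDeleteRequest( baseKey, chunkCount ) {

	var xml = [ "<Delete>" ];

	for ( var i = 0 ; i < chunkCount ; i++ ) {

		// The key is relative to the bucket (ex, "pluploads/abc/photo.jpg.0").
		xml.push( "<Object><Key>" + escapeXml( baseKey + "." + i ) + "</Key></Object>" );

	}

	xml.push( "</Delete>" );

	var body = xml.join( "\n" );

	return({
		body: body,
		// The Content-MD5 header is the Base64-encoded MD5 digest of the body.
		contentMd5: crypto.createHash( "md5" ).update( body, "utf8" ).digest( "base64" )
	});

}

var deleteRequest = buildDeleteRequest( "pluploads/abc/photo.jpg", 2 );

console.log( deleteRequest.body );
console.log( deleteRequest.contentMd5 );
```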

Anyway, just a quick (or rather week-long) experiment with using Plupload to chunk file uploads directly to Amazon S3. I know I've said this before, but the more I learn about Plupload, the more in love I fall.




Reader Comments

This is really cool, Ben. Great work, and thanks for sharing it.

Personally I've been searching for a way to allow authenticated users to upload chunked audio and video files directly to S3, so this post is very valuable to me.

Also, I too am interested in how to mass-delete the chunks following the re-stitching of the chunks back into a master file. Seems that you'd have to delete each chunk individually after the master file has been created, and this may mean many subsequent requests to delete each chunk. I suppose you could delete the chunks one at a time during the re-stitching process.

For you and/or your readers, here's a tool that I've found useful for uploading from a ColdFusion script to S3. It might provide some ideas to make further enhancements:

Amazon S3 REST Wrapper
Author: Joe Danziger
http://amazons3.riaforge.org/


@Ben:

I didn't see the file "aws-credentials.cfm" among the code that you posted. Would you mind posting that file with at least some placeholder values?


@Jason,

Sorry, the aws-credentials is in the .gitignore (for obvious reasons). It just contains a struct with your AWS info:

  • aws = {
  • "bucket" = "your.bucket.name",
  • "accessID" = "XXXXX",
  • "secretKey" = "XXXXX"
  • };

Hope that helps.

As far as the deleting, I tried to get the bulk delete to work - to delete ALL the parts after the stitching; but, something is funky about the bulk-delete. It's like it doesn't actually delete the files - it just pushes a "null" pointer onto the versioning system. It's a part of S3 that I don't fully understand yet.


Hi Ben, how do you do?

I can't speak English well, but I need your help.

I need to add a caption field along with the images. I managed to add the field, but when the form is submitted, the caption field isn't sent.

How can I make the field go along on submit?

Thank you.

