Ben Nadel
On User Experience (UX) Design, JavaScript, ColdFusion, Node.js, Life, and Love.

Using Both STORED And DEFLATED Compression Methods With ZipOutputStream In Lucee CFML 5.3.7.47

By Ben Nadel
Tags: ColdFusion

In yesterday's post about generating and incrementally streaming a Zip archive in Lucee CFML, I used the default compression method - DEFLATED - in the ZipOutputStream class. However, as I've discussed in the past, "deflating" images within a Zip archive can be a waste of CPU since most images are already compressed. As such, I wanted to quickly revisit the use of the ZipOutputStream, but this time archive the images using the STORED (i.e., uncompressed) method in Lucee CFML 5.3.7.47.

When using the DEFLATED method, all you have to do is create the ZipEntry instance and then write the binary content to it. When using the STORED method, on the other hand, it appears that you have to provide a bit more information. This wasn't well documented in the JavaDocs; but, based on trial-and-error, it seems as though we need to explicitly provide both the size and the CRC-32 (content checksum) when using the STORED method.
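To isolate that requirement from the rest of the demo, here is a minimal plain-Java sketch (not the CFML used in this post). The class name `StoredEntrySketch`, the helper names, and the `withMetadata` flag are all hypothetical and exist only for illustration. It writes a single STORED entry into an in-memory archive, reads it back, and then repeats the write without the size and CRC-32 to surface the error:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipException;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class, for illustration only.
final class StoredEntrySketch {

    // Builds a one-entry archive in memory using the STORED method.
    // When withMetadata is false, the size and CRC-32 are left unset.
    static byte[] zipStored(String name, byte[] content, boolean withMetadata) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            ZipEntry entry = new ZipEntry(name);
            entry.setMethod(ZipEntry.STORED);
            if (withMetadata) {
                CRC32 checksum = new CRC32();
                checksum.update(content);
                entry.setSize(content.length);
                entry.setCrc(checksum.getValue());
            }
            // For a STORED entry, this call fails if the metadata is missing.
            zip.putNextEntry(entry);
            zip.write(content);
            zip.closeEntry();
        }
        return buffer.toByteArray();
    }

    // Reads the content of the first entry back out of the archive bytes.
    static byte[] readFirstEntry(byte[] archive) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(archive))) {
            if (zip.getNextEntry() == null) {
                throw new ZipException("archive has no entries");
            }
            ByteArrayOutputStream content = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int read;
            while ((read = zip.read(chunk)) != -1) {
                content.write(chunk, 0, read);
            }
            return content.toByteArray();
        }
    }

    public static void main(String[] args) throws IOException {
        // Placeholder bytes standing in for real image data.
        byte[] content = "pretend this is JPEG data".getBytes(StandardCharsets.UTF_8);

        byte[] archive = zipStored("images/example.jpg", content, true);
        System.out.println("Round-trip OK: " + Arrays.equals(content, readFirstEntry(archive)));

        try {
            zipStored("images/example.jpg", content, false);
        } catch (ZipException error) {
            System.out.println("Without size and CRC-32: " + error.getMessage());
        }
    }
}
```

The sketch only omits both values at once; I haven't tried to tease apart what happens when just one of them is missing.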

To try this out for myself, I revamped yesterday's demo to download the images in parallel and then write them to the Zip file using either method - DEFLATED or STORED - based on a URL query-string parameter. I've also updated the demo to keep track of how long the compression takes so we can see if there is any performance difference.

NOTE: To keep things simple, I've removed the "incrementally streaming" portion of this demo. Now, I'm just creating the Zip archive in-memory and then serving up the binary variable using the CFContent tag.

<cfscript>

	// Zip compression method: STORED or DEFLATED.
	param name="url.method" type="string" default="DEFLATED";

	// To try out the different compression methods, I'm going to download a number of
	// images from the People section on my website and then add them, in turn, to the
	// ZIP output stream.
	imageUrls = [
		"https://bennadel-cdn.com/images/header/photos/irl_2019_old_school_staff.jpg",
		"https://bennadel-cdn.com/images/header/photos/james_murray_connor_murphy_drew_newberry_alvin_mutisya_nick_miller_jack_neil.jpg",
		"https://bennadel-cdn.com/images/header/photos/juan_agustin_moyano_2.jpg",
		"https://bennadel-cdn.com/images/header/photos/jeremiah_lee_2.jpg",
		"https://bennadel-cdn.com/images/header/photos/wissam_abirached.jpg",
		"https://bennadel-cdn.com/images/header/photos/winnie_tong.jpg",
		"https://bennadel-cdn.com/images/header/photos/sean_roberts.jpg",
		"https://bennadel-cdn.com/images/header/photos/scott_markovits.jpg",
		"https://bennadel-cdn.com/images/header/photos/sara_dunnack_3.jpg",
		"https://bennadel-cdn.com/images/header/photos/salvatore_dagostino.jpg",
		"https://bennadel-cdn.com/images/header/photos/robbie_manalo_jessica_thorp.jpg",
		"https://bennadel-cdn.com/images/header/photos/rich_armstrong.jpg"
	];

	// ------------------------------------------------------------------------------- //
	// ------------------------------------------------------------------------------- //

	ZipEntryClass = createObject( "java", "java.util.zip.ZipEntry" );

	withTempDirectory(
		( imagesDirectory ) => {

			// Download the images in parallel.
			// --
			// NOTE: How cool is it that the file-IO operations in Lucee CFML will
			// seamlessly work with remote URLs? So bad-ass!
			imageUrls.each(
				( imageUrl ) => {

					fileCopy( imageUrl, "#imagesDirectory#/#getFileFromPath( imageUrl )#" );

				},
				true // Parallel processing, kablamo!
			);

			// Let's keep track of how long the various Zip METHODS take (which will
			// only include the archiving, not the downloading portion, above).
			var startedAt = getTickCount();

			// We'll generate the Zip archive in-memory, rather than writing it to disk.
			var binaryOutputStream = javaNew( "java.io.ByteArrayOutputStream" ).init();
			var zipOutputStream = javaNew( "java.util.zip.ZipOutputStream" )
				.init( binaryOutputStream )
			;

			// Now that we've downloaded the images, let's add each one to the Zip.
			for ( var imageUrl in imageUrls ) {

				var imageFilename = getFileFromPath( imageUrl );
				var imageBinary = fileReadBinary( "#imagesDirectory#/#imageFilename#" );

				var zipEntry = javaNew( "java.util.zip.ZipEntry" )
					.init( "streaming-zip/images/#imageFilename#" )
				;

				// The default method is DEFLATED, which compresses the entry as it adds
				// it to the archive. For some files, this results in wasted CPU; and, in
				// some cases, can even result in larger files (not smaller files). If we
				// just want to include the file in the archive, uncompressed, we can
				// use the STORED method. This will include the file at its raw size.
				// --
				// NOTE: As of Java 8 (where I am running this demo), a STORED file needs
				// to also set the SIZE and CRC of the entry or we get an error.
				if ( url.method == "stored" ) {

					zipEntry.setMethod( ZipEntryClass.STORED );
					zipEntry.setSize( arrayLen( imageBinary ) );
					zipEntry.setCrc( crc32( imageBinary ) );

				}

				zipOutputStream.putNextEntry( zipEntry );
				zipOutputStream.write( imageBinary );
				zipOutputStream.closeEntry();

			}

			// Finalize the Zip content.
			zipOutputStream.close();
			binaryOutputStream.close();

			// NOTE: We're baking the DURATION right into the filename.
			var zipFilename = "people-#url.method#-#( getTickCount() - startedAt )#.zip";

			// Setup the response headers. By using the CFContent tag with [variable],
			// we'll implicitly reset the output buffers and use the given binary as the
			// response payload. CFContent will also terminate the rest of the request
			// processing (with the EXCEPTION of the FINALLY block in the method that
			// set up the temp directory).
			header
				name = "content-disposition"
				value = "attachment; filename=""#zipFilename#""; filename*=UTF-8''#urlEncodedFormat( zipFilename )#"
			;
			header
				name = "content-length"
				value = binaryOutputStream.size()
			;
			content
				type = "application/zip"
				variable = binaryOutputStream.toByteArray()
			;

		}
	); // END: withTempDirectory().

	// ------------------------------------------------------------------------------- //
	// ------------------------------------------------------------------------------- //

	/**
	* I compute the CRC-32 checksum for the byte array.
	*
	* @input I am the input being checked.
	*/
	public numeric function crc32( required binary input ) {

		var checksum = createObject( "java", "java.util.zip.CRC32" ).init();
		checksum.update( input );

		return( checksum.getValue() );
	}


	/**
	* I create a Java class instance with the given class name. This is just a short-hand
	* method for the createObject() call.
	* 
	* @className I am the Java class being created.
	*/
	public any function javaNew( required string className ) {

		return( createObject( "java", className ) );

	}


	/**
	* I create a temp directory for the images and then pass the directory path to the
	* given callback. Any return value from the callback is returned-through to the
	* calling context.
	* 
	* @callback I am the callback to be invoked with the temp directory path.
	*/
	public any function withTempDirectory( required function callback ) {

		var imagesDirectory = expandPath( "./images-#createUniqueId()#" );

		directoryCreate( imagesDirectory );

		try {

			return( callback( imagesDirectory ) );

		} finally {

			directoryDelete( imagesDirectory, true ); // True = recurse.

		}

	}

</cfscript>

As you can see, when I want to use the STORED method, I have to call .setMethod(), .setSize(), and .setCrc() on the ZipEntry - all three calls are required. Notice also that I am baking the method and the compression time (in milliseconds) into the archive filename. This way, we can more clearly see the difference between the two methods right from the generated archives.

And, if we run this ColdFusion template using both methods, we get the following files:

The MacOS info modals for both the STORED and DEFLATED zip archive files.

As you can see, the DEFLATED method produces a slightly smaller Zip archive file (by about 18KB) when compared to the STORED method. This is because - despite the images already being compressed - we were still able to squeeze some size out of them. That said, if we look at the filenames, we can see that the STORED method ran about 33% faster than the DEFLATED method. Of course, the times will vary widely on each run; but, the general trend is that the STORED method is faster than the DEFLATED method since it's doing less work.
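For anyone who wants to reproduce this kind of comparison outside of Lucee, here is a rough plain-Java sketch. The class name `ZipMethodComparison` and the entry name `payload.bin` are hypothetical, and the seeded random bytes are only a stand-in for already-compressed image data. Random bytes are essentially incompressible, so unlike the real JPEGs above, DEFLATED shouldn't be expected to save anything here. Timings from a single run like this are noisy and should be read as a trend at best:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Random;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class, for illustration only.
final class ZipMethodComparison {

    // Zips the payload as a single entry using the given method
    // (ZipEntry.STORED or ZipEntry.DEFLATED) and returns the archive bytes.
    static byte[] zip(byte[] payload, int method) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            ZipEntry entry = new ZipEntry("payload.bin");
            entry.setMethod(method);
            if (method == ZipEntry.STORED) {
                // STORED entries need the size and CRC-32 up front.
                CRC32 checksum = new CRC32();
                checksum.update(payload);
                entry.setSize(payload.length);
                entry.setCrc(checksum.getValue());
            }
            zip.putNextEntry(entry);
            zip.write(payload);
            zip.closeEntry();
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Assumption: seeded random bytes approximate already-compressed data.
        byte[] payload = new byte[1_000_000];
        new Random(42).nextBytes(payload);

        for (int method : new int[] { ZipEntry.STORED, ZipEntry.DEFLATED }) {
            long startedAt = System.nanoTime();
            byte[] archive = zip(payload, method);
            long elapsedMs = (System.nanoTime() - startedAt) / 1_000_000;
            String label = (method == ZipEntry.STORED) ? "STORED" : "DEFLATED";
            System.out.println(label + ": " + archive.length + " bytes in " + elapsedMs + " ms");
        }
    }
}
```

Swapping the random payload for the bytes of a real image file would give numbers closer to the ones in the filenames above.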

This was mostly a note-to-self since I couldn't figure out how to use the STORED method when I was putting yesterday's demo together; and, I wanted something that I could reference in the future. That said, I'm still a big fan of using the zip CLI (Command-Line Interface) in Lucee CFML since it's very fast; and will likely use the zip CLI as my primary means of zipping in the future.



Reader Comments

This is a great demo, but the thing that really hit me was a method you used, called:

getFileFromPath()

I must say, I have never seen this before. The number of times I have done something like:

ListLast(imageUrl,"/");
ListLast(systemPath,"\");

This is what I love about your blogs. Not only do I learn new techniques but I learn about new native methods :)


@Charles,

Oh man, getFileFromPath() and getDirectoryFromPath() are super helpful functions! I just wish they had a function for extracting the file-extension. Because, like your example, I still have to do that via listLast( filename, "." ).

On a side-note, I love list-functions in ColdFusion. Totally underrated.


It's weird. I knew about:

getDirectoryFromPath()

So, I am not sure why I didn't know about getFileFromPath()?

Yes. A second Boolean argument. If set to true, it would return a Struct containing:

{
	filename: 'file.zip',
	extension: 'zip'
}

Great idea!

Maybe I should put in a feature request!
