Using Both STORE And DEFLATE Compression Methods With The zip CLI In Lucee CFML 5.3.6.61

By Ben Nadel on
Tags: ColdFusion

A couple of months ago, I looked at using the zip CLI with the STORE or DEFLATE compression methods in Lucee CFML. The DEFLATE compression method attempts to shrink file sizes as it adds the files to an archive, whereas the STORE method just adds the files to the archive without attempting to compress them in any way. This morning, I wanted to take a quick look at how we can apply both the STORE and DEFLATE methods in the same zip command execution in Lucee CFML 5.3.6.61.

The reason I'm looking into this is because - at least in theory - compressing a file takes CPU time. And, if some files, like Images, are already in a compressed file-format, it might not be worth the CPU cost to try and compress those image files while adding them to a zip archive file.

To accomplish this mixed compression in a single zip call, I'm going to use the -n / --suffixes CLI argument. This argument takes a colon-delimited list of file-extensions: files whose names end with one of the listed suffixes are added via the STORE method; and, all other files are added via the DEFLATE method.

CAUTION: The --suffixes argument is case-sensitive. As such, a suffix of .png will not match against the input file, Image.PNG.
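
One way to soften the case-sensitivity problem is to pass both the lower-case and the upper-case form of each extension. The following is only a minimal sketch: build_suffix_list is a hypothetical helper name (it is not part of zip), and mixed-case names such as Image.Png would still not match.

```shell
#!/bin/sh

# Hypothetical helper: turn bare extensions (gif png) into a colon-delimited suffix
# list that contains both the lower-case and upper-case form of each extension.
build_suffix_list() {
	list=""
	for extension in "$@"; do
		lower=$(printf '%s' "$extension" | tr '[:upper:]' '[:lower:]')
		upper=$(printf '%s' "$extension" | tr '[:lower:]' '[:upper:]')
		# Prefix with ":" only when the list already has entries.
		list="${list:+$list:}.$lower:.$upper"
	done
	printf '%s\n' "$list"
}

suffixes=$(build_suffix_list gif jpeg jpg png)
echo "$suffixes"

# Usage sketch (not executed here; the archive path is a placeholder):
# zip --recurse-paths --suffixes "$suffixes" ./archive.zip ./
```

The resulting string can be handed to --suffixes as-is, since the helper produces the same colon-delimited format described above.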

To test the outcome of this argument, I'm going to compress a directory that contains both image files and HTML files. The image files don't benefit as much from compression; at least when compared to the HTML files, which can be heavily compressed.

In the following test, I'm going to create three files:

  • One using the STORE method (ie, no compression).
  • One using the DEFLATE method (ie, compress everything).
  • One using both STORE and DEFLATE (ie, mixed compression).
<cfscript>

	// The data directory has a mixture of Images (which are already persisted using a
	// compressed file-format) and large HTML files (which can be compressed).
	dataDirectory = expandPath( "./data" );

	// ------------------------------------------------------------------------------- //
	// ------------------------------------------------------------------------------- //	

	// First, let's test the performance and outcome of the zip CLI when we use NO
	// COMPRESSION at all. This will store the files in an archive, but will not attempt
	// to save any file-size.
	timer
		label = "No Compression (-0)"
		type = "outline"
		{

		archiveFilePath = expandPath( "./output/no-compression.zip" );

		zipOutput = executeZipFromDirectory(
			dataDirectory,
			[
				// Don't use any compression. This will be the fastest approach, but will
				// not result in any file-size advantage.
				"-0",
				// Recurse the input directory.
				"--recurse-paths",
				// Define the OUTPUT file path for the generated zip.
				archiveFilePath,
				// Define the INPUT file - NOTE that this path is RELATIVE TO THE WORKING
				// DIRECTORY! By using a relative directory, it allows us to generate a
				// ZIP file in which the relative paths become the entries in the
				// resultant archive.
				"./"
			]
		);

		echo( "File size: " & getFileSize( archiveFilePath ) );
		echo( "<pre>" & zipOutput & "</pre>" );

	}

	// Next, let's test the default behavior of the zip CLI. This uses a compression
	// setting of -6, which will attempt to compress all files.
	timer
		label = "Default Compression (-6)"
		type = "outline"
		{

		archiveFilePath = expandPath( "./output/default-compression.zip" );

		zipOutput = executeZipFromDirectory(
			dataDirectory,
			[
				// Recurse the input directory.
				"--recurse-paths",
				// Define the OUTPUT file path for the generated zip.
				archiveFilePath,
				// Define the INPUT file - NOTE that this path is RELATIVE TO THE WORKING
				// DIRECTORY! By using a relative directory, it allows us to generate a
				// ZIP file in which the relative paths become the entries in the
				// resultant archive.
				"./"
			]
		);

		echo( "File size: " & getFileSize( archiveFilePath ) );
		echo( "<pre>" & zipOutput & "</pre>" );

	}

	// And, finally, let's test the performance and outcome of the zip CLI when we use
	// the default compression, but tell the CLI to store any IMAGE FILES WITHOUT
	// COMPRESSION. This will include images in the archive, but will not attempt to
	// improve upon the already-compressed file-formats.
	timer
		label = "Mixed Compression (-6 + suffixes)"
		type = "outline"
		{

		archiveFilePath = expandPath( "./output/mixed-compression.zip" );

		// We are going to tell the zip CLI to skip compression for files with the given
		// set of file-extensions. This uses a colon-delimited list of extensions.
		// --
		// CAUTION: Unfortunately, these suffix values are CASE-SENSITIVE.
		suffixes = [ ".gif", ".jpeg", ".jpg", ".png" ].toList( ":" );

		zipOutput = executeZipFromDirectory(
			dataDirectory,
			[
				// Recurse the input directory.
				"--recurse-paths",
				// Define which files will be archived using the STORE method (no
				// compression) instead of DEFLATE.
				"--suffixes #suffixes#",
				// Define the OUTPUT file path for the generated zip.
				archiveFilePath,
				// Define the INPUT file - NOTE that this path is RELATIVE TO THE WORKING
				// DIRECTORY! By using a relative directory, it allows us to generate a
				// ZIP file in which the relative paths become the entries in the
				// resultant archive.
				"./"
			]
		);

		echo( "File size: " & getFileSize( archiveFilePath ) );
		echo( "<pre>" & zipOutput & "</pre>" );

	}

	// ------------------------------------------------------------------------------- //
	// ------------------------------------------------------------------------------- //
	
	/**
	* I execute the zip command-line utility from the given WORKING DIRECTORY using the
	* given arguments. If error-output is returned from the utility, an error with the
	* details is thrown.
	* 
	* @workingDirectory I am the working directory from which to execute the zip command.
	* @zipArguments I am the command-line arguments for zip.
	*/
	public string function executeZipFromDirectory(
		required string workingDirectory,
		required array zipArguments
		) {

		// The Shell Script that's going to proxy the ZIP command is expecting the
		// working directory to be the first argument. As such, let's create a normalized
		// set of arguments for our proxy that contains the working directory first,
		// followed by the rest of the commands.
		var normalizedArguments = [ workingDirectory ]
			.append( "zip" )
			.append( zipArguments, true )
		;

		execute
			name = expandPath( "./execute_from_directory.sh" )
			arguments = normalizedArguments.toList( " " )
			variable = "local.successOutput"
			errorVariable = "local.errorOutput"
			timeout = 30
			terminateOnTimeout = true
		;

		if ( len( errorOutput ?: "" ) ) {

			throw(
				type = "ZipFromDirectoryError",
				message = "The zip command-line proxy returned error output.",
				detail = "Error: #errorOutput#",
				extendedInfo = "Working directory: #workingDirectory#, Command-line arguments: #serializeJson( zipArguments )#"
			);

		}

		return( successOutput ?: "" );

	}


	/**
	* I return a string representing the byte-size of the given file.
	* 
	* @filepath I am the file to inspect.
	*/
	public string function getFileSize( required string filepath ) {

		return( numberFormat( fileInfo( filepath ).size ) );

	}

</cfscript>

As you can see, we're using the three different approaches; and, for each approach, we're outputting the file-size of the resultant archive, the time it took to generate it, and any output returned by the zip CLI. And, when we run the above ColdFusion code, we get the following output:

ZIP archives generated using different compression methods in Lucee CFML.

As you can see, when using the default compression method (DEFLATE) with the --suffixes argument, we can apply compression to the HTML files and skip compression for the image files. This results in a slightly larger zip archive; but, may reduce load on the CPU.

NOTE: In this screen-shot, the mixed-compression was faster; but, that was not always the case. Sometimes, when I ran this ColdFusion code, the default compression was actually faster. But, I have to keep in mind that this is not a production environment that's serving a hundred-plus concurrent requests - it's a development environment without load. As such, it's not exactly clear how this will perform in a production environment. I will just assume that reducing CPU load is going to be a benefit more often than not.

It's also interesting to note that PNG files seem to actually benefit from some decent compression. Though, I assume that depends on the content of the PNG. My PNGs tend to include a lot of repeated colors, which I assume is exactly what compression likes to see.
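
To see what zip actually did with each file, the unzip CLI can list an archive's entries along with their compression method and ratio. The sketch below is an assumption-laden illustration rather than part of my test: it assumes both zip and unzip are installed, the list_entry_methods name is hypothetical, and the files are plain-text stand-ins (not real images) so that only the file suffix differs between them.

```shell
#!/bin/sh

# Hypothetical check: build a tiny mixed-compression archive from stand-in files and
# list the compression method used for each entry.
list_entry_methods() {
	# Both CLIs are assumed to be installed; skip quietly if they are not.
	if ! command -v zip >/dev/null 2>&1 || ! command -v unzip >/dev/null 2>&1; then
		echo "zip and/or unzip not found; skipping."
		return 0
	fi

	work_directory=$(mktemp -d) || return 1
	mkdir "$work_directory/data" "$work_directory/output"

	# Stand-in files (NOT real images or pages): the same repetitive text under
	# three different names, so only the suffix differs.
	for name in page.html lower.png UPPER.PNG; do
		awk 'BEGIN { for ( i = 0 ; i < 2000 ; i++ ) print "repeated content" }' \
			> "$work_directory/data/$name"
	done

	# -r and -n are the short forms of --recurse-paths and --suffixes.
	(
		cd "$work_directory/data" || exit 1
		zip -q -r -n .png "$work_directory/output/mixed.zip" ./
	)

	# The "Method" column reads "Stored" or "Defl:N"; "Cmpr" is the per-entry ratio.
	unzip -v "$work_directory/output/mixed.zip"

	rm -rf "$work_directory"
}

list_entry_methods
```

Running the same listing against the default-compression archive is one way to check how much a particular PNG really shrinks before deciding whether its suffix belongs in the STORE list.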

Anyway, this was just a fun exploration of the zip CLI tool in Lucee CFML.

Epilogue on execute_from_directory.sh

As you may have noticed in my code, I'm using the CFExecute tag to invoke the zip CLI. However, I'm not doing it directly. Instead, I'm proxying the zip CLI through a user-defined script, execute_from_directory.sh. I have to do this because, at this time, you cannot run the CFExecute tag from a working directory. As such, I use this script to proxy other commands from a working directory:

#!/bin/sh

# In the current script invocation, the first argument needs to be the WORKING DIRECTORY
# from whence the rest of the script will be executed.
working_directory=$1

# Now that we have the working directory argument saved, SHIFT IT OFF the arguments list.
# This will leave us with a "$@" array that contains the REST of the arguments.
shift

# Move to the target working directory. If that fails, stop here so that the command
# doesn't run from the wrong directory.
cd "$working_directory" || exit 1

# Execute the REST of the command from within the new working directory.
# --
# NOTE: The "$@" is a special shell parameter that expands to the (remaining)
# arguments used to invoke the current script, with each argument kept intact.
"$@"

I'm looking forward to an upcoming release of Lucee CFML where the CFExecute tag has been updated to include a working-directory concept. It's coming soon, I believe!
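
For code that is already running inside a shell, the same proxy idea can be sketched as a function instead of a separate file. This is only an illustration: run_in_directory is a hypothetical name, and the sketch uses a subshell so that the caller's own working directory is never changed.

```shell
#!/bin/sh

# Hypothetical function-based variant of the proxy: run a command from within the
# given working directory.
run_in_directory() {
	working_directory=$1
	shift

	# The parentheses create a subshell, so the "cd" only affects the command
	# being proxied. If the directory cannot be entered, nothing is executed.
	(
		cd "$working_directory" || exit 1
		"$@"
	)
}

# Usage sketch: print the working directory as seen by the proxied command.
run_in_directory / pwd
```

The function returns the exit status of the proxied command, or a non-zero status if the directory change fails.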


