Ben Nadel
On User Experience (UX) Design, JavaScript, ColdFusion, Node.js, Life, and Love.
Ben Nadel at NCDevCon 2016 (Raleigh, NC) with: Carl Von Stetten

Testing Performance Of Various Byte Array / Binary Value Generation Methods In ColdFusion

By Ben Nadel on
Tags: ColdFusion

The other day, I experimented with Java's ByteBuffer class as a means to generate a byte array / binary value in ColdFusion. In the past, I've used a whole host of techniques for creating byte arrays; but, I was terribly intrigued by the fact that ByteBuffer seemed so "close to the metal". Meaning, it appeared to be nothing more than a thin layer on top of the actual binary memory allocation. This got me thinking about performance and I wanted to see how the use of ByteBuffer compared to other techniques that I've used.

First, I wanted to confirm that all the techniques being tested actually generated the same binary value / byte array in ColdFusion. So, I generated small test values and then compared their HEX-encoded output:

<cfscript>

	a = javaCast( "byte[]", [ 32, 32, 32, 32, 32 ] );

	b = charsetDecode( repeatString( chr( 32 ), 5 ), "utf8" );

	c = createObject( "java", "java.nio.ByteBuffer" )
		.allocate( javaCast( "int", 5 ) )
		.put( javaCast( "byte", 32 ) )
		.put( javaCast( "byte", 32 ) )
		.put( javaCast( "byte", 32 ) )
		.put( javaCast( "byte", 32 ) )
		.put( javaCast( "byte", 32 ) )
		.array()
	;

	dStream = createObject( "java", "java.io.ByteArrayOutputStream" )
		.init( javaCast( "int", 5 ) )
	;
	dStream.write( javaCast( "int", 32 ) );
	dStream.write( javaCast( "int", 32 ) );
	dStream.write( javaCast( "int", 32 ) );
	dStream.write( javaCast( "int", 32 ) );
	dStream.write( javaCast( "int", 32 ) );
	d = dStream.toByteArray();

	e = repeatString( chr( 32 ), 5 ).getBytes();


	// ------------------------------------------------------------------------------- //
	// ------------------------------------------------------------------------------- //


	// Convert the values to HEX so we can easily compare them.
	aHex = binaryEncode( a, "hex" );
	bHex = binaryEncode( b, "hex" );
	cHex = binaryEncode( c, "hex" );
	dHex = binaryEncode( d, "hex" );
	eHex = binaryEncode( e, "hex" );

	writeOutput( "a: #aHex#<br />" );
	writeOutput( "b: #bHex#<br />" );
	writeOutput( "c: #cHex#<br />" );
	writeOutput( "d: #dHex#<br />" );
	writeOutput( "e: #eHex#<br />" );

	// Test equality through substitution principle.
	if ( ( aHex == bHex ) && ( bHex == cHex ) && ( cHex == dHex ) && ( dHex == eHex ) ) {

		writeOutput( "All values are equal" );

	} else {

		writeOutput( "Something went wrong!" );

	}

</cfscript>

As you can see, each byte array in this test is of length 5 and is filled with the first-byte representation of the integer 32 (which is also the decimal value for the ASCII character Space). When we run this code, we get the following output:

a: 2020202020
b: 2020202020
c: 2020202020
d: 2020202020
e: 2020202020
All values are equal

Excellent - all the approaches I'm about to test for performance provide the same output.
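The chained comparison in the test can also be folded into a small helper that checks every value against the first one. This is only a sketch: allHexEqual() is a hypothetical name that is not part of the original test, and it uses compare() rather than == as a precaution, on the assumption that ColdFusion's loose typing may treat two numeric-looking HEX strings as numbers rather than as text.

```cfml
<cfscript>

	// Hypothetical helper: returns true if every binary value in the array
	// HEX-encodes to exactly the same string as the first one.
	public boolean function allHexEqual( required array binaryValues ) {

		var firstHex = binaryEncode( binaryValues[ 1 ], "hex" );

		for ( var value in binaryValues ) {

			// compare() performs a case-sensitive text comparison (0 means equal).
			if ( compare( binaryEncode( value, "hex" ), firstHex ) != 0 ) {

				return( false );

			}

		}

		return( true );

	}

	a = javaCast( "byte[]", [ 32, 32, 32, 32, 32 ] );
	b = charsetDecode( repeatString( chr( 32 ), 5 ), "utf8" );

	writeOutput( allHexEqual( [ a, b ] ) ? "All values are equal" : "Something went wrong!" );

</cfscript>
```

Adding another technique to the comparison then only means adding another item to the array.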

Before we look at performance, though, let me just talk about why I used "32" in the above test. Ideally, I would have liked to just fill each binary value with the zero byte (0). But, the problem is that:

repeatString( chr( 0 ), 5 )

... doesn't do what you might expect. It actually returns an empty string. As such, I can't use the repeatString() approach to create a zero-filled byte array. But, I think that's OK for many cases. In ColdFusion, when you create a byte array / binary value, it's often just an intermediary buffer into which binary data is going to be written. As such, the initial values in the byte array don't matter - they get overwritten as part of the workflow. That's why my performance test below uses a Space in some of the perf tests.
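When a zero-filled byte array really is needed, the ByteBuffer approach provides one without any string work, since Java initializes the backing array of an allocated buffer to zeros. A minimal sketch (the variable names here are just for illustration):

```cfml
<cfscript>

	// ByteBuffer.allocate() returns a buffer whose backing array is zero-filled.
	zeroBytes = createObject( "java", "java.nio.ByteBuffer" )
		.allocate( javaCast( "int", 5 ) )
		.array()
	;

	zeroHex = binaryEncode( zeroBytes, "hex" );

	writeOutput( "Length: #arrayLen( zeroBytes )#<br />" );
	writeOutput( "Hex: #zeroHex#<br />" );

</cfscript>
```

This should report a length of 5 and a HEX value of 0000000000.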

That said, let's look at some simplistic performance measurements: how many times can I generate a byte array / binary value in a given amount of time:

<cfscript>

	// NOTE: This is here as a global variable for one of the tests.
	ByteBuffer = createObject( "java", "java.nio.ByteBuffer" );


	// Test each method to see how many iterations can be run in the same duration.
	writeOutput( "ByteBuffer: #numberFormat( runTest( testByteBuffer, 2000 ) )#<br />" );
	writeOutput( "ByteBuffer2: #numberFormat( runTest( testByteBuffer2, 2000 ) )#<br />" );
	writeOutput( "Manual Build: #numberFormat( runTest( testManualBuild, 2000 ) )#<br />" );
	writeOutput( "Repeat String: #numberFormat( runTest( testRepeatString, 2000 ) )#<br />" );
	writeOutput( "Repeat String2: #numberFormat( runTest( testRepeatString2, 2000 ) )#<br />" );
	writeOutput( "Output Stream: #numberFormat( runTest( testByteArrayOutputStream, 2000 ) )#<br />" );


	// ------------------------------------------------------------------------------- //
	// ------------------------------------------------------------------------------- //


	// I test the ByteArrayOutputStream by writing individual bytes to the stream.
	public void function testByteArrayOutputStream() {

		var stream = createObject( "java", "java.io.ByteArrayOutputStream" )
			.init( javaCast( "int", 1024 ) )
		;

		for ( var i = 1 ; i <= 1024 ; i++ ) {

			stream.write( javaCast( "int", 0 ) );

		}

		var buffer = stream.toByteArray();

	}


	// I test the ByteBuffer by creating the ByteBuffer static class each time.
	public void function testByteBuffer() {

		var buffer = createObject( "java", "java.nio.ByteBuffer" )
			.allocate( javaCast( "int", 1024 ) )
			.array()
		;

	}


	// I test the ByteBuffer by using a cached instance of the ByteBuffer class.
	// --
	// NOTE: This is a more realistic test as the Java class would likely be cached
	// inside whichever ColdFusion component was generating the binary values.
	public void function testByteBuffer2() {

		var buffer = ByteBuffer
			.allocate( javaCast( "int", 1024 ) )
			.array()
		;

	}


	// I test the manual building of individual bytes.
	public void function testManualBuild() {

		var bytes = [];

		arrayResize( bytes, 1024 );
		arraySet( bytes, 1, 1024, 0 );

		var buffer = javaCast( "byte[]", bytes );

	}


	// I test the string-to-bytes approach.
	// --
	// CAUTION: We are not actually creating an equivalent buffer here since this one
	// will be filled with the byte 32 (space). That said, I'm including this as part
	// of the test because this is completely fine in cases where all you need is an
	// intermediary buffer into which you will write bytes and then read those bytes.
	// In those cases, the initial values of the bytes in the buffer don't matter.
	public void function testRepeatString() {

		var buffer = charsetDecode( repeatString( " ", 1024 ), "utf8" );

	}


	// I test the string-to-bytes approach using the hidden Java method, getBytes().
	// --
	// CAUTION: We are not actually creating an equivalent buffer here since this one
	// will be filled with the byte 32 (space). That said, I'm including this as part
	// of the test because this is completely fine in cases where all you need is an
	// intermediary buffer into which you will write bytes and then read those bytes.
	// In those cases, the initial values of the bytes in the buffer don't matter.
	public void function testRepeatString2() {

		var buffer = repeatString( " ", 1024 ).getBytes();

	}


	// ------------------------------------------------------------------------------- //
	// ------------------------------------------------------------------------------- //


	// I run the given callback as many times as I can in the given duration and return
	// the iteration count.
	public numeric function runTest(
		required any callback,
		required numeric duration
		) {

		var targetTick = ( getTickCount() + duration );
		var count = 0;

		while ( getTickCount() < targetTick ) {

			count++;
			callback();

		}

		return( count );

	}

</cfscript>

As you can see, each test has 2 seconds to run as many times as it can. And, when we run the above code a few times, we get the following output:

NOTE: I am altering the output to list in order of performance.

Repeat String: 1,521,846
ByteBuffer2: 1,499,936
ByteBuffer: 1,056,772
Repeat String2: 252,411
Manual Build: 24,430
Output Stream: 6,086

ByteBuffer2: 1,650,297
Repeat String: 1,471,474
ByteBuffer: 1,057,972
Repeat String2: 256,842
Manual Build: 25,372
Output Stream: 5,720

ByteBuffer2: 1,692,674
Repeat String: 1,528,587
ByteBuffer: 1,064,559
Repeat String2: 263,168
Manual Build: 25,102
Output Stream: 6,553

ByteBuffer2: 1,643,425
Repeat String: 1,476,806
ByteBuffer: 1,051,993
Repeat String2: 255,554
Manual Build: 24,505
Output Stream: 6,502

In general, using ByteBuffer with a cached instance of the ByteBuffer Java class is the fastest. But, if you look at the first run, the repeatString() approach - with charsetDecode() - actually won. I'm kind of blown away that the repeatString() approach was so fast! I thought for sure it was going to be a dog on performance.
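To make the raw counts easier to compare, they can be converted into iterations per millisecond by dividing by the 2,000 ms test duration. The sketch below does this for three of the counts from the last run listed above; the results array and variable names are only for illustration:

```cfml
<cfscript>

	// Each test was given 2,000 milliseconds to run.
	duration = 2000;

	// Iteration counts copied from the last run in the output above.
	results = [
		{ name = "ByteBuffer2", count = 1643425 },
		{ name = "Repeat String", count = 1476806 },
		{ name = "Output Stream", count = 6502 }
	];

	for ( result in results ) {

		// Iterations completed per millisecond of test time.
		perMs = ( result.count / duration );
		perMsFormatted = numberFormat( perMs, "0.0" );

		writeOutput( "#result.name#: #perMsFormatted# per ms<br />" );

	}

</cfscript>
```

By this measure the cached ByteBuffer approach completes roughly 820 allocations per millisecond, while the ByteArrayOutputStream approach completes only about 3.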

That said, there are two repeatString() tests - one that uses charsetDecode() and one that uses the underlying Java method, .getBytes(). Oddly enough, the .getBytes() approach was super slow. I'm surprised that there is a measurable difference between these two approaches. I guess the .getBytes() method is doing something unexpected.
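One thing worth keeping in mind, independent of speed, is that the two calls are not strictly equivalent: Java's String.getBytes() with no arguments encodes using the JVM's default charset, while the charsetDecode() call names its encoding explicitly. For a string of spaces the bytes come out the same, but passing the charset to both keeps the result independent of the server configuration. A small sketch of that, which does not attempt to explain the timing difference:

```cfml
<cfscript>

	input = repeatString( " ", 5 );

	// Name the charset on both sides so neither result depends on the JVM default.
	viaCharsetDecode = charsetDecode( input, "utf-8" );
	viaGetBytes = javaCast( "string", input ).getBytes( "UTF-8" );

	charsetDecodeHex = binaryEncode( viaCharsetDecode, "hex" );
	getBytesHex = binaryEncode( viaGetBytes, "hex" );

	writeOutput( "charsetDecode: #charsetDecodeHex#<br />" );
	writeOutput( "getBytes: #getBytesHex#<br />" );

</cfscript>
```

Both lines should show 2020202020.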

In most situations, it probably doesn't matter which technique you use for creating byte arrays / binary values in ColdFusion. But, the ByteBuffer approach (with a cached class) is generally the fastest. Of course, the repeatString() approach is a surprisingly close second in terms of performance; and, it's certainly the easiest one to use if you don't care what values are in the resultant byte array.
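As a rough sketch of what the cached-class approach might look like in application code, the Java class proxy can be created once and reused by a small helper. The names ByteBufferClass and newByteArray() are hypothetical; inside a ColdFusion component, the cached proxy would more likely live in the component's variables scope.

```cfml
<cfscript>

	// Create the Java class proxy once and reuse it (mirrors the "ByteBuffer2" test).
	ByteBufferClass = createObject( "java", "java.nio.ByteBuffer" );

	// Hypothetical helper: returns a new zero-filled byte array of the given size.
	public binary function newByteArray( required numeric size ) {

		return( ByteBufferClass.allocate( javaCast( "int", size ) ).array() );

	}

	buffer = newByteArray( 1024 );

	writeOutput( "Buffer length: #arrayLen( buffer )#" );

</cfscript>
```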


