Testing Performance Of Various Byte Array / Binary Value Generation Methods In ColdFusion
The other day, I experimented with Java's ByteBuffer class as a means to generate a byte array / binary value in ColdFusion. In the past, I've used a whole host of techniques for creating byte arrays; but, I was terribly intrigued by the fact that ByteBuffer seemed so "close to the metal". Meaning, it appeared to be nothing more than a thin layer on top of the actual binary memory allocation. This got me thinking about performance and I wanted to see how the use of ByteBuffer compared to other techniques that I've used.
First, I wanted to confirm that all the techniques being tested actually generated the same binary value / byte array in ColdFusion. So, I generated small test values and then compared their HEX-encoded output:
<cfscript>

	// A: Cast a ColdFusion array of integers to a Java byte array.
	a = javaCast( "byte[]", [ 32, 32, 32, 32, 32 ] );

	// B: Decode a string of spaces into its UTF-8 bytes.
	b = charsetDecode( repeatString( chr( 32 ), 5 ), "utf8" );

	// C: Allocate a ByteBuffer and put each byte into it.
	c = createObject( "java", "java.nio.ByteBuffer" )
		.allocate( javaCast( "int", 5 ) )
		.put( javaCast( "byte", 32 ) )
		.put( javaCast( "byte", 32 ) )
		.put( javaCast( "byte", 32 ) )
		.put( javaCast( "byte", 32 ) )
		.put( javaCast( "byte", 32 ) )
		.array()
	;

	// D: Write each byte to a ByteArrayOutputStream.
	dStream = createObject( "java", "java.io.ByteArrayOutputStream" )
		.init( javaCast( "int", 5 ) )
	;
	dStream.write( javaCast( "int", 32 ) );
	dStream.write( javaCast( "int", 32 ) );
	dStream.write( javaCast( "int", 32 ) );
	dStream.write( javaCast( "int", 32 ) );
	dStream.write( javaCast( "int", 32 ) );
	d = dStream.toByteArray();

	// E: Call the underlying Java String method, getBytes().
	e = repeatString( chr( 32 ), 5 ).getBytes();


	// ------------------------------------------------------------------------------- //
	// ------------------------------------------------------------------------------- //


	// Convert the values to HEX so we can easily compare them.
	aHex = binaryEncode( a, "hex" );
	bHex = binaryEncode( b, "hex" );
	cHex = binaryEncode( c, "hex" );
	dHex = binaryEncode( d, "hex" );
	eHex = binaryEncode( e, "hex" );

	writeOutput( "a: #aHex#<br />" );
	writeOutput( "b: #bHex#<br />" );
	writeOutput( "c: #cHex#<br />" );
	writeOutput( "d: #dHex#<br />" );
	writeOutput( "e: #eHex#<br />" );

	// Test equality transitively: if each value equals the next, all are equal.
	if ( ( aHex == bHex ) && ( bHex == cHex ) && ( cHex == dHex ) && ( dHex == eHex ) ) {

		writeOutput( "All values are equal" );

	} else {

		writeOutput( "Something went wrong!" );

	}

</cfscript>
As you can see, each byte array in this test is of length 5 and is filled with the byte value 32 (which is also the decimal code for the ASCII space character). When we run this code, we get the following output:
a: 2020202020
b: 2020202020
c: 2020202020
d: 2020202020
e: 2020202020
All values are equal
Excellent - all the approaches I'm about to test for performance provide the same output.
Before we look at performance, though, let me just talk about why I used "32" in the above test. Ideally, I would have liked to just fill each binary value with the zero byte (0). But, the problem is that:
repeatString( chr( 0 ), 5 )
... doesn't do what you might expect. It actually returns an empty string. As such, I can't use the repeatString() approach to create a zero-filled byte array. But, I think that's OK for many cases. In ColdFusion, when you create a byte array / binary value, it's often just an intermediary buffer into which binary data is going to be written. As such, the initial values in the byte array don't matter - they get overwritten as part of the workflow. That's why some of the performance tests below fill the buffer with the Space character instead of zero.
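If a zero-filled value is genuinely needed from a string-based approach, one possible workaround is to repeat the hex pair "00" and decode it with binaryDecode(). This is only a minimal sketch: it isn't part of the performance tests below, and the variable names are illustrative.

```cfml
<cfscript>

	// Each byte is written as two hex characters, so five "00" pairs
	// decode to a binary value containing five zero bytes.
	zeros = binaryDecode( repeatString( "00", 5 ), "hex" );

	// Encode it back to HEX to confirm what we got.
	zerosHex = binaryEncode( zeros, "hex" );

	writeOutput( "length: #len( zeros )#<br />" ); // length: 5
	writeOutput( "hex: #zerosHex#<br />" ); // hex: 0000000000

</cfscript>
```

This sidesteps chr( 0 ) entirely, at the cost of building a string that is twice as long as the resulting byte array.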
That said, let's look at some simplistic performance measurements: how many times can I generate a byte array / binary value in a given amount of time:
<cfscript>

	// NOTE: This is here as a global variable for one of the tests.
	ByteBuffer = createObject( "java", "java.nio.ByteBuffer" );


	// Test each method to see how many iterations can be run in the same duration.
	writeOutput( "ByteBuffer: #numberFormat( runTest( testByteBuffer, 2000 ) )#<br />" );
	writeOutput( "ByteBuffer2: #numberFormat( runTest( testByteBuffer2, 2000 ) )#<br />" );
	writeOutput( "Manual Build: #numberFormat( runTest( testManualBuild, 2000 ) )#<br />" );
	writeOutput( "Repeat String: #numberFormat( runTest( testRepeatString, 2000 ) )#<br />" );
	writeOutput( "Repeat String2: #numberFormat( runTest( testRepeatString2, 2000 ) )#<br />" );
	writeOutput( "Output Stream: #numberFormat( runTest( testByteArrayOutputStream, 2000 ) )#<br />" );


	// ------------------------------------------------------------------------------- //
	// ------------------------------------------------------------------------------- //


	// I test the ByteArrayOutputStream by writing individual bytes to the stream.
	public void function testByteArrayOutputStream() {

		var stream = createObject( "java", "java.io.ByteArrayOutputStream" )
			.init( javaCast( "int", 1024 ) )
		;

		for ( var i = 1 ; i <= 1024 ; i++ ) {

			stream.write( javaCast( "int", 0 ) );

		}

		var buffer = stream.toByteArray();

	}


	// I test the ByteBuffer by creating the ByteBuffer static class each time.
	public void function testByteBuffer() {

		var buffer = createObject( "java", "java.nio.ByteBuffer" )
			.allocate( javaCast( "int", 1024 ) )
			.array()
		;

	}


	// I test the ByteBuffer by using a cached instance of the ByteBuffer class.
	// --
	// NOTE: This is a more realistic test as the Java class would likely be cached
	// inside whichever ColdFusion component was generating the binary values.
	public void function testByteBuffer2() {

		var buffer = ByteBuffer
			.allocate( javaCast( "int", 1024 ) )
			.array()
		;

	}


	// I test the manual building of individual bytes.
	public void function testManualBuild() {

		var bytes = [];

		arrayResize( bytes, 1024 );
		arraySet( bytes, 1, 1024, 0 );

		var buffer = javaCast( "byte[]", bytes );

	}


	// I test the string-to-bytes approach.
	// --
	// CAUTION: We are not actually creating an equivalent buffer here since this one
	// will be filled with the byte 32 (space). That said, I'm including this as part
	// of the test because this is completely fine in cases where all you need is an
	// intermediary buffer into which you will write bytes and then read those bytes.
	// In those cases, the initial value of the bytes in the buffer don't matter.
	public void function testRepeatString() {

		var buffer = charsetDecode( repeatString( " ", 1024 ), "utf8" );

	}


	// I test the string-to-bytes approach using the hidden Java method, getBytes().
	// --
	// CAUTION: We are not actually creating an equivalent buffer here since this one
	// will be filled with the byte 32 (space). That said, I'm including this as part
	// of the test because this is completely fine in cases where all you need is an
	// intermediary buffer into which you will write bytes and then read those bytes.
	// In those cases, the initial value of the bytes in the buffer don't matter.
	public void function testRepeatString2() {

		var buffer = repeatString( " ", 1024 ).getBytes();

	}


	// ------------------------------------------------------------------------------- //
	// ------------------------------------------------------------------------------- //


	// I run the given callback as many times as I can in the given duration and return
	// the iteration count.
	public numeric function runTest(
		required any callback,
		required numeric duration
		) {

		var targetTick = ( getTickCount() + duration );
		var count = 0;

		while ( getTickCount() < targetTick ) {

			count++;
			callback();

		}

		return( count );

	}

</cfscript>
As you can see, each test has 2 seconds (2,000 milliseconds) to run as many times as it can. And, when we run the above code a few times, we get the following output:
NOTE: I am altering the output to list in order of performance.
Run 1:
Repeat String: 1,521,846
ByteBuffer2: 1,499,936
ByteBuffer: 1,056,772
Repeat String2: 252,411
Manual Build: 24,430
Output Stream: 6,086

Run 2:
ByteBuffer2: 1,650,297
Repeat String: 1,471,474
ByteBuffer: 1,057,972
Repeat String2: 256,842
Manual Build: 25,372
Output Stream: 5,720

Run 3:
ByteBuffer2: 1,692,674
Repeat String: 1,528,587
ByteBuffer: 1,064,559
Repeat String2: 263,168
Manual Build: 25,102
Output Stream: 6,553

Run 4:
ByteBuffer2: 1,643,425
Repeat String: 1,476,806
ByteBuffer: 1,051,993
Repeat String2: 255,554
Manual Build: 24,505
Output Stream: 6,502
In three of the four runs, using ByteBuffer with a cached reference to the ByteBuffer Java class was the fastest. But, if you look at the first run, the repeatString() approach - with charsetDecode() - actually won. I'm kind of blown away that the repeatString() approach was so fast! I thought for sure it was going to be a dog on performance.
That said, there are two repeatString() tests - one that uses charsetDecode() and one that uses the underlying Java method, .getBytes(). Oddly enough, the .getBytes() approach was super slow, managing roughly one-sixth as many iterations in the same time. I'm surprised that there is a measurable difference between these two approaches. I guess the .getBytes() method is doing something unexpected.
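One difference worth keeping in mind between the two calls: Java's String.getBytes(), when called with no arguments, encodes using the JVM's default charset, whereas the charsetDecode() call names "utf8" explicitly. For a string of spaces, the bytes come out the same under any ASCII-compatible charset; for other text, they may not. Here is a minimal sketch of naming the charset on both sides. The variable names are illustrative, and this is about encoding only; it is not meant as an explanation of the speed gap.

```cfml
<cfscript>

	input = repeatString( " ", 5 );

	// Name the charset explicitly on both sides so that neither call
	// depends on the JVM's default charset.
	viaDecode = charsetDecode( input, "utf-8" );
	viaGetBytes = input.getBytes( "UTF-8" );

	writeOutput( "charsetDecode: " & binaryEncode( viaDecode, "hex" ) & "<br />" );
	writeOutput( "getBytes: " & binaryEncode( viaGetBytes, "hex" ) & "<br />" );

</cfscript>
```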
In most situations, it probably doesn't matter which technique you use for creating byte arrays / binary values in ColdFusion. But, the ByteBuffer approach (with a cached class) is generally the fastest. Of course, the repeatString() approach is a surprisingly close second in terms of performance; and, it's certainly the easiest one to use if you don't care what values are in the resultant byte array.
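To make the "cached class" idea concrete, here is a rough sketch of a ColdFusion component that creates the ByteBuffer class reference once and reuses it for every allocation. The component name (BinaryAllocator.cfc) and its method names are hypothetical; they are not part of the tests above.

```cfml
// BinaryAllocator.cfc (hypothetical name, for illustration only).
component {

	// I cache the Java class reference once, when the component is created.
	public any function init() {

		variables.ByteBuffer = createObject( "java", "java.nio.ByteBuffer" );

		return( this );

	}

	// I return a zero-filled binary value of the given length.
	public binary function allocate( required numeric size ) {

		return( variables.ByteBuffer.allocate( javaCast( "int", size ) ).array() );

	}

}
```

Assuming the component is saved as BinaryAllocator.cfc, calling code would look something like new BinaryAllocator().allocate( 1024 ). In practice, the component instance itself would also be cached (in the application scope, for example) so that the createObject() call only happens once.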