For years, ColdFusion has had the hash() function for taking variable-length string data and creating one-way "fingerprints" of the original value. This function has changed over time to include algorithm and encoding options, but it has always worked with string data. Now, with ColdFusion 10, the hash() function has been enhanced to accept binary data (a.k.a. byte arrays). This means that we can now create one-way "fingerprints" of binary values.
NOTE: At the time of this writing, ColdFusion 10 was in public beta.
To demonstrate, let's read in an image file in binary format and output the hash of the image's binary data:
<cfscript>

	// Read in the raw binary data of the image.
	imageBinary = fileReadBinary( expandPath( "./gina_carano.jpg" ) );

	// Get the hash of the byte array (that IS the image).
	imageHash = hash( imageBinary );

	// Output the image "fingerprint".
	writeOutput( "Fingerprint: " & imageHash );

</cfscript>
As you can see, the first argument of the hash() function can now accept a byte array. When we run the above code, we get the following output:
The default MD5 algorithm has taken our byte array (binary data) and returned the standard 32-character hexadecimal string, just as it would for a string input.
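Since hash() already accepts an algorithm option, here is a minimal sketch of hashing the same byte array with a few different algorithms. I am assuming that the algorithm argument behaves the same way for binary input as it does for string input; the file path is the same one used above.

```cfml
<cfscript>

	// Read in the raw binary data of the image (same file as above).
	imageBinary = fileReadBinary( expandPath( "./gina_carano.jpg" ) );

	// Hash the same byte array with a few different algorithms. The
	// algorithm names are the ones hash() documents for string input;
	// I am assuming they apply equally to binary input.
	writeOutput( "MD5: " & hash( imageBinary, "MD5" ) & "<br />" );
	writeOutput( "SHA-256: " & hash( imageBinary, "SHA-256" ) & "<br />" );
	writeOutput( "SHA-512: " & hash( imageBinary, "SHA-512" ) );

</cfscript>
```

MD5 produces a 128-bit digest (32 hexadecimal characters), while SHA-256 and SHA-512 produce 64 and 128 hexadecimal characters respectively, so the three fingerprints should differ in length as well as in value.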
Out of curiosity, I wanted to see how the hashing of a Binary value would relate to the hashing of its String representation. As such, I tried using the hash() function to hash both a TXT file and the string content contained within that TXT file:
<cfscript>

	// Create our string message.
	message = "It's Friday, Friday - you gotta get down on Friday!";

	// Write message to file.
	fileWrite( expandPath( "./message.txt" ), message );

	// Read the message in as binary.
	messageBinary = fileReadBinary( expandPath( "./message.txt" ) );

	// Output the string "fingerprint".
	writeOutput( "STR Fingerprint: " & hash( message ) & "<br />" );

	// Output the binary "fingerprint".
	writeOutput( "BIN Fingerprint: " & hash( messageBinary ) );

</cfscript>
This time, when we run the above code, we get the following output:
STR Fingerprint: 60408C08C4AB05073FCEC10FAAE3915E
BIN Fingerprint: 60408C08C4AB05073FCEC10FAAE3915E
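Here, you can see that the hash() of the string content was the same as the hash() of the file itself. I don't really have any conclusions to draw from this last experiment; it was just an interesting thing to see. And, since every character in this message is plain ASCII, which is stored as a single byte in the common character encodings, the bytes in the file are the same bytes that hash() sees for the string. With non-ASCII characters, I would only expect the two fingerprints to match if the file were written using the same character encoding that hash() applies to the string.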
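To see where the string and binary fingerprints could diverge, here is a small sketch that skips the file entirely and uses charsetDecode() to turn one string into bytes under two different encodings. The message and variable names are made up for this illustration; chr( 233 ) is the accented "e", which is one byte in ISO-8859-1 and two bytes in UTF-8.

```cfml
<cfscript>

	// A message containing a single non-ASCII character (accented "e").
	message = "Caf#chr( 233 )# on Friday";

	// Convert the same string to byte arrays using two different encodings.
	utf8Bytes = charsetDecode( message, "utf-8" );
	latinBytes = charsetDecode( message, "iso-8859-1" );

	// Hash the string, explicitly telling hash() to encode it as UTF-8.
	writeOutput( "STR (utf-8): " & hash( message, "MD5", "utf-8" ) & "<br />" );

	// Hash the two byte arrays directly.
	writeOutput( "BIN (utf-8): " & hash( utf8Bytes ) & "<br />" );
	writeOutput( "BIN (iso-8859-1): " & hash( latinBytes ) );

</cfscript>
```

I would expect the first two fingerprints to match, since both are computed from the same UTF-8 bytes, and the third to differ, since the ISO-8859-1 byte array is one byte shorter. In other words, a hash is always a fingerprint of bytes; a string only has a fingerprint once an encoding has been chosen for it.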
As a final note, I should also point out that the hash() function now takes an argument that defines the number of hashing iterations to apply to the target value. The number-of-iterations enhancement is a security feature and is beyond what I would be able to explain in a meaningful way. I'll defer to the security experts to elucidate that one.
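For reference, here is a minimal sketch of what calling hash() with the iterations argument might look like. The value being hashed and the iteration count of 1000 are arbitrary choices for this illustration, and my understanding that the extra iterations exist to make each hash more expensive to compute (and therefore slower to brute-force) is an assumption, not something I have verified against the ColdFusion 10 implementation.

```cfml
<cfscript>

	// A hypothetical value that we want to fingerprint.
	secret = "It's Friday, Friday";

	// Hash using the default number of iterations.
	singlePass = hash( secret, "SHA-256", "utf-8" );

	// Hash again, passing an explicit iteration count as the fourth
	// argument. The count (1000) is an arbitrary example value.
	manyPasses = hash( secret, "SHA-256", "utf-8", 1000 );

	writeOutput( "Default iterations: " & singlePass & "<br />" );
	writeOutput( "1000 iterations: " & manyPasses );

</cfscript>
```

Both results are 64-character hexadecimal strings, but they should not be equal to each other. That also means the iteration count has to be stored or agreed upon, because the same count is needed to reproduce the same fingerprint later.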