ColdFusion CFFile vs. Java java.io.BufferedOutputStream

Posted October 2, 2006 at 8:05 AM

Tags: ColdFusion

Project Skin Spider is powered by a home-grown XML database system, which involves a lot of writing data to files. And for any of you who have ever used a database, you know that even simple databases can quickly build up a huge repository of data. For this reason, even minor increases in file-writing performance can have an impact on an application that's constantly updating data.

Currently, my DatabaseService.cfc uses the ColdFusion tag, CFFile, to write the query data to an XML file. I did a little exploration to see how CFFile stacks up against straight Java I/O calls. I compared CFFile to a standard FileOutputStream as well as a BufferedOutputStream. For the test, I generate random phrases and write 100,000 of them to disk.

As with all testing, let me start out by setting up the testing environment:

  • <!--- Create an array of data to choose from. --->
  • <cfset arrParts = ArrayNew( 1 ) />
  •  
  • <!--- Add data to the array. --->
  • <cfset arrParts[ 1 ] = "Feet" />
  • <cfset arrParts[ 2 ] = "Calves" />
  • <cfset arrParts[ 3 ] = "Thighs" />
  • <cfset arrParts[ 4 ] = "Hips" />
  • <cfset arrParts[ 5 ] = "Bottom" />
  • <cfset arrParts[ 6 ] = "Boobs" />
  • <cfset arrParts[ 7 ] = "Eyes" />
  •  
  • <!---
  • Set the number of iterations. This is the number of lines
  • of text that we will end up writing to file.
  • --->
  • <cfset intIterations = 100000 />

Now, the way I currently use CFFile for large data that I build incrementally is to use the Java StringBuffer to create the output data. Then, I write the string buffer to disk. For those of you who are not familiar with the StringBuffer, it is a more efficient way of building large strings from smaller ones: it puts off creating the final string until it is absolutely necessary:

  • <!---
  • Test the standard ColdFusion CFFile and Java
  • StringBuffer methodology.
  • --->
  • <cftimer label="StringBuffer Test" type="outline">
  •  
  • <!---
  • Kill extra output. We want to do this because otherwise,
  • we are creating [intIterations] amount of white space
  • on the page. No need for that.
  • --->
  • <cfsilent>
  •  
  • <!--- Get the file name to write to. --->
  • <cfset strFilePath = ExpandPath( "./sb_output.txt" ) />
  •  
  • <!--- Create a string buffer. --->
  • <cfset sbOutput = CreateObject(
  • "java",
  • "java.lang.StringBuffer"
  • ).Init() />
  •  
  • <!---
  • Loop over the iterations to build up the string
  • buffer. For each iteration, we are going to
  • select a random string and add it to the buffer.
  • --->
  • <cfloop
  • index="intI"
  • from="1"
  • to="#intIterations#"
  • step="1">
  •  
  • <!--- Add a random string to the string buffer. --->
  • <cfset sbOutput.Append(
  • "I am crazy about " &
  • arrParts[ RandRange( 1, 7 ) ] &
  • Chr( 13) & Chr( 10 )
  • ) />
  •  
  • </cfloop>
  •  
  • <!---
  • Now that we have created the string buffer, write
  • the data to selected file name.
  • --->
  • <cffile
  • action="WRITE"
  • file="#strFilePath#"
  • output="#sbOutput.ToString()#"
  • />
  •  
  • </cfsilent>
  •  
  • <!--- Output name of file. --->
  • #strFilePath#
  •  
  • </cftimer>
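For readers more at home in Java, the build-then-write pattern above looks roughly like the sketch below. It is an illustration only: the class `BuildThenWrite`, its method names, and the use of `java.nio.file.Files` are hypothetical choices, and nothing here describes what CFFile does internally. It uses StringBuilder, the unsynchronized sibling of StringBuffer.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Random;

// Hypothetical sketch: build the whole payload in memory, then write it once.
final class BuildThenWrite {

    // Same word list as the ColdFusion test data.
    static final String[] PARTS = {
        "Feet", "Calves", "Thighs", "Hips", "Bottom", "Boobs", "Eyes"
    };

    // Builds one random line, terminated with CR+LF like the ColdFusion test.
    static String randomLine(Random random) {
        return "I am crazy about " + PARTS[random.nextInt(PARTS.length)] + "\r\n";
    }

    // Accumulates every line in a StringBuilder, then writes the result in a single call.
    static void writeAll(Path target, int iterations, Random random) throws IOException {
        StringBuilder output = new StringBuilder();
        for (int i = 0; i < iterations; i++) {
            output.append(randomLine(random));
        }
        // Nothing touches the disk until this point.
        Files.write(target, output.toString().getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        writeAll(Paths.get("sb_output.txt"), 100_000, new Random());
    }
}
```

Everything is held in memory until the single write at the end. The timings discussed in this post come from the ColdFusion test above, not from this sketch.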

The CFFile test created a file that was roughly 2.3 megabytes. Then, I did the same thing, but used the Java FileOutputStream to write the file as I created the data (as opposed to a lump-sum write at the end):

  • <!---
  • Test the Java FileOutputStream methodology of writing data
  • to disk as we get it, not just at the end.
  • --->
  • <cftimer label="FileOutputStream Test" type="outline">
  •  
  • <!---
  • Kill extra output. We want to do this because otherwise,
  • we are creating [intIterations] amount of white space
  • on the page. No need for that.
  • --->
  • <cfsilent>
  •  
  • <!--- Get the file name to write to. --->
  • <cfset strFilePath = ExpandPath( "./io_output.txt" ) />
  •  
  • <!---
  • Create the file output stream. When creating the file
  • output stream, we have to initialize it with a Java
  • File object (which we initialize with the path to
  • the file we want to create).
  • --->
  • <cfset osOutput = CreateObject(
  • "java",
  • "java.io.FileOutputStream"
  • ).Init(
  • CreateObject(
  • "java",
  • "java.io.File"
  • ).Init(
  • strFilePath
  • )
  • ) />
  •  
  •  
  • <!---
  • Loop over the iterations to build up the data. For
  • each iteration, we are going to select a random
  • string and write that string directly to the output
  • stream which should write it directly to file.
  • --->
  • <cfloop
  • index="intI"
  • from="1"
  • to="#intIterations#"
  • step="1">
  •  
  • <!---
  • Add a random string to the file output stream.
  • In this case, the Write() method of the output
  • stream takes a Byte Array. For that, we can
  • call the GetBytes() Java method on the string.
  • We use the ToString() method to create a string
  • object before getting the byte array.
  • --->
  • <cfset osOutput.Write(
  • ToString(
  • "I am crazy about " &
  • arrParts[ RandRange( 1, 7 ) ] &
  • Chr( 13) & Chr( 10 )
  • ).GetBytes()
  • ) />
  •  
  • </cfloop>
  •  
  • <!--- Close the stream so the file handle is released. --->
  • <cfset osOutput.Close() />
  •  
  • </cfsilent>
  •  
  • <!--- Output name of file. --->
  • #strFilePath#
  •  
  • </cftimer>
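The same write-as-you-go approach can be sketched in plain Java. The class and method names are hypothetical, and the lines to write are passed in rather than generated at random. The try-with-resources block closes the stream when the loop finishes, which releases the file handle.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: write each line straight to an unbuffered FileOutputStream.
final class UnbufferedWrite {

    // Each write() call hands one line's bytes to the underlying file stream.
    static void writeLines(File target, List<String> lines) throws IOException {
        // try-with-resources closes the stream even if a write fails.
        try (OutputStream out = new FileOutputStream(target)) {
            for (String line : lines) {
                out.write((line + "\r\n").getBytes(StandardCharsets.UTF_8));
            }
        }
    }

    public static void main(String[] args) throws IOException {
        writeLines(new File("io_output.txt"), Arrays.asList(
            "I am crazy about Feet",
            "I am crazy about Eyes"
        ));
    }
}
```

Only one line's worth of data is in memory at a time, at the cost of one write call per line. The timings discussed in this post come from the ColdFusion test above, not from this sketch.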

The FileOutputStream test also created a file that was roughly 2.3 megabytes. Now, I don't know that much about Java; this is all experimentation to me. I see that there is a BufferedOutputStream. That's got to be there for a reason, and that reason has to be optimization:

By setting up such an output stream, an application can write bytes to the underlying output stream without necessarily causing a call to the underlying system for each byte written.

When creating the buffered output stream, you can set the buffer size. I tried this with three different buffer sizes (but will only show the code once, as the buffer size is the only variable). I tried the default buffer size, which is 512 bytes on the JVM I tested (the default varies by Java version). Then I tried it with 2048 bytes and 5000 bytes:

  • <!---
  • Test the Java BufferedOutputStream. In this case we are
  • testing the output stream with a buffer size of 2048 bytes,
  • but we can set it to anything we want.
  • --->
  • <cftimer label="BufferedFileOutput Stream Test" type="outline">
  •  
  • <!---
  • Kill extra output. We want to do this because otherwise,
  • we are creating [intIterations] amount of white space
  • on the page. No need for that.
  • --->
  • <cfsilent>
  •  
  • <!--- Get the file name to write to. --->
  • <cfset strFilePath = ExpandPath( "./bio_output2.txt" ) />
  •  
  • <!---
  • Create the buffered file output stream. When
  • creating the buffered file output stream, we have
  • to initialize it with a Java File Output Stream,
  • which, in turn, needs to be initialized with a
  • Java File object (which we initialize with the
  • path to the file we want to create).
  •  
  • Additionally, the second argument of the buffered
  • output stream is the size of the buffer. In this
  • case we are using 2048 bytes.
  • --->
  • <cfset bosOutput = CreateObject(
  • "java",
  • "java.io.BufferedOutputStream"
  • ).Init(
  • CreateObject(
  • "java",
  • "java.io.FileOutputStream"
  • ).Init(
  • CreateObject(
  • "java",
  • "java.io.File"
  • ).Init(
  • strFilePath
  • )
  • ),
  • JavaCast( "int", 2048 )
  • ) />
  •  
  •  
  • <!---
  • Loop over the iterations to build up the data. For
  • each iteration, we are going to select a random
  • string and write that string directly to the
  • buffered output stream which should write it
  • directly to file output stream once the buffer is
  • populated with enough data.
  • --->
  • <cfloop
  • index="intI"
  • from="1"
  • to="#intIterations#"
  • step="1">
  •  
  • <!--- Create the random string. --->
  • <cfset strText = (
  • "I am crazy about " &
  • arrParts[ RandRange( 1, 7 ) ] &
  • Chr( 13 ) & Chr( 10 )
  • ) />
  •  
  • <!---
  • Add the random string to the buffered file output
  • stream. We pass the entire byte array so that the
  • byte count always matches the data (the string
  • length is a character count, which can differ
  • from the byte count for non-ASCII text).
  • --->
  • <cfset bosOutput.Write(
  • strText.GetBytes()
  • ) />
  •  
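  • </cfloop>
  •  
  • <!---
  • Close the stream. This flushes any bytes still sitting
  • in the buffer and releases the file handle.
  • --->
  • <cfset bosOutput.Close() />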
  •  
  • </cfsilent>
  •  
  • <!--- Output name of file. --->
  • #strFilePath#
  •  
  • </cftimer>
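In plain Java, a buffered version might look like the sketch below. The class name, method name, and `bufferSize` parameter are hypothetical. Two details matter here: the byte count passed to `write` comes from the byte array, not from the string's character count, and closing the stream flushes whatever is still in the buffer. Without a flush or close, the tail end of the data can be left unwritten.

```java
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: write lines through a BufferedOutputStream of a chosen size.
final class BufferedWrite {

    // Bytes collect in the buffer and reach the file when the buffer fills or the stream closes.
    static void writeLines(File target, List<String> lines, int bufferSize) throws IOException {
        try (OutputStream out =
                 new BufferedOutputStream(new FileOutputStream(target), bufferSize)) {
            for (String line : lines) {
                byte[] bytes = (line + "\r\n").getBytes(StandardCharsets.UTF_8);
                // Use the byte array's length, not the string's character count.
                out.write(bytes, 0, bytes.length);
            }
        } // Leaving the try block closes the stream, which flushes the remaining bytes.
    }

    public static void main(String[] args) throws IOException {
        writeLines(new File("bio_output2.txt"), Arrays.asList(
            "I am crazy about Feet",
            "I am crazy about Eyes"
        ), 2048);
    }
}
```

Trying another buffer size only means changing the last argument, for example 5000 instead of 2048.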

As with many things, a small number of iterations yields absolutely no difference in speed. We, however, are performing 100,000 iterations. But even with that many iterations, the performance is inconsistent. When I run this test at home on my new Dell Inspiron E1505 Core Duo, the 5000-byte BufferedOutputStream outperforms everything by a small margin, maybe 100 ms or so. However, when I perform this test here at the office on our powerful server, the CFFile tag outperforms everything else by about 40 ms on repeated tests.

I wonder what that's all about? I can say for sure, though, that the straight-up Java FileOutputStream was by far the slowest performer. The ColdFusion CFFile tag and the buffered output streams of any size performed faster. I guess you need to find a balance between data caching and file I/O.
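Since single readings bounce around this much, one way to steady the comparison is to repeat each test several times and look at the median rather than any one run. The helper below is a hypothetical Java sketch of that idea (`RepeatTimer` and `medianNanos` are made-up names). It does not account for JVM warm-up or disk caching, so its numbers should be treated as rough.

```java
import java.util.Arrays;

// Hypothetical helper: time a task several times and report the median run.
final class RepeatTimer {

    // Runs the task `runs` times and returns the median elapsed time in nanoseconds
    // (the upper of the two middle values when `runs` is even).
    static long medianNanos(Runnable task, int runs) {
        if (runs < 1) {
            throw new IllegalArgumentException("runs must be at least 1");
        }
        long[] elapsed = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            elapsed[i] = System.nanoTime() - start;
        }
        Arrays.sort(elapsed);
        return elapsed[runs / 2];
    }

    public static void main(String[] args) {
        // Stand-in workload; swap in whichever file-writing test is being measured.
        long median = medianNanos(() -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 100_000; i++) {
                sb.append("line\r\n");
            }
        }, 5);
        System.out.println("median ms: " + (median / 1_000_000.0));
    }
}
```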

It seems that I am doing alright with the CFFile methodology. One other thing to note, though, and this has more to do with my XML database system: performance is not the only consideration. Using CFFile and the Java StringBuffer, I have to create the entire output data buffer in memory before I write it to disk. This can be hard on the system RAM. With an output stream, I can minimize the amount of data that is held in RAM at any given time.

Reader Comments

Aug 5, 2007 at 6:26 PM

One thing to keep in mind when doing ColdFusion vs. Java speed tests is that there are some unexpected performance costs when calling Java from ColdFusion. ColdFusion will attempt to analyze the data types of the Java call's parameters so that it can convert them properly if necessary. In most instances it's not a consideration, but with large strings or other large pieces of data it can be significant. The way to avoid the cost is to explicitly 'javaCast' the input like this: javaCast("String", x).

I don't know if the above issue would impact your tests, just thought I would share.


Aug 6, 2007 at 7:20 AM

@Arthur,

Yeah, I have to get into using the JavaCast() method all the time. I get very lazy about it when it comes to methods that expect strings.


Jun 3, 2009 at 6:35 AM

Hi Ben - There's an issue with these methods: if you don't issue an osOutput.close(), the files can sometimes remain open within JRun.

Martin

