Someone over on CF-Talk was asking about writing massive data files that are the result of several million records coming out of a database. One of the concerns expressed was that the problem was not only with the writing of the file, it was with the amount of memory that the massive record set would take up. Here is my stab at a solution that uses small CFQuery record sets as well as a buffered output stream that should ease the use of ColdFusion at both ends.
Not sure if this helps any one, it was really only for the dude on CF-Talk who has not done any Java. In this example, I pulling blog entries out of a database and writing them in quasi-XML format to a txt file. This isn't really anything that I do, I just needed something to test on.
Also, while I am not doing it in the example, you can wrap the whole thing in a CFSilent tag to stop output and whitespace from being sent to the browser. The ClearBuffer() line should take care of any white space in the ColdFusion buffer??? Not sure.
<!--- Get the number of records to read at one time. This will limit the amount of data that ColdFusion has to pull down from the SQL Server and store in it's memory. ---> <cfset intReadCount = 50 /> <!--- Get the offset for the read. ---> <cfset intOffset = 0 /> <!--- Set up the buffered output stream. The buffered output stream requires a file output stream which requires a file object. By using a buffered output stream we will limit the amount of data that ColdFusion has to store in memory AND we will moderate the number of write commands that need to take place. ---> <cfset objOutput = CreateObject( "java", "java.io.BufferedOutputStream" ).Init( <!--- File Output Stream. ---> CreateObject( "java", "java.io.FileOutputStream" ).Init( <!--- File object. ---> CreateObject( "java", "java.io.File" ).Init( <!--- File Path. ---> JavaCast( "string", ExpandPath( "./output.txt" ) ) ) ) ) /> <!--- Keep looping while we have records to read. This loop will only be broken manually once we have no more records getting returned from the database. ---> <cfloop condition="true"> <!--- Query for TOP ## records. ---> <cfquery name="qBlog" datasource="#REQUEST.DSN.Source#" username="#REQUEST.DSN.Username#" password="#REQUEST.DSN.Password#"> SELECT TOP #intReadCount# b.id, b.name, b.description, b.date_posted FROM blog_entry b WHERE b.id > <cfqueryparam value="#intOffset#" cfsqltype="CF_SQL_INTEGER" /> ORDER BY b.id ASC </cfquery> <!--- Check to see if we have a record count. If we do not, then we are out of data. Break out of the loop. ---> <cfif qBlog.RecordCount> <!--- We have returned records from the database. ---> <!--- Loop over the records. ---> <cfloop query="qBlog"> <!--- Create a string buffer to hold row output. ---> <cfset objRowBuffer = CreateObject( "java", "java.lang.StringBuffer" ).Init() /> <!--- Add ID. ---> <cfset objRowBuffer.Append( "<id>" & qBlog.id & "</id>" ) /> <!--- Add name. ---> <cfset objRowBuffer.Append( "<name>" & XmlFormat( qBlog.name ) & "</name>" ) /> <!--- Add description. ---> <cfset objRowBuffer.Append( "<description>" & XmlFormat( qBlog.description ) & "</description>" ) /> <!--- Add date posted. ---> <cfset objRowBuffer.Append( "<posted>" & XmlFormat( qBlog.date_posted ) & "</posted>" ) /> <!--- Add new line (for next record). ---> <cfset objRowBuffer.Append( Chr( 13 ) & Chr( 10 ) ) /> <!--- Convert the string buffer to a string (concatenate all the fields) and then get the character byte array from the resultant string. ---> <cfset arrRowBytes = objRowBuffer.ToString().GetBytes() /> <!--- Write the byte array to the output stream. ---> <cfset objOutput.Write( arrRowBytes, JavaCast( "int", 0 ), JavaCast( "int", ArrayLen( arrRowBytes ) ) ) /> </cfloop> <!--- Increment the offset. ---> <cfset intOffset = (intOffset + intReadCount) /> <!--- Reset the buffer. This should kill the white space that is building up in the ColdFusion memory space. ---> <cfset GetPageContext().GetOut().ClearBuffer() /> <cfelse> <!--- No more records. Stop looping. ---> <cfbreak /> </cfif> </cfloop> <!--- Flush the buffered output stream to make sure there is no straggling buffer data. ---> <cfset objOutput.Flush() />
Want to use code from this post? Check out the license.