Ben Nadel
On User Experience (UX) Design, JavaScript, ColdFusion, Node.js, Life, and Love.
I am the chief technical officer at InVision App, Inc - a prototyping and collaboration platform for designers, built by designers. I also rock out in JavaScript and ColdFusion 24x7.
Meanwhile on Twitter
Loading latest tweet...
Ben Nadel at Scotch On The Rock (SOTR) 2010 (London) with:

Updated: Converting A ColdFusion Query To CSV Using QueryToCSV()

By Ben Nadel on
Tags: ColdFusion

Last week, I posted a ColdFusion user defined function that took a ColdFusion query object and turned it into a comma separated value (CSV) string. I had made heavy use of the Java StringBuffer to build the string. This however, required many method calls and extra string concatenation that apparently was slowing things down. This is a good example of what happens when you try to optimize something without fully weighing the pros and cons of the tools in use.

Using some of the feedback that I received on my function, I am posting an updated version of it here:

  • <cffunction
  • name="QueryToCSV"
  • access="public"
  • returntype="string"
  • output="false"
  • hint="I take a query and convert it to a comma separated value string.">
  •  
  • <!--- Define arguments. --->
  • <cfargument
  • name="Query"
  • type="query"
  • required="true"
  • hint="I am the query being converted to CSV."
  • />
  •  
  • <cfargument
  • name="Fields"
  • type="string"
  • required="true"
  • hint="I am the list of query fields to be used when creating the CSV value."
  • />
  •  
  • <cfargument
  • name="CreateHeaderRow"
  • type="boolean"
  • required="false"
  • default="true"
  • hint="I flag whether or not to create a row of header values."
  • />
  •  
  • <cfargument
  • name="Delimiter"
  • type="string"
  • required="false"
  • default=","
  • hint="I am the field delimiter in the CSV value."
  • />
  •  
  • <!--- Define the local scope. --->
  • <cfset var LOCAL = {} />
  •  
  • <!---
  • First, we want to set up a column index so that we can
  • iterate over the column names faster than if we used a
  • standard list loop on the passed-in list.
  • --->
  • <cfset LOCAL.ColumnNames = [] />
  •  
  • <!---
  • Loop over column names and index them numerically. We
  • are working with an array since I believe its loop times
  • are faster than that of a list.
  • --->
  • <cfloop
  • index="LOCAL.ColumnName"
  • list="#ARGUMENTS.Fields#"
  • delimiters=",">
  •  
  • <!--- Store the current column name. --->
  • <cfset ArrayAppend(
  • LOCAL.ColumnNames,
  • Trim( LOCAL.ColumnName )
  • ) />
  •  
  • </cfloop>
  •  
  • <!--- Store the column count. --->
  • <cfset LOCAL.ColumnCount = ArrayLen( LOCAL.ColumnNames ) />
  •  
  •  
  • <!---
  • Now that we have our index in place, let's create
  • a string buffer to help us build the CSV value more
  • effiently.
  • --->
  • <cfset LOCAL.Buffer = CreateObject( "java", "java.lang.StringBuffer" ).Init() />
  •  
  • <!--- Create a short hand for the new line characters. --->
  • <cfset LOCAL.NewLine = (Chr( 13 ) & Chr( 10 )) />
  •  
  •  
  • <!--- Check to see if we need to add a header row. --->
  • <cfif ARGUMENTS.CreateHeaderRow>
  •  
  • <!--- Create array to hold row data. --->
  • <cfset LOCAL.RowData = [] />
  •  
  • <!--- Loop over the column names. --->
  • <cfloop
  • index="LOCAL.ColumnIndex"
  • from="1"
  • to="#LOCAL.ColumnCount#"
  • step="1">
  •  
  • <!--- Add the field name to the row data. --->
  • <cfset LOCAL.RowData[ LOCAL.ColumnIndex ] = """#LOCAL.ColumnNames[ LOCAL.ColumnIndex ]#""" />
  •  
  • </cfloop>
  •  
  • <!--- Append the row data to the string buffer. --->
  • <cfset LOCAL.Buffer.Append(
  • JavaCast(
  • "string",
  • (
  • ArrayToList(
  • LOCAL.RowData,
  • ARGUMENTS.Delimiter
  • ) &
  • LOCAL.NewLine
  • ))
  • ) />
  •  
  • </cfif>
  •  
  •  
  • <!---
  • Now that we have dealt with any header value, let's
  • convert the query body to CSV. When doing this, we are
  • going to qualify each field value. This is done be
  • default since it will be much faster than actually
  • checking to see if a field needs to be qualified.
  • --->
  •  
  • <!--- Loop over the query. --->
  • <cfloop query="ARGUMENTS.Query">
  •  
  • <!--- Create array to hold row data. --->
  • <cfset LOCAL.RowData = [] />
  •  
  • <!--- Loop over the columns. --->
  • <cfloop
  • index="LOCAL.ColumnIndex"
  • from="1"
  • to="#LOCAL.ColumnCount#"
  • step="1">
  •  
  • <!--- Add the field to the row data. --->
  • <cfset LOCAL.RowData[ LOCAL.ColumnIndex ] = """#Replace( ARGUMENTS.Query[ LOCAL.ColumnNames[ LOCAL.ColumnIndex ] ][ ARGUMENTS.Query.CurrentRow ], """", """""", "all" )#""" />
  •  
  • </cfloop>
  •  
  •  
  • <!--- Append the row data to the string buffer. --->
  • <cfset LOCAL.Buffer.Append(
  • JavaCast(
  • "string",
  • (
  • ArrayToList(
  • LOCAL.RowData,
  • ARGUMENTS.Delimiter
  • ) &
  • LOCAL.NewLine
  • ))
  • ) />
  •  
  • </cfloop>
  •  
  •  
  • <!--- Return the CSV value. --->
  • <cfreturn LOCAL.Buffer.ToString() />
  • </cffunction>

Now, rather than calling StringBuffer.Append() on every field value, I am build an array of fields in a row, then I join that array using ArrayToList() and append it to the ongoing string value that is held in the StringBuffer. So, I am still using the Java StringBuffer to hold the CSV value, but I am using an intermediary ColdFusion array to build up individual line values. I am also escaping any double-quotes that are in the query values - something which I was neglecting to do before. I have to say that this did greatly simplify the amount of code that was in the method. This pleases me; I love to see more simple solutions be more effective.

Some people did tell me that the StringBuffer was completely overkill and that they, in fact, were getting better results using just array-joining for both the fields and the rows. I found this hard to believe as the StringBuffer has proven useful so many times in the past. I had trouble believing that using just array joining would be the fastest method, and so, I did a little testing of my own. I re-wrote the above user defined method to use only arrays and no StringBuffer:

  • <cffunction
  • name="QueryToCSV2"
  • access="public"
  • returntype="string"
  • output="false"
  • hint="I take a query and convert it to a comma separated value string.">
  •  
  • <!--- Define arguments. --->
  • <cfargument
  • name="Query"
  • type="query"
  • required="true"
  • hint="I am the query being converted to CSV."
  • />
  •  
  • <cfargument
  • name="Fields"
  • type="string"
  • required="true"
  • hint="I am the list of query fields to be used when creating the CSV value."
  • />
  •  
  • <cfargument
  • name="CreateHeaderRow"
  • type="boolean"
  • required="false"
  • default="true"
  • hint="I flag whether or not to create a row of header values."
  • />
  •  
  • <cfargument
  • name="Delimiter"
  • type="string"
  • required="false"
  • default=","
  • hint="I am the field delimiter in the CSV value."
  • />
  •  
  • <!--- Define the local scope. --->
  • <cfset var LOCAL = {} />
  •  
  • <!---
  • First, we want to set up a column index so that we can
  • iterate over the column names faster than if we used a
  • standard list loop on the passed-in list.
  • --->
  • <cfset LOCAL.ColumnNames = [] />
  •  
  • <!---
  • Loop over column names and index them numerically. We
  • are going to be treating this struct almost as if it
  • were an array. The reason we are doing this is that
  • look-up times on a table are a bit faster than look
  • up times on an array (or so I have been told).
  • --->
  • <cfloop
  • index="LOCAL.ColumnName"
  • list="#ARGUMENTS.Fields#"
  • delimiters=",">
  •  
  • <!--- Store the current column name. --->
  • <cfset ArrayAppend(
  • LOCAL.ColumnNames,
  • Trim( LOCAL.ColumnName )
  • ) />
  •  
  • </cfloop>
  •  
  • <!--- Store the column count. --->
  • <cfset LOCAL.ColumnCount = ArrayLen( LOCAL.ColumnNames ) />
  •  
  •  
  • <!--- Create a short hand for the new line characters. --->
  • <cfset LOCAL.NewLine = (Chr( 13 ) & Chr( 10 )) />
  •  
  • <!--- Create an array to hold the set of row data. --->
  • <cfset LOCAL.Rows = [] />
  •  
  •  
  • <!--- Check to see if we need to add a header row. --->
  • <cfif ARGUMENTS.CreateHeaderRow>
  •  
  • <!--- Create array to hold row data. --->
  • <cfset LOCAL.RowData = [] />
  •  
  • <!--- Loop over the column names. --->
  • <cfloop
  • index="LOCAL.ColumnIndex"
  • from="1"
  • to="#LOCAL.ColumnCount#"
  • step="1">
  •  
  • <!--- Add the field name to the row data. --->
  • <cfset LOCAL.RowData[ LOCAL.ColumnIndex ] = """#LOCAL.ColumnNames[ LOCAL.ColumnIndex ]#""" />
  •  
  • </cfloop>
  •  
  • <!--- Append the row data to the string buffer. --->
  • <cfset ArrayAppend(
  • LOCAL.Rows,
  • ArrayToList( LOCAL.RowData, ARGUMENTS.Delimiter )
  • ) />
  •  
  • </cfif>
  •  
  •  
  • <!---
  • Now that we have dealt with any header value, let's
  • convert the query body to CSV. When doing this, we are
  • going to qualify each field value. This is done be
  • default since it will be much faster than actually
  • checking to see if a field needs to be qualified.
  • --->
  •  
  • <!--- Loop over the query. --->
  • <cfloop query="ARGUMENTS.Query">
  •  
  • <!--- Create array to hold row data. --->
  • <cfset LOCAL.RowData = [] />
  •  
  • <!--- Loop over the columns. --->
  • <cfloop
  • index="LOCAL.ColumnIndex"
  • from="1"
  • to="#LOCAL.ColumnCount#"
  • step="1">
  •  
  • <!--- Add the field to the row data. --->
  • <cfset LOCAL.RowData[ LOCAL.ColumnIndex ] = """#Replace( ARGUMENTS.Query[ LOCAL.ColumnNames[ LOCAL.ColumnIndex ] ][ ARGUMENTS.Query.CurrentRow ], """", """""", "all" )#""" />
  •  
  • </cfloop>
  •  
  •  
  • <!--- Append the row data to the string buffer. --->
  • <cfset ArrayAppend(
  • LOCAL.Rows,
  • ArrayToList( LOCAL.RowData, ARGUMENTS.Delimiter )
  • ) />
  •  
  • </cfloop>
  •  
  •  
  • <!---
  • Return the CSV value by joining all the rows together
  • into one string.
  • --->
  • <cfreturn ArrayToList(
  • LOCAL.Rows,
  • LOCAL.NewLine
  • ) />
  • </cffunction>

As you can see, this uses ArrayToList() not only for the individual field values, but for all of the rows. Then, I put both of these to a basic speed test on several thousand rows of data:

  • <!--- Create a new part query. --->
  • <cfset qPart = QueryNew(
  • "id, serial_number, price",
  • "cf_sql_integer, cf_sql_varchar, cf_sql_decimal"
  • ) />
  •  
  •  
  • <!--- Create some record. --->
  • <cfloop
  • index="intI"
  • from="1"
  • to="5000"
  • step="1">
  •  
  • <!--- Add a row to the query. --->
  • <cfset QueryAddRow( qPart ) />
  •  
  • <!--- Populate row with random data. --->
  • <cfset qPart[ "id" ][ qPart.RecordCount ] = JavaCast( "int", qPart.RecordCount ) />
  • <cfset qPart[ "serial_number" ][ qPart.RecordCount ] = JavaCast( "string", CreateUUID() ) />
  • <cfset qPart[ "price" ][ qPart.RecordCount ] = JavaCast( "float", RandRange( 1, 100 ) ) />
  •  
  • </cfloop>
  •  
  •  
  • <!--- Test the StringBuffer version. --->
  • <cftimer
  • type="outline"
  • label="Java String Buffer">
  •  
  • <!--- Get the CSV value. --->
  • <cfset strOutput = QueryToCSV(
  • qPart,
  • "id, serial_number, price"
  • ) />
  •  
  • Done.
  •  
  • </cftimer>
  •  
  •  
  • <!--- Test the array-only version. --->
  • <cftimer
  • type="outline"
  • label="ArrayToList() Only">
  •  
  • <!--- Get the CSV value. --->
  • <cfset strOutput = QueryToCSV2(
  • qPart,
  • "id, serial_number, price"
  • ) />
  •  
  • Done.
  •  
  • </cftimer>

I am happy to say that both of them executed in milliseconds, so no matter which way you go, you can't really go wrong. The StringBuffer method did, however, slightly outperform the ArrayToList() method over about 10 trials:

QueryToCSV() - ArrayToList() And StringBuffer

Average Time: 295 ms

QueryToCSV2() - ArrayToList() Only

Average Time: 373 ms

As you can see, we're really only talking fractions of a second, so very little difference. That being said, as much as the StringBuffer was slightly faster, definitely go with whatever version you feel more comfortable with - they are both quite fast.




Reader Comments

Hi Ben,

Nice little research!... However I am really interested to know about the performance stats of POIUtility for this same scenario...

Reply to this Comment

This is a nice utility, thanks for creating it. However, the one thing that doesn't work for me is date fields.

Reply to this Comment

@Chris,

I suppose you can either modify your SQL to format the date before you pass it to the function. Or, you can add column-type checks to the CSV method to check for date data types. I am not sure which way I would go. Personally, I think I would lean towards formatting the date in the SQL.

Reply to this Comment

This is what I changed to make it work the way I wanted:

<cfset LOCAL.querydata = ARGUMENTS.Query[ LOCAL.ColumnNames[ LOCAL.ColumnIndex ] ][ ARGUMENTS.Query.CurrentRow ] >

<cfif isdate(LOCAL.querydata)>
<cfset LOCAL.querydata = dateformat(local.querydata,"mmm dd, yyyy")>
</cfif>

<!--- Add the field to the row data. --->
<cfset LOCAL.RowData[ LOCAL.ColumnIndex ] = """#Replace( local.querydata, """", """""", "all" )#""" />

Reply to this Comment

Hello Ben,

I don't want to have the Chr(10)&chr(13) at the end of each csv file i generate with you're function. Do you you an idea for fix your QueryToCSV function ?

Regards

Reply to this Comment

@Laurent,

Piece of cake - only add the new line characters if you are NOT at the end of the query loop. You'll probably need something like:

<cfif (ARGUMENTS.Query.CurrentRow NEQ ARGUMENTS.Query.RecordCount)>

.... add new line.

</cfif>

This way, when the query is in the last row of the loop, it will not add the new line.

Reply to this Comment

Hey Ben,

Just a heads up that i used this function on a coldfusion 7 server and it didn't work.

After a quick look, I realised that the function uses coldfusion 8 implicit struct and array creation.

Might be worthwhile putting a note up about this for anyone else using older versions of coldfusion.

Jon

p.s I've used this function heaps of times on coldfusion 8 servers before I noticed this today. Keep up the good work!

Reply to this Comment

@Jon,

Yeah, that's always a tough thing to deal with - version compatibility. Glad you like it when you're on a CF8 box. Technically, you can get this to work on a CF7 box if you swap out bits, but its a pain.

Reply to this Comment

Ben,
Recently we have been converting many of our sites from Application.cfm to Application.cfc. In that file we store our component classes as an application scoped variable in the onApplicationStart function. For example:
<code>
<cfset application.remarketingGateway = createObject("component","components.remarketingGateway").init(application.dsn)>
</code>

In the above code application.remarketingGateway holds all the methods which report sales. In those whe produce a variety of reports in XML,CSV and PDF and we used to call it this way:

<code>
<cfinvoke component="components.remarketingGateway" method="currentBenchmarkReport" returnvariable="getCurrentBenchmarkReport">
<cfinvokeargument name="custID" value="#Session.Login.Company.CustID#">
<cfinvokeargument name="startDate" value="#form.fromDate#">
<cfinvokeargument name="endDate" value="#form.toDate#">
</cfinvoke>
</code>

But when we utilize the components scoped in the Application.cfc file this way:

<code>
<cfset getCurrentBenchmarkReport = application.remarketingGateway.currentBenchmarkReport(Session.Login.Company.CustID, form.fromDate, #form.toDate#)>
</code>

We get the following error:

The routine QueryToCSV has been declared twice in different templates.

We know why this happens and have reverted back to the old way, but I thought it was an interesting issue you might like to test for yourself. Feel free to contact me for further details.

Reply to this Comment

@Christopher,

That's a really odd error. I can't imagine why it is doing that if you are only calling methods on cached components. Very strange.

Reply to this Comment

How easy would it be to just to print out all the fields by default instead if having to specify them all?

Reply to this Comment

@Nikos,

You could default the Fields argument to be the column list from the passed-in query:

<cfargument
name="Fields"
required="false"
default="#arguments.query.columnList#"
/>

Reply to this Comment

I ask because I usually just want to have a stored proc get specific data for a specific csv function and don't want to have to type out all the columns like this:

<cfreturn QueryToCSV(sp,"Client name,Client ID,Trade date,Due date,Transaction,Deposit / refund CCY,Deposit / refund amount")>

many thanks bro

Reply to this Comment

Which tag would you use to save this string as a CSV file? Thanks Ben This is awesome!

Reply to this Comment

@Josh,

You could use either the CFFile/Write tag or the fileWrite() method (depending on what version of ColdFusion you have).

Reply to this Comment

Thanks mate, I done that but I get this error:

faultCode:Server.Processing faultString:'[Table (rows 30 columns CLIENT NAME, CLIENT ID, TRADE DATE, DUE DATE, TRANSACTION, 'DEPOSIT / REFUND CCY', 'DEPOSIT / REFUND AMOUNT'): [CLIENT NAME: coldfusion.sql.QueryColumn@629fe5] [CLIENT ID: coldfusion.sql.QueryColumn@ef9676] [TRADE DATE: coldfusion.sql.QueryColumn@17f30f2] [DUE DATE: coldfusion.sql.QueryColumn@1667e62] [TRANSACTION: coldfusion.sql.QueryColumn@1f361fb] ['DEPOSIT / REFUND CCY': coldfusion.sql.QueryColumn@1a5da07] ['DEPOSIT / REFUND AMOUNT': coldfusion.sql.QueryColumn@18d2964] ] is not indexable by arguments.query.columnList' faultDetail:'null'

Reply to this Comment

@Ben Nadel,

Thanks but if I do that it orders it alphabetically, not the same order from the database. I think its a colfusion setting.

Reply to this Comment

@Nikos,

Yes, the column list comes out of the query in alphabetical order. If you want to custom order them, just pass them in with the method call.

I am not sure what the order of the columns in the database have to do with anything? I am confused?

Reply to this Comment

Well I had the sp in the database return the columns in the order I wanted the csv to show thats all :)

Reply to this Comment

@Nikos,

Ah, gotcha. You can try using the undocumented method on the query object:

Query.GetColumnNames()

I believe this returns a list of columns in the original order of the select statement... but not 100% sure.

Reply to this Comment

hi Ben,

This is what i want awesome,but i and some help from you regarding date values.

i am sending date column to from query to CV but when i open CSV that date field values point to time not date values in CSV . Will you please guide me how to change that to date.

Below is what chris posted on your blog about date values.

Will you please tell me where i can add this code to work with date values coming from query.

Thanks very much for this nice awesome code.

@chris for your comment
This is what I changed to make it work the way I wanted:

<cfset LOCAL.querydata = ARGUMENTS.Query[ LOCAL.ColumnNames[ LOCAL.ColumnIndex ] ][ ARGUMENTS.Query.CurrentRow ] >

<cfif isdate(LOCAL.querydata)>
<cfset LOCAL.querydata = dateformat(local.querydata,"mmm dd, yyyy")>
</cfif>

<!--- Add the field to the row data. --->
<cfset LOCAL.RowData[ LOCAL.ColumnIndex ] = """#Replace( local.querydata, """", """""", "all" )#""" />

Reply to this Comment

Hey guys,
I have recently been involved in a project which requires exporting data to CSV and originally used this UDF (it works great by the way!) but I would like to point out that in ColdFusion9 the cfspreadsheet tag can output a query to .XLS, CSV or HTML!

Reply to this Comment

@CFNew,

It's been a while since I've done anything in Excel; but, I believe that if you are only seeing Time, it's actually a setting on the Excel column. Try looking at the formatting of the column cells to see if there is an option show date AND time.

@Joe,

Very cool! I still do most of my programming on ColdFusion 8 these days (I know, I know), so I haven't really explored the CFSpreadSheet tag too much. From the demos I've seen, it looks to be pretty badass.

Reply to this Comment

Hi Ben,

why do you need to have so many double quotes when adding the field and field name to the row data?

-----------------------------------------
<cfset LOCAL.RowData[ LOCAL.ColumnIndex ] = """#LOCAL.ColumnNames[ LOCAL.ColumnIndex ]#""" />

<cfset LOCAL.RowData[ LOCAL.ColumnIndex ] = """#Replace( ARGUMENTS.Query[ LOCAL.ColumnNames[ LOCAL.ColumnIndex ] ][ ARGUMENTS.Query.CurrentRow ], """", """""", "all" )#""" />
-----------------------------------------

I am not getting that part. Also, I need to add " actually to my field values.

Thanks for that code really useful and sorry for digging out some old posts !

Cheers,
K.

Reply to this Comment

Thanks for writing this. I added the following to make the fields argument optional.

  • <cfif isDefined( "arguments.fields" ) >
  • <cfloop
  • index="LOCAL.ColumnName"
  • list="#ARGUMENTS.Fields#"
  • delimiters=",">
  •  
  • <!--- Store the current column name. --->
  • <cfset ArrayAppend(
  • LOCAL.ColumnNames,
  • Trim( LOCAL.ColumnName )
  • ) />
  •  
  • </cfloop>
  • <cfelse>
  • <cfset LOCAL.ColumnNames = qresults.getMetaData().getColumnLabels() >
  • </cfif>

Reply to this Comment

I am using this to export stored procedure data to a CSV and it works great! The only change I made was to check the CurrentRow against the RecordCount to break the loop. I was getting multiple newline chars at the end of the file.

<cfif (ARGUMENTS.Query.CurrentRow LTE ARGUMENTS.Query.RecordCount)>
<!--- Add the field to the row data. --->
<cfset LOCAL.RowData[ LOCAL.ColumnIndex ] = """#Replace( ARGUMENTS.Query[ LOCAL.ColumnNames[ LOCAL.ColumnIndex ] ][ ARGUMENTS.Query.CurrentRow ], """", """""", "all" )#""" />
</cfif>

I also have a requirement to generate this as a read-only file. I have researched adding an attribute to the file, but have came up empty. Any suggestions on how to do this?

Reply to this Comment

Post A Comment

?
You — Get Out Of My Dreams, Get Into My Comments
Live in the Now
Oops!
Comment Etiquette: Please do not post spam. Please keep the comments on-topic. Please do not post unrelated questions or large chunks of code. And, above all, please be nice to each other - we're trying to have a good conversation here.