Skip to main content
Ben Nadel at the New York ColdFusion User Group (Apr. 2008) with: Dmitriy Goltseker
Ben Nadel at the New York ColdFusion User Group (Apr. 2008) with: Dmitriy Goltseker

Updated: Converting A ColdFusion Query To CSV Using QueryToCSV()

By on
Tags:

Last week, I posted a ColdFusion user defined function that took a ColdFusion query object and turned it into a comma separated value (CSV) string. I had made heavy use of the Java StringBuffer to build the string. This however, required many method calls and extra string concatenation that apparently was slowing things down. This is a good example of what happens when you try to optimize something without fully weighing the pros and cons of the tools in use.

Using some of the feedback that I received on my function, I am posting an updated version of it here:

<cffunction
	name="QueryToCSV"
	access="public"
	returntype="string"
	output="false"
	hint="I take a query and convert it to a comma separated value string.">

	<!--- Define arguments. --->
	<cfargument
		name="Query"
		type="query"
		required="true"
		hint="I am the query being converted to CSV."
		/>

	<cfargument
		name="Fields"
		type="string"
		required="true"
		hint="I am the list of query fields to be used when creating the CSV value."
		/>

	<cfargument
		name="CreateHeaderRow"
		type="boolean"
		required="false"
		default="true"
		hint="I flag whether or not to create a row of header values."
		/>

	<cfargument
		name="Delimiter"
		type="string"
		required="false"
		default=","
		hint="I am the field delimiter in the CSV value."
		/>

	<!--- Define the local scope. --->
	<cfset var LOCAL = {} />

	<!---
		First, we want to set up a column index so that we can
		iterate over the column names faster than if we used a
		standard list loop on the passed-in list.
	--->
	<cfset LOCAL.ColumnNames = [] />

	<!---
		Loop over column names and index them numerically. We
		are working with an array since I believe its loop times
		are faster than that of a list.
	--->
	<cfloop
		index="LOCAL.ColumnName"
		list="#ARGUMENTS.Fields#"
		delimiters=",">

		<!--- Store the current column name. --->
		<cfset ArrayAppend(
			LOCAL.ColumnNames,
			Trim( LOCAL.ColumnName )
			) />

	</cfloop>

	<!--- Store the column count. --->
	<cfset LOCAL.ColumnCount = ArrayLen( LOCAL.ColumnNames ) />


	<!---
		Now that we have our index in place, let's create
		a string buffer to help us build the CSV value more
		effiently.
	--->
	<cfset LOCAL.Buffer = CreateObject( "java", "java.lang.StringBuffer" ).Init() />

	<!--- Create a short hand for the new line characters. --->
	<cfset LOCAL.NewLine = (Chr( 13 ) & Chr( 10 )) />


	<!--- Check to see if we need to add a header row. --->
	<cfif ARGUMENTS.CreateHeaderRow>

		<!--- Create array to hold row data. --->
		<cfset LOCAL.RowData = [] />

		<!--- Loop over the column names. --->
		<cfloop
			index="LOCAL.ColumnIndex"
			from="1"
			to="#LOCAL.ColumnCount#"
			step="1">

			<!--- Add the field name to the row data. --->
			<cfset LOCAL.RowData[ LOCAL.ColumnIndex ] = """#LOCAL.ColumnNames[ LOCAL.ColumnIndex ]#""" />

		</cfloop>

		<!--- Append the row data to the string buffer. --->
		<cfset LOCAL.Buffer.Append(
			JavaCast(
				"string",
				(
					ArrayToList(
						LOCAL.RowData,
						ARGUMENTS.Delimiter
						) &
					LOCAL.NewLine
				))
			) />

	</cfif>


	<!---
		Now that we have dealt with any header value, let's
		convert the query body to CSV. When doing this, we are
		going to qualify each field value. This is done be
		default since it will be much faster than actually
		checking to see if a field needs to be qualified.
	--->

	<!--- Loop over the query. --->
	<cfloop query="ARGUMENTS.Query">

		<!--- Create array to hold row data. --->
		<cfset LOCAL.RowData = [] />

		<!--- Loop over the columns. --->
		<cfloop
			index="LOCAL.ColumnIndex"
			from="1"
			to="#LOCAL.ColumnCount#"
			step="1">

			<!--- Add the field to the row data. --->
			<cfset LOCAL.RowData[ LOCAL.ColumnIndex ] = """#Replace( ARGUMENTS.Query[ LOCAL.ColumnNames[ LOCAL.ColumnIndex ] ][ ARGUMENTS.Query.CurrentRow ], """", """""", "all" )#""" />

		</cfloop>


		<!--- Append the row data to the string buffer. --->
		<cfset LOCAL.Buffer.Append(
			JavaCast(
				"string",
				(
					ArrayToList(
						LOCAL.RowData,
						ARGUMENTS.Delimiter
						) &
					LOCAL.NewLine
				))
			) />

	</cfloop>


	<!--- Return the CSV value. --->
	<cfreturn LOCAL.Buffer.ToString() />
</cffunction>

Now, rather than calling StringBuffer.Append() on every field value, I am build an array of fields in a row, then I join that array using ArrayToList() and append it to the ongoing string value that is held in the StringBuffer. So, I am still using the Java StringBuffer to hold the CSV value, but I am using an intermediary ColdFusion array to build up individual line values. I am also escaping any double-quotes that are in the query values - something which I was neglecting to do before. I have to say that this did greatly simplify the amount of code that was in the method. This pleases me; I love to see more simple solutions be more effective.

Some people did tell me that the StringBuffer was completely overkill and that they, in fact, were getting better results using just array-joining for both the fields and the rows. I found this hard to believe as the StringBuffer has proven useful so many times in the past. I had trouble believing that using just array joining would be the fastest method, and so, I did a little testing of my own. I re-wrote the above user defined method to use only arrays and no StringBuffer:

<cffunction
	name="QueryToCSV2"
	access="public"
	returntype="string"
	output="false"
	hint="I take a query and convert it to a comma separated value string.">

	<!--- Define arguments. --->
	<cfargument
		name="Query"
		type="query"
		required="true"
		hint="I am the query being converted to CSV."
		/>

	<cfargument
		name="Fields"
		type="string"
		required="true"
		hint="I am the list of query fields to be used when creating the CSV value."
		/>

	<cfargument
		name="CreateHeaderRow"
		type="boolean"
		required="false"
		default="true"
		hint="I flag whether or not to create a row of header values."
		/>

	<cfargument
		name="Delimiter"
		type="string"
		required="false"
		default=","
		hint="I am the field delimiter in the CSV value."
		/>

	<!--- Define the local scope. --->
	<cfset var LOCAL = {} />

	<!---
		First, we want to set up a column index so that we can
		iterate over the column names faster than if we used a
		standard list loop on the passed-in list.
	--->
	<cfset LOCAL.ColumnNames = [] />

	<!---
		Loop over column names and index them numerically. We
		are going to be treating this struct almost as if it
		were an array. The reason we are doing this is that
		look-up times on a table are a bit faster than look
		up times on an array (or so I have been told).
	--->
	<cfloop
		index="LOCAL.ColumnName"
		list="#ARGUMENTS.Fields#"
		delimiters=",">

		<!--- Store the current column name. --->
		<cfset ArrayAppend(
			LOCAL.ColumnNames,
			Trim( LOCAL.ColumnName )
			) />

	</cfloop>

	<!--- Store the column count. --->
	<cfset LOCAL.ColumnCount = ArrayLen( LOCAL.ColumnNames ) />


	<!--- Create a short hand for the new line characters. --->
	<cfset LOCAL.NewLine = (Chr( 13 ) & Chr( 10 )) />

	<!--- Create an array to hold the set of row data. --->
	<cfset LOCAL.Rows = [] />


	<!--- Check to see if we need to add a header row. --->
	<cfif ARGUMENTS.CreateHeaderRow>

		<!--- Create array to hold row data. --->
		<cfset LOCAL.RowData = [] />

		<!--- Loop over the column names. --->
		<cfloop
			index="LOCAL.ColumnIndex"
			from="1"
			to="#LOCAL.ColumnCount#"
			step="1">

			<!--- Add the field name to the row data. --->
			<cfset LOCAL.RowData[ LOCAL.ColumnIndex ] = """#LOCAL.ColumnNames[ LOCAL.ColumnIndex ]#""" />

		</cfloop>

		<!--- Append the row data to the string buffer. --->
		<cfset ArrayAppend(
			LOCAL.Rows,
			ArrayToList( LOCAL.RowData, ARGUMENTS.Delimiter )
			) />

	</cfif>


	<!---
		Now that we have dealt with any header value, let's
		convert the query body to CSV. When doing this, we are
		going to qualify each field value. This is done be
		default since it will be much faster than actually
		checking to see if a field needs to be qualified.
	--->

	<!--- Loop over the query. --->
	<cfloop query="ARGUMENTS.Query">

		<!--- Create array to hold row data. --->
		<cfset LOCAL.RowData = [] />

		<!--- Loop over the columns. --->
		<cfloop
			index="LOCAL.ColumnIndex"
			from="1"
			to="#LOCAL.ColumnCount#"
			step="1">

			<!--- Add the field to the row data. --->
			<cfset LOCAL.RowData[ LOCAL.ColumnIndex ] = """#Replace( ARGUMENTS.Query[ LOCAL.ColumnNames[ LOCAL.ColumnIndex ] ][ ARGUMENTS.Query.CurrentRow ], """", """""", "all" )#""" />

		</cfloop>


		<!--- Append the row data to the string buffer. --->
		<cfset ArrayAppend(
			LOCAL.Rows,
			ArrayToList( LOCAL.RowData, ARGUMENTS.Delimiter )
			) />

	</cfloop>


	<!---
		Return the CSV value by joining all the rows together
		into one string.
	--->
	<cfreturn ArrayToList(
		LOCAL.Rows,
		LOCAL.NewLine
		) />
</cffunction>

As you can see, this uses ArrayToList() not only for the individual field values, but for all of the rows. Then, I put both of these to a basic speed test on several thousand rows of data:

<!--- Create a new part query. --->
<cfset qPart = QueryNew(
	"id, serial_number, price",
	"cf_sql_integer, cf_sql_varchar, cf_sql_decimal"
	) />


<!--- Create some record. --->
<cfloop
	index="intI"
	from="1"
	to="5000"
	step="1">

	<!--- Add a row to the query. --->
	<cfset QueryAddRow( qPart ) />

	<!--- Populate row with random data. --->
	<cfset qPart[ "id" ][ qPart.RecordCount ] = JavaCast( "int", qPart.RecordCount ) />
	<cfset qPart[ "serial_number" ][ qPart.RecordCount ] = JavaCast( "string", CreateUUID() ) />
	<cfset qPart[ "price" ][ qPart.RecordCount ] = JavaCast( "float", RandRange( 1, 100 ) ) />

</cfloop>


<!--- Test the StringBuffer version. --->
<cftimer
	type="outline"
	label="Java String Buffer">

	<!--- Get the CSV value. --->
	<cfset strOutput = QueryToCSV(
		qPart,
		"id, serial_number, price"
		) />

	Done.

</cftimer>


<!--- Test the array-only version. --->
<cftimer
	type="outline"
	label="ArrayToList() Only">

	<!--- Get the CSV value. --->
	<cfset strOutput = QueryToCSV2(
		qPart,
		"id, serial_number, price"
		) />

	Done.

</cftimer>

I am happy to say that both of them executed in milliseconds, so no matter which way you go, you can't really go wrong. The StringBuffer method did, however, slightly outperform the ArrayToList() method over about 10 trials:

QueryToCSV() - ArrayToList() And StringBuffer

Average Time: 295 ms

QueryToCSV2() - ArrayToList() Only

Average Time: 373 ms

As you can see, we're really only talking fractions of a second, so very little difference. That being said, as much as the StringBuffer was slightly faster, definitely go with whatever version you feel more comfortable with - they are both quite fast.

Want to use code from this post? Check out the license.

Reader Comments

6 Comments

Hi Ben,

Nice little research!... However I am really interested to know about the performance stats of POIUtility for this same scenario...

2 Comments

This is a nice utility, thanks for creating it. However, the one thing that doesn't work for me is date fields.

15,665 Comments

@Chris,

I suppose you can either modify your SQL to format the date before you pass it to the function. Or, you can add column-type checks to the CSV method to check for date data types. I am not sure which way I would go. Personally, I think I would lean towards formatting the date in the SQL.

2 Comments

This is what I changed to make it work the way I wanted:

<cfset LOCAL.querydata = ARGUMENTS.Query[ LOCAL.ColumnNames[ LOCAL.ColumnIndex ] ][ ARGUMENTS.Query.CurrentRow ] >

<cfif isdate(LOCAL.querydata)>
<cfset LOCAL.querydata = dateformat(local.querydata,"mmm dd, yyyy")>
</cfif>

<!--- Add the field to the row data. --->
<cfset LOCAL.RowData[ LOCAL.ColumnIndex ] = """#Replace( local.querydata, """", """""", "all" )#""" />

2 Comments

Hello Ben,

I don't want to have the Chr(10)&chr(13) at the end of each csv file i generate with you're function. Do you you an idea for fix your QueryToCSV function ?

Regards

15,665 Comments

@Laurent,

Piece of cake - only add the new line characters if you are NOT at the end of the query loop. You'll probably need something like:

<cfif (ARGUMENTS.Query.CurrentRow NEQ ARGUMENTS.Query.RecordCount)>

.... add new line.

</cfif>

This way, when the query is in the last row of the loop, it will not add the new line.

1 Comments

Hey Ben,

Just a heads up that i used this function on a coldfusion 7 server and it didn't work.

After a quick look, I realised that the function uses coldfusion 8 implicit struct and array creation.

Might be worthwhile putting a note up about this for anyone else using older versions of coldfusion.

Jon

p.s I've used this function heaps of times on coldfusion 8 servers before I noticed this today. Keep up the good work!

15,665 Comments

@Jon,

Yeah, that's always a tough thing to deal with - version compatibility. Glad you like it when you're on a CF8 box. Technically, you can get this to work on a CF7 box if you swap out bits, but its a pain.

1 Comments

Ben,
Recently we have been converting many of our sites from Application.cfm to Application.cfc. In that file we store our component classes as an application scoped variable in the onApplicationStart function. For example:
<code>
<cfset application.remarketingGateway = createObject("component","components.remarketingGateway").init(application.dsn)>
</code>

In the above code application.remarketingGateway holds all the methods which report sales. In those whe produce a variety of reports in XML,CSV and PDF and we used to call it this way:

<code>
<cfinvoke component="components.remarketingGateway" method="currentBenchmarkReport" returnvariable="getCurrentBenchmarkReport">
<cfinvokeargument name="custID" value="#Session.Login.Company.CustID#">
<cfinvokeargument name="startDate" value="#form.fromDate#">
<cfinvokeargument name="endDate" value="#form.toDate#">
</cfinvoke>
</code>

But when we utilize the components scoped in the Application.cfc file this way:

<code>
<cfset getCurrentBenchmarkReport = application.remarketingGateway.currentBenchmarkReport(Session.Login.Company.CustID, form.fromDate, #form.toDate#)>
</code>

We get the following error:

The routine QueryToCSV has been declared twice in different templates.

We know why this happens and have reverted back to the old way, but I thought it was an interesting issue you might like to test for yourself. Feel free to contact me for further details.

15,665 Comments

@Christopher,

That's a really odd error. I can't imagine why it is doing that if you are only calling methods on cached components. Very strange.

15,665 Comments

@Nikos,

You could default the Fields argument to be the column list from the passed-in query:

<cfargument
name="Fields"
required="false"
default="#arguments.query.columnList#"
/>

22 Comments

I ask because I usually just want to have a stored proc get specific data for a specific csv function and don't want to have to type out all the columns like this:

<cfreturn QueryToCSV(sp,"Client name,Client ID,Trade date,Due date,Transaction,Deposit / refund CCY,Deposit / refund amount")>

many thanks bro

15,665 Comments

@Josh,

You could use either the CFFile/Write tag or the fileWrite() method (depending on what version of ColdFusion you have).

22 Comments

Thanks mate, I done that but I get this error:

faultCode:Server.Processing faultString:'[Table (rows 30 columns CLIENT NAME, CLIENT ID, TRADE DATE, DUE DATE, TRANSACTION, 'DEPOSIT / REFUND CCY', 'DEPOSIT / REFUND AMOUNT'): [CLIENT NAME: coldfusion.sql.QueryColumn@629fe5] [CLIENT ID: coldfusion.sql.QueryColumn@ef9676] [TRADE DATE: coldfusion.sql.QueryColumn@17f30f2] [DUE DATE: coldfusion.sql.QueryColumn@1667e62] [TRANSACTION: coldfusion.sql.QueryColumn@1f361fb] ['DEPOSIT / REFUND CCY': coldfusion.sql.QueryColumn@1a5da07] ['DEPOSIT / REFUND AMOUNT': coldfusion.sql.QueryColumn@18d2964] ] is not indexable by arguments.query.columnList' faultDetail:'null'

22 Comments

@Ben Nadel,

Thanks but if I do that it orders it alphabetically, not the same order from the database. I think its a colfusion setting.

15,665 Comments

@Nikos,

Yes, the column list comes out of the query in alphabetical order. If you want to custom order them, just pass them in with the method call.

I am not sure what the order of the columns in the database have to do with anything? I am confused?

15,665 Comments

@Nikos,

Ah, gotcha. You can try using the undocumented method on the query object:

Query.GetColumnNames()

I believe this returns a list of columns in the original order of the select statement... but not 100% sure.

1 Comments

hi Ben,

This is what i want awesome,but i and some help from you regarding date values.

i am sending date column to from query to CV but when i open CSV that date field values point to time not date values in CSV . Will you please guide me how to change that to date.

Below is what chris posted on your blog about date values.

Will you please tell me where i can add this code to work with date values coming from query.

Thanks very much for this nice awesome code.

@chris for your comment
This is what I changed to make it work the way I wanted:

<cfset LOCAL.querydata = ARGUMENTS.Query[ LOCAL.ColumnNames[ LOCAL.ColumnIndex ] ][ ARGUMENTS.Query.CurrentRow ] >

<cfif isdate(LOCAL.querydata)>
<cfset LOCAL.querydata = dateformat(local.querydata,"mmm dd, yyyy")>
</cfif>

<!--- Add the field to the row data. --->
<cfset LOCAL.RowData[ LOCAL.ColumnIndex ] = """#Replace( local.querydata, """", """""", "all" )#""" />

1 Comments

Hey guys,
I have recently been involved in a project which requires exporting data to CSV and originally used this UDF (it works great by the way!) but I would like to point out that in ColdFusion9 the cfspreadsheet tag can output a query to .XLS, CSV or HTML!

15,665 Comments

@CFNew,

It's been a while since I've done anything in Excel; but, I believe that if you are only seeing Time, it's actually a setting on the Excel column. Try looking at the formatting of the column cells to see if there is an option show date AND time.

@Joe,

Very cool! I still do most of my programming on ColdFusion 8 these days (I know, I know), so I haven't really explored the CFSpreadSheet tag too much. From the demos I've seen, it looks to be pretty badass.

1 Comments

Hi Ben,

why do you need to have so many double quotes when adding the field and field name to the row data?

-----------------------------------------
<cfset LOCAL.RowData[ LOCAL.ColumnIndex ] = """#LOCAL.ColumnNames[ LOCAL.ColumnIndex ]#""" />

<cfset LOCAL.RowData[ LOCAL.ColumnIndex ] = """#Replace( ARGUMENTS.Query[ LOCAL.ColumnNames[ LOCAL.ColumnIndex ] ][ ARGUMENTS.Query.CurrentRow ], """", """""", "all" )#""" />
-----------------------------------------

I am not getting that part. Also, I need to add " actually to my field values.

Thanks for that code really useful and sorry for digging out some old posts !

Cheers,
K.

2 Comments

Thanks for writing this. I added the following to make the fields argument optional.

<cfif isDefined( "arguments.fields" ) >
	<cfloop
		index="LOCAL.ColumnName"
		list="#ARGUMENTS.Fields#"
		delimiters=",">

		<!--- Store the current column name. --->
		<cfset ArrayAppend(
			LOCAL.ColumnNames,
			Trim( LOCAL.ColumnName )
			) />

	</cfloop>
<cfelse>
	<cfset LOCAL.ColumnNames = qresults.getMetaData().getColumnLabels() >
</cfif>
1 Comments

I am using this to export stored procedure data to a CSV and it works great! The only change I made was to check the CurrentRow against the RecordCount to break the loop. I was getting multiple newline chars at the end of the file.

<cfif (ARGUMENTS.Query.CurrentRow LTE ARGUMENTS.Query.RecordCount)>
<!--- Add the field to the row data. --->
<cfset LOCAL.RowData[ LOCAL.ColumnIndex ] = """#Replace( ARGUMENTS.Query[ LOCAL.ColumnNames[ LOCAL.ColumnIndex ] ][ ARGUMENTS.Query.CurrentRow ], """", """""", "all" )#""" />
</cfif>

I also have a requirement to generate this as a read-only file. I have researched adding an attribute to the file, but have came up empty. Any suggestions on how to do this?

I believe in love. I believe in compassion. I believe in human rights. I believe that we can afford to give more of these gifts to the world around us because it costs us nothing to be decent and kind and understanding. And, I want you to know that when you land on this site, you are accepted for who you are, no matter how you identify, what truths you live, or whatever kind of goofy shit makes you feel alive! Rock on with your bad self!
Ben Nadel