Converting A ColdFusion Query To CSV Using QueryToCSV()

Posted May 19, 2008 at 9:04 AM by Ben Nadel

Tags: ColdFusion

UPDATE: I have posted an updated, faster more accurate version of this UDF here.

The other day, someone asked me about converting a ColdFusion query to a "comma separated values" file, or CSV. I was about to point them to my QueryToCSV() user defined function when I realized something crazy - I didn't have one. I have done a lot of work with converting CSV values into queries, but I guess I just never did as much going the other way as the task is significantly less complicated. However, in an effort to make future explanations easier, I have created one below:

  • <cffunction
  • name="QueryToCSV"
  • access="public"
  • returntype="string"
  • output="false"
  • hint="I take a query and convert it to a comma separated value string.">
  •  
  • <!--- Define arguments. --->
  • <cfargument
  • name="Query"
  • type="query"
  • required="true"
  • hint="I am the query being converted to CSV."
  • />
  •  
  • <cfargument
  • name="Fields"
  • type="string"
  • required="true"
  • hint="I am the list of query fields to be used when creating the CSV value."
  • />
  •  
  • <cfargument
  • name="CreateHeaderRow"
  • type="boolean"
  • required="false"
  • default="true"
  • hint="I flag whether or not to create a row of header values."
  • />
  •  
  • <cfargument
  • name="Delimiter"
  • type="string"
  • required="false"
  • default=","
  • hint="I am the field delimiter in the CSV value."
  • />
  •  
  • <!--- Define the local scope. --->
  • <cfset var LOCAL = {} />
  •  
  • <!---
  • First, we want to set up a column index so that we can
  • iterate over the column names faster than if we used a
  • standard list loop on the passed-in list.
  • --->
  • <cfset LOCAL.ColumnNames = {} />
  •  
  • <!---
  • Loop over column names and index them numerically. We
  • are going to be treating this struct almost as if it
  • were an array. The reason we are doing this is that
  • look-up times on a table are a bit faster than look
  • up times on an array (or so I have been told).
  • --->
  • <cfloop
  • index="LOCAL.ColumnName"
  • list="#ARGUMENTS.Fields#"
  • delimiters=",">
  •  
  • <!--- Store the current column name. --->
  • <cfset LOCAL.ColumnNames[ StructCount( LOCAL.ColumnNames ) + 1 ] = Trim( LOCAL.ColumnName ) />
  •  
  • </cfloop>
  •  
  • <!--- Store the column count. --->
  • <cfset LOCAL.ColumnCount = StructCount( LOCAL.ColumnNames ) />
  •  
  •  
  • <!---
  • Now that we have our index in place, let's create
  • a string buffer to help us build the CSV value more
  • effiently.
  • --->
  • <cfset LOCAL.Buffer = CreateObject( "java", "java.lang.StringBuffer" ).Init() />
  •  
  • <!--- Create a short hand for the new line characters. --->
  • <cfset LOCAL.NewLine = (Chr( 13 ) & Chr( 10 )) />
  •  
  •  
  • <!--- Check to see if we need to add a header row. --->
  • <cfif ARGUMENTS.CreateHeaderRow>
  •  
  • <!--- Loop over the column names. --->
  • <cfloop
  • index="LOCAL.ColumnIndex"
  • from="1"
  • to="#LOCAL.ColumnCount#"
  • step="1">
  •  
  • <!--- Append the field name. --->
  • <cfset LOCAL.Buffer.Append(
  • JavaCast(
  • "string",
  • """#LOCAL.ColumnNames[ LOCAL.ColumnIndex ]#"""
  • )
  • ) />
  •  
  • <!---
  • Check to see which delimiter we need to add:
  • field or line.
  • --->
  • <cfif (LOCAL.ColumnIndex LT LOCAL.ColumnCount)>
  •  
  • <!--- Field delimiter. --->
  • <cfset LOCAL.Buffer.Append(
  • JavaCast( "string", ARGUMENTS.Delimiter )
  • ) />
  •  
  • <cfelse>
  •  
  • <!--- Line delimiter. --->
  • <cfset LOCAL.Buffer.Append(
  • JavaCast( "string", LOCAL.NewLine )
  • ) />
  •  
  • </cfif>
  •  
  • </cfloop>
  •  
  • </cfif>
  •  
  •  
  • <!---
  • Now that we have dealt with any header value, let's
  • convert the query body to CSV. When doing this, we are
  • going to qualify each field value. This is done be
  • default since it will be much faster than actually
  • checking to see if a field needs to be qualified.
  • --->
  •  
  • <!--- Loop over the query. --->
  • <cfloop query="ARGUMENTS.Query">
  •  
  • <!--- Loop over the columns. --->
  • <cfloop
  • index="LOCAL.ColumnIndex"
  • from="1"
  • to="#LOCAL.ColumnCount#"
  • step="1">
  •  
  • <!--- Append the field value. --->
  • <cfset LOCAL.Buffer.Append(
  • JavaCast(
  • "string",
  • """#ARGUMENTS.Query[ LOCAL.ColumnNames[ LOCAL.ColumnIndex ] ][ ARGUMENTS.Query.CurrentRow ]#"""
  • )
  • ) />
  •  
  • <!---
  • Check to see which delimiter we need to add:
  • field or line.
  • --->
  • <cfif (LOCAL.ColumnIndex LT LOCAL.ColumnCount)>
  •  
  • <!--- Field delimiter. --->
  • <cfset LOCAL.Buffer.Append(
  • JavaCast( "string", ARGUMENTS.Delimiter )
  • ) />
  •  
  • <cfelse>
  •  
  • <!--- Line delimiter. --->
  • <cfset LOCAL.Buffer.Append(
  • JavaCast( "string", LOCAL.NewLine )
  • ) />
  •  
  • </cfif>
  •  
  • </cfloop>
  •  
  • </cfloop>
  •  
  •  
  • <!--- Return the CSV value. --->
  • <cfreturn LOCAL.Buffer.ToString() />
  • </cffunction>

As you can see, you have to pass in the query object and the list of field names that you want to output (in a given order). By default, the QueryToCSV() UDF assumes you want to create a header row and to use the comma as your field delimiter. When I create the CSV values, I am qualifying every field value with double quotes. While this is not necessary, it makes the process faster; if I had to evaluate each field to see if it required a field qualifier, the process would be much slower.

And, just a little sample code to see how easily this can be applied:

  • <!--- Create a new part query. --->
  • <cfset qPart = QueryNew(
  • "id, serial_number, price",
  • "cf_sql_integer, cf_sql_varchar, cf_sql_decimal"
  • ) />
  •  
  •  
  • <!--- Create some record. --->
  • <cfloop
  • index="intI"
  • from="1"
  • to="1000"
  • step="1">
  •  
  • <!--- Add a row to the query. --->
  • <cfset QueryAddRow( qPart ) />
  •  
  • <!--- Populate row with random data. --->
  • <cfset qPart[ "id" ][ qPart.RecordCount ] = JavaCast( "int", qPart.RecordCount ) />
  • <cfset qPart[ "serial_number" ][ qPart.RecordCount ] = JavaCast( "string", CreateUUID() ) />
  • <cfset qPart[ "price" ][ qPart.RecordCount ] = JavaCast( "float", RandRange( 1, 100 ) ) />
  •  
  • </cfloop>
  •  
  •  
  • <!--- Get the CSV value. --->
  • <cfset strOutput = QueryToCSV(
  • qPart,
  • "id, serial_number, price"
  • ) />

Not much to see here, this is mostly for future reference on how to take a ColdFusion query and quickly and easily convert it to a CSV value.




Reader Comments

May 19, 2008 at 9:51 AM // reply »
1 Comments

http://cfzen.instantspot.com/blog/2007/4/18/queryToCsv2-util-function-updated

;}


May 19, 2008 at 11:55 AM // reply »
66 Comments

I thought it might be helpful to add the following to the end of your "little sample code" (assuming someone want's to make the .csv file available for download):

<cfheader name="content-disposition" value="attachment; filename=download.csv" />
<cfcontent type="text/plain" /><cfoutput>#strOutput#</cfoutput>

(Or, you could substitute "application/msexcel" for the content type and achieve the same thing.)


May 19, 2008 at 1:18 PM // reply »
1 Comments

I can't believe you went about it this way.
I guess knowing about underlying java techniques makes you overthink things, wich just aren't needed in coldfusion.

I think just using cfcontent and cfoutput is enough.
the columnnames can be done with as much as:
#listChangeDelims(query.columnnames,';',',')#

I realize this sometimes too, that Im just overthinking.


Jim
May 19, 2008 at 1:50 PM // reply »
16 Comments

Hi Ben,

I noticed that you store column values as a struct, as opposed to looping over the list. You stated that you do this for speed reasons. I found this not to be true.

I took your example code and re-wrote it to use lists instead of structs.

Running the example, there was a no difference between the lists and structs. However, I noticed a slight speed increase (2%) using the list method, when I increased the list size to 20.

--j


May 19, 2008 at 2:28 PM // reply »
32 Comments

I have been working on a similar issue lately and I have found that the arrayAppend/ArrayToList is SIGNIFICANTLY faster than the string buffer. This is especially noticeable when working on particularly large datasets--100k records or so.


May 20, 2008 at 12:11 PM // reply »
46 Comments

I have to agree with David. I also try to use Arrays whenever possible as I have found it faster than looping through lists or structures. I havent tested the whole array/list/struct looping in 8 and see the speed differences.


May 21, 2008 at 10:13 AM // reply »
41 Comments

@Steve
Don't forget to include ";charset=utf-8" (CF's default output mode) to support non-ascii characters! Or else whatever charset you specified in <cfprocessingdirective> if you did so.

<cfcontent type="text/plain; charset=utf-8">


May 21, 2008 at 10:42 AM // reply »
11,235 Comments

@All,

I am a bit shocked that looping over a list is faster than looping over a collection. I didn't actually test the speed, but maybe you guys are right. I will run some tests on this.

As far as the use of Arrays, Jim did demonstrate to me that using an Array for each row, then appending that to the StringBuffer was very performant.

@David,

Again, I would be shocked if using just arrays for such large records would be faster on such huge files.

I think I need to dig into this a bit more. Thanks for the great feedback.


May 21, 2008 at 2:03 PM // reply »
41 Comments

@Jim & @Ben
I did some tests of my own to compare various loop types for performance.

In short: <cfloop array> is the fastest object loop I could find, with <cfloop list> being a close second. <cfloop collection> definitely didn't perform as well as these. I talk a bit about why I think this is for each type of loop over on my blog.

http://www.bandeblog.com/2008/05/relative-cfloop-performance-for-various-loop-structures/


May 21, 2008 at 2:09 PM // reply »
11,235 Comments

@Eric,

Wow. I am a bit shocked at the results.


May 22, 2008 at 9:00 AM // reply »
66 Comments

@Eric,
Good point on including utf-8 for encoding ... also thanks for sharing the loop performance info.


May 22, 2008 at 9:04 AM // reply »
8 Comments

Hi guys, thanks for share experience, I would share the way I do CSV. Im using SQL query to create CSV line like:

SELECT (
'"' + Data1 + '",'
+ '"' + Data2 + '",'
+ '"' + Data3 + '",'
+ '"' + Data4 + '"' ) as Line
FROM ......

and then just put result into the file:

<cfset filecontent = ValueList(q_csv.line,"#CHR(13)#")>
<cffile action="write" file="#fName#" output="#filecontent#" addnewline="yes">


May 22, 2008 at 9:25 AM // reply »
32 Comments

@Misha

This a great solution--assuming the data is coming from a DB table AND you know ahead of time which fields you'll be pulling.

In my case, I'm actually using a Text Driver DSN to read in a CSV with commas, then create a new version of the file that is double pipe delimited (||). Each file could have a different set of columns. . .


May 22, 2008 at 9:30 AM // reply »
8 Comments

Devid. yes, but you always can convert all data into the string like:

convert(nvarchar(50),qty)

and put some NULL validation as well like:

convert(nvarchar(50),ISNULL(SUM(QtyOrdered),0))

ISNULL(data1,'')


May 22, 2008 at 9:39 AM // reply »
32 Comments

@Misha

I should have added before that my process of replicating and then modifying a CSV file would probably be best handled OUTSIDE of CF altogether but I'm working on an existing process.

I'm not sure whether that last bit on converting the data and setting nulls would work as my goal is to avoid, as much as possible, changing the data that my user has placed in the original--but I will look into it. I'm also not sure that, in my case, this will relieve CF at all, since I'm assuming that the driver for reading the original file isn't relying on the SQL server in any way.


May 22, 2008 at 9:50 AM // reply »
11,235 Comments

@Misha,

Very cool solution. I would have never thought of that.


May 22, 2008 at 10:43 AM // reply »
41 Comments

@Misha
Don't forget you have to escape any columns which might contain double quotes or line feeds.


May 22, 2008 at 11:09 AM // reply »
8 Comments

Eric. From that point regardless what method you use to create CSV you should care about.


May 22, 2008 at 11:16 AM // reply »
41 Comments

I don't disagree =)


May 27, 2008 at 8:11 AM // reply »
11,235 Comments

Based on the feedback I have gotten, I created an updated version of of the QueryToCSV() method here:

http://www.bennadel.com/index.cfm?dax=blog:1239.view

@Eric,

Thanks for catching the whole escaping quotes point. I totally forgot to do that in my original version of the method. This has been fixed in the updated version. Dynamite catch :)


Mar 23, 2009 at 1:14 AM // reply »
1 Comments

Hello,

I was wondering if you had som more examples of on how to use your QuerytoCSV calling a stored procedure or som actual queries.

Thanks in advance. I like what you have created.


Mar 23, 2009 at 8:24 AM // reply »
11,235 Comments

@SDaniel,

As this is a ColdFusion function, it cannot be called from a stored procedure. Also, there is no difference between a manually constructed query as I have and query returned from the database in terms of how the QueryToCSV() method would be called.


Jul 21, 2009 at 10:22 AM // reply »
1 Comments

I think this function would be better if the fields argument was not required.

If not given the function uses all fields.


Jul 21, 2009 at 6:20 PM // reply »
11,235 Comments

@Anthony,

You could set the fields arguments to be optional and then give it a default value of #arguments.query.columnList#. Of course, then you'd have to be sure to use named-arguments rather than ordered ones.


Jun 8, 2010 at 5:07 PM // reply »
9 Comments

For several years now, I have been using a solution similar to Misha's for creating a CSV file from a dynamic list of fields (concatenating all the columns in a sql query).

Unfortunately, some of the list of fields have become very large (300+ columns) and I am getting a SQL server error "The query processor ran out of stack space during query optimization." In one case, eliminating a single concatenation allows the query to run.

I have searched but have not found any suggestions on how I might increase the stack space to eliminate the issue. Most of the solutions involve rewriting the query.

I am therefore going to have to implement a new solution, and hope that Ben's excellent approach will work.

Has anyone tried it with very large data sets (columns and rows)?


Jun 9, 2010 at 9:47 AM // reply »
11,235 Comments

@Tom,

I have never tried it with anything even approaching 300 columns. Good luck!


Dec 1, 2010 at 9:45 AM // reply »
12 Comments

@Ben

You could make an arg for the field delim, and set your preferred delim as the default. Then if someone wanted no delims they could pass delim="". Should be just as fast to do """ as #delim#, right?


Dec 1, 2010 at 10:49 PM // reply »
11,235 Comments

@Grant,

My good man, there is already an argument for a field delimiter that defaults to comma - this can be overridden with the method is invoked. Unless you're referring to the line delimiter (new line, etc.).


Dec 2, 2010 at 12:29 AM // reply »
12 Comments

@Ben,

Actually, I meant the column escape character (triple quotes). Thought there was a comment about qualifier being faster to "just do". But I may have misunderstood the comment.

Apologies about the wrong terminology. Loooong couple of days and hit enter too soon.


Dec 5, 2010 at 1:59 PM // reply »
11,235 Comments

@Grant,

Ahh, I see what you're talking about. That's an interesting idea.


Aug 23, 2011 at 10:11 PM // reply »
2 Comments

@Misha

I tried your:

SELECT (
'"' + Data1 + '",'
+ '"' + Data2 + '",'
+ '"' + Data3 + '",'
+ '"' + Data4 + '"' ) as Line
FROM ......

Unfortunately I'm using MYSQL, using a different concat operator.


Sep 27, 2012 at 11:18 AM // reply »
2 Comments

@vlad

I'm using MySql 5.5 & this works fine:

SELECT (CONCAT(
tbltollfree.tfNumber, ",1,",
tbltollfree.tfRingTo)) as Line
FROM tbltollfree
ORDER BY tbltollfree.tfNumber

Question to Misha or all?
- How do I "Append" some header & footer text to this query?
- Header1 = ;NVFILE 2.0
- header2 = DN,W1,DN1
- Footer = ;NVFILE END

Example of output would be:

;NVFILE 2.0
DN,W1,DN1
8002081111,1,1234567890
8002251112,1,1234567891
8002211113,1,1234567892
8002311114,1,1234567893
;NVFILE END



Post A Comment

Comment Etiquette: Please do not post spam. Please keep the comments on-topic. Please do not post unrelated questions or large chunks of code. And, above all, please be nice to each other - we're trying to have a good conversation here.

Please review the following issues:

Author Name:


Author Email:

Author Website:

Comment:

Supported HTML tags for formatting: <strong>bold</strong>   <em>italic</em>   <code>code</code>







  • Help Wanted - Find Your Next ColdFusion Job
Ben Nadel's Company - Epicenter Consulting Recent Blog Comments
May 17, 2013 at 7:42 PM
HashKeyCopier - An AngularJS Utility Class For Merging Cached And Live Data
Ben - thanks so much for posting these Angular articles and findings, they've been a huge help towards learning one of the more 'complex' JavaScript frameworks out there (IMO). I have been using Angu ... read »
May 16, 2013 at 5:01 PM
UPDATE: Parsing CSV Data Files In ColdFusion With csvToArray()
Your code was the closest thing I've found to obtaining some direction for converting ISO fields to values that CF can translate properly. Thank you for posting! ... read »
May 15, 2013 at 10:37 PM
Very Simple Pusher And ColdFusion Powered Chat
hi id making plz easy ... read »
May 15, 2013 at 6:07 PM
Making SOAP Web Service Requests With ColdFusion And CFHTTP
Ben, you once again saved my bacon at work. Thank you, thank you, thank you! ... read »
May 15, 2013 at 4:15 PM
What If All User Interface (UI) Data Came In Reports?
@Josh, Thanks! @Ben, I definitely recommend the David West book "Object Thinking" I've been quoting from. It goes deeply into the philosophy and history of OO programming. His breadth ... read »
May 15, 2013 at 11:36 AM
Ask Ben: Print Part Of A Web Page With jQuery
I found this helpfull when you need to keep (refresh) the original parent page after closing the iframe child print dialog (Hoping you're not using a form at this time so it won't submit again): On ... read »
May 14, 2013 at 7:13 PM
What If All User Interface (UI) Data Came In Reports?
@Jonah, If there's any books you'd recommend on the subject of domain modelling, I'd love to hear it. I just downloaded the free PDF of "Domain Driven Design Quickly". Figured I'd give it ... read »
May 14, 2013 at 6:57 PM
The UX Of Prototyping: Low-Fidelity Is The New High-Fidelity
@Phillip, I'm not sure I follow what you mean? Are you saying that you looked at the list of widgets provided by the jQuery UI and let that be your style guide? ... read »
InVision App - Prototyping Made Beautiful With Prototyping Tools