UPDATE: Parsing CSV Data Files In ColdFusion With csvToArray()

Posted October 22, 2010 at 5:37 PM by Ben Nadel

Tags: ColdFusion

A couple of years ago, with the help of Steven Levithan, I wrote a user defined function (UDF) that could parse Comma-Separated-Value (CSV) data in ColdFusion. This UDF worked great, unless the very first field in your CSV data set was blank. In that case, it would be skipped over. Now, years later, I am finally getting around to fixing this problem.

To allow for your CSV data to begin with a blank field, I essentially switched the order in which the delimiter was checked. Previously, each field "match" was a delimiter followed by a field value. Now, it's a field value followed by a delimiter. This switched moved the boundary problem from the beginning to the end. But, once this switch was made, I simply had to add one special check for the empty delimiter (end of data) and break out of my parsing loop to avoid adding an extra, empty value.

  • <cffunction
  • name="csvToArray"
  • access="public"
  • returntype="array"
  • output="false"
  • hint="I take a CSV file or CSV data value and convert it to an array of arrays based on the given field delimiter. Line delimiter is assumed to be new line / carriage return related.">
  •  
  • <!--- Define arguments. --->
  • <cfargument
  • name="file"
  • type="string"
  • required="false"
  • default=""
  • hint="I am the optional file containing the CSV data."
  • />
  •  
  • <cfargument
  • name="csv"
  • type="string"
  • required="false"
  • default=""
  • hint="I am the CSV text data (if the file argument was not used)."
  • />
  •  
  • <cfargument
  • name="delimiter"
  • type="string"
  • required="false"
  • default=","
  • hint="I am the field delimiter (line delimiter is assumed to be new line / carriage return)."
  • />
  •  
  • <cfargument
  • name="trim"
  • type="boolean"
  • required="false"
  • default="true"
  • hint="I flags whether or not to trim the END of the file for line breaks and carriage returns."
  • />
  •  
  • <!--- Define the local scope. --->
  • <cfset var local = {} />
  •  
  • <!---
  • Check to see if we are using a CSV File. If so, then all we
  • want to do is move the file data into the CSV variable. That
  • way, the rest of the algorithm can be uniform.
  • --->
  • <cfif len( arguments.file )>
  •  
  • <!--- Read the file into Data. --->
  • <cfset arguments.csv = fileRead( arguments.file ) />
  •  
  • </cfif>
  •  
  • <!---
  • ASSERT: At this point, no matter how the data was passed in,
  • we now have it in the CSV variable.
  • --->
  •  
  • <!---
  • Check to see if we need to trim the data. Be default, we are
  • going to pull off any new line and carraige returns that are
  • at the end of the file (we do NOT want to strip spaces or
  • tabs as those are field delimiters).
  • --->
  • <cfif arguments.trim>
  •  
  • <!--- Remove trailing line breaks and carriage returns. --->
  • <cfset arguments.csv = reReplace(
  • arguments.csv,
  • "[\r\n]+$",
  • "",
  • "all"
  • ) />
  •  
  • </cfif>
  •  
  • <!--- Make sure the delimiter is just one character. --->
  • <cfif (len( arguments.delimiter ) neq 1)>
  •  
  • <!--- Set the default delimiter value. --->
  • <cfset arguments.delimiter = "," />
  •  
  • </cfif>
  •  
  •  
  • <!---
  • Now, let's define the pattern for parsing the CSV data. We
  • are going to use verbose regular expression since this is a
  • rather complicated pattern.
  •  
  • NOTE: We are using the verbose flag such that we can use
  • white space in our regex for readability.
  • --->
  • <cfsavecontent variable="local.regEx">(?x)
  • <cfoutput>
  •  
  • <!--- Make sure we pick up where we left off. --->
  • \G
  •  
  • <!---
  • We are going to start off with a field value since
  • the first thing in our file should be a field (or a
  • completely empty file).
  • --->
  • (?:
  •  
  • <!--- Quoted value - GROUP 1 --->
  • "([^"]*+ (?>""[^"]*+)* )"
  •  
  • |
  •  
  • <!--- Standard field value - GROUP 2 --->
  • ([^"\#arguments.delimiter#\r\n]*+)
  •  
  • )
  •  
  • <!--- Delimiter - GROUP 3 --->
  • (
  • \#arguments.delimiter# |
  • \r\n? |
  • \n |
  • $
  • )
  •  
  • </cfoutput>
  • </cfsavecontent>
  •  
  • <!---
  • Create a compiled Java regular expression pattern object
  • for the experssion that will be parsing the CSV.
  • --->
  • <cfset local.pattern = createObject(
  • "java",
  • "java.util.regex.Pattern"
  • ).compile(
  • javaCast( "string", local.regEx )
  • )
  • />
  •  
  • <!---
  • Now, get the pattern matcher for our target text (the CSV
  • data). This will allows us to iterate over all the tokens
  • in the CSV data for individual evaluation.
  • --->
  • <cfset local.matcher = local.pattern.matcher(
  • javaCast( "string", arguments.csv )
  • ) />
  •  
  •  
  • <!---
  • Create an array to hold the CSV data. We are going to create
  • an array of arrays in which each nested array represents a
  • row in the CSV data file. We are going to start off the CSV
  • data with a single row.
  •  
  • NOTE: It is impossible to differentiate an empty dataset from
  • a dataset that has one empty row. As such, we will always
  • have at least one row in our result.
  • --->
  • <cfset local.csvData = [ [] ] />
  •  
  • <!---
  • Here's where the magic is taking place; we are going to use
  • the Java pattern matcher to iterate over each of the CSV data
  • fields using the regular expression we defined above.
  •  
  • Each match will have at least the field value and possibly an
  • optional trailing delimiter.
  • --->
  • <cfloop condition="local.matcher.find()">
  •  
  • <!---
  • Next, try to get the qualified field value. If the field
  • was not qualified, this value will be null.
  • --->
  • <cfset local.fieldValue = local.matcher.group(
  • javaCast( "int", 1 )
  • ) />
  •  
  • <!---
  • Check to see if the value exists in the local scope. If
  • it doesn't exist, then we want the non-qualified field.
  • If it does exist, then we want to replace any escaped,
  • embedded quotes.
  • --->
  • <cfif structKeyExists( local, "fieldValue" )>
  •  
  • <!---
  • The qualified field was found. Replace escpaed
  • quotes (two double quotes in a row) with an unescaped
  • double quote.
  • --->
  • <cfset local.fieldValue = replace(
  • local.fieldValue,
  • """""",
  • """",
  • "all"
  • ) />
  •  
  • <cfelse>
  •  
  • <!---
  • No qualified field value was found; as such, let's
  • use the non-qualified field value.
  • --->
  • <cfset local.fieldValue = local.matcher.group(
  • javaCast( "int", 2 )
  • ) />
  •  
  • </cfif>
  •  
  • <!---
  • Now that we have our parsed field value, let's add it to
  • the most recently created CSV row collection.
  • --->
  • <cfset arrayAppend(
  • local.csvData[ arrayLen( local.csvData ) ],
  • local.fieldValue
  • ) />
  •  
  • <!---
  • Get the delimiter. We know that the delimiter will always
  • be matched, but in the case that it matched the end of
  • the CSV string, it will not have a length.
  • --->
  • <cfset local.delimiter = local.matcher.group(
  • javaCast( "int", 3 )
  • ) />
  •  
  • <!---
  • Check to see if we found a delimiter that is not the
  • field delimiter (end-of-file delimiter will not have
  • a length). If this is the case, then our delimiter is the
  • line delimiter. Add a new data array to the CSV
  • data collection.
  • --->
  • <cfif (
  • len( local.delimiter ) &&
  • (local.delimiter neq arguments.delimiter)
  • )>
  •  
  • <!--- Start new row data array. --->
  • <cfset arrayAppend(
  • local.csvData,
  • arrayNew( 1 )
  • ) />
  •  
  • <!--- Check to see if there is no delimiter length. --->
  • <cfelseif !len( local.delimiter )>
  •  
  • <!---
  • If our delimiter has no length, it means that we
  • reached the end of the CSV data. Let's explicitly
  • break out of the loop otherwise we'll get an extra
  • empty space.
  • --->
  • <cfbreak />
  •  
  • </cfif>
  •  
  • </cfloop>
  •  
  •  
  • <!---
  • At this point, our array should contain the parsed contents
  • of the CSV value as an array of arrays. Return the array.
  • --->
  • <cfreturn local.csvData />
  • </cffunction>

The interface to this version of the UDF is the same as it was before; the only difference is in how it works internally. You should be able to use this in-place of the old one without any problems.

To make sure this was working properly, I set up a few outlier test cases:

  • <h2>
  • Empty String
  • </h2>
  •  
  • <!--- Test the empty string. --->
  • <cfset result = csvToArray( csv = "" ) />
  •  
  • <!--- Output the results. --->
  • <cfdump
  • var="#result#"
  • label="Empty String"
  • />
  •  
  •  
  • <!--- ----------------------------------------------------- --->
  • <!--- ----------------------------------------------------- --->
  •  
  •  
  • <h2>
  • All Empty Fields
  • </h2>
  •  
  • <!--- This has two rows of four empty fields. --->
  • <cfsavecontent variable="csvData">
  • ,,,
  • ,,,
  • </cfsavecontent>
  •  
  • <!--- Parse the test data. --->
  • <cfset result = csvToArray(
  • csv = trim( csvData )
  • ) />
  •  
  • <!--- Output the results. --->
  • <cfdump
  • var="#result#"
  • label="All Empty Fields"
  • />
  •  
  •  
  • <!--- ----------------------------------------------------- --->
  • <!--- ----------------------------------------------------- --->
  •  
  •  
  • <h2>
  • A Single Value
  • </h2>
  •  
  • <!--- This has just a single row with single value. --->
  • <cfsavecontent variable="csvData">
  • Jill
  • </cfsavecontent>
  •  
  • <!--- Parse the test data. --->
  • <cfset result = csvToArray(
  • csv = trim( csvData )
  • ) />
  •  
  • <!--- Output the results. --->
  • <cfdump
  • var="#result#"
  • label="Single Value"
  • />
  •  
  •  
  • <!--- ----------------------------------------------------- --->
  • <!--- ----------------------------------------------------- --->
  •  
  •  
  • <h3>
  • Non-Qualified Fields
  • </h3>
  •  
  • <!--- This has non-qualified fields. --->
  • <cfsavecontent variable="csvData">
  • Name,Age,Hair
  • Sarah,,Brunette
  • Jill,29,
  • </cfsavecontent>
  •  
  • <!--- Parse the test data. --->
  • <cfset result = csvToArray(
  • csv = trim( csvData )
  • ) />
  •  
  • <!--- Output the results. --->
  • <cfdump
  • var="#result#"
  • label="Non-Qualified Fields"
  • />
  •  
  •  
  • <!--- ----------------------------------------------------- --->
  • <!--- ----------------------------------------------------- --->
  •  
  •  
  • <h3>
  • Qualified Fields
  • </h3>
  •  
  • <!--- This has non-qualified fields. --->
  • <cfsavecontent variable="csvData">
  • "Name","Age","Hair"
  • "Sarah ""Stubs"" Smith",29,
  • "Jill ""Sexy"" Jones",,Brunette
  • </cfsavecontent>
  •  
  • <!--- Parse the test data. --->
  • <cfset result = csvToArray(
  • csv = trim( csvData )
  • ) />
  •  
  • <!--- Output the results. --->
  • <cfdump
  • var="#result#"
  • label="Qualified Fields"
  • />
  •  
  •  
  • <!--- ----------------------------------------------------- --->
  • <!--- ----------------------------------------------------- --->
  •  
  •  
  • <h3>
  • Pushing The Limits
  • </h3>
  •  
  • <!--- This has all kinds of nested characters fields. --->
  • <cfsavecontent variable="csvData">
  • ,"Name","Nick Name","Age","Hair Color"
  • Kim,"Kim ""Hot Legs"" Smith",24,"Brunette"
  • "Sarah Vivenz, II","Stubs",27,"Brunette"
  • "Kit Kat",Kitty,34,Blonde,,,
  • "Even
  • Values With
  • Embedded Line Breaks"
  • </cfsavecontent>
  •  
  • <!--- Parse the test data. --->
  • <cfset result = csvToArray(
  • csv = trim( csvData )
  • ) />
  •  
  • <!--- Output the results. --->
  • <cfdump
  • var="#result#"
  • label="Pushing The Limits"
  • />

When we run this code, we get the following page output:

 
 
 
 
 
 
CSV Data Being Parsed By ColdFusion. 
 
 
 

As you can see, this version of the UDF can now properly parse the CSV data when the first field is empty.




Reader Comments

Oct 22, 2010 at 11:17 PM // reply »
7 Comments

Thanks @ben! You rock!


Oct 23, 2010 at 3:54 PM // reply »
11,246 Comments

@Stacey,

No problem - happy to finally get this fixed... after several years :)


Oct 25, 2010 at 4:46 AM // reply »
13 Comments

Cool,
up until now I have always been using the cfx tag text2Query (http://www.emerle.net/old/programming/display.cfm/t/cfx_text2query/)

The only problem with using this it seems pretty slow when outputting the result through cfdump, but no problem looping over the results


Oct 25, 2010 at 8:42 AM // reply »
11,246 Comments

@Richard,

CFDump can be a beast on its own since it outputs so much HTML.


Nov 2, 2010 at 4:24 PM // reply »
11 Comments

Great post!

On a side note, I'm still really disappointed that the cfspreadsheet tag doesn't accept CSV files.


Nov 3, 2010 at 10:34 AM // reply »
11,246 Comments

@Brian,

Thank my man. I really need to play around with CFSpreadsheet some more (ie. at all). Since I use CF8 in production still, it's been hard for me to really get motivated to try all the new features of CF9. My blog, at the very least, will be moving to CF9 soon; so, hopefully, then I'll put more effort into CF9 studies.


Nov 16, 2010 at 1:47 PM // reply »
6 Comments

I am trying to parse RETS realty listings for a site I'm building. This is exactly what I needed. Works great until I found a record with a field that contained a room dimension as 7'10" x 10'8". Most of the records didn't use single and double quotes like this. Your code stopped on this field.
I have no quoted fields. I attempted to adjust the reg expressions with no success. Could you help me?
I can send you a sample file and my test code file.


Nov 16, 2010 at 4:40 PM // reply »
11 Comments

@Gene,

Without looking at your code I'd suggest the PreserveSingleQuotes() function.

http://livedocs.adobe.com/coldfusion/8/htmldocs/help.html?content=functions_m-r_14.html


Nov 16, 2010 at 7:09 PM // reply »
6 Comments

Thanks Brian. I tried it with no success. The output array has no more content (or cells) when it hits this field and the input record has several more fields. I'm thinking the regEx that's trying to evaluate it is failing.


Nov 17, 2010 at 2:25 PM // reply »
7 Comments

Ok, finally got around to actually using this new version and not my hacked version, but the following line has CF barking at me:

<cfset local.csvData = [ [] ] />

Invalid CFML construct found on line 401 at column 32.

ColdFusion was looking at the following text:
[

The CFML compiler was processing:

A cfset tag beginning on line 401, column 10.
A cfset tag beginning on line 401, column 10.

It's line 401 in my case sense I added this to an existing CFC I have.


Nov 18, 2010 at 10:58 AM // reply »
6 Comments

Stacey,
I'm using CF8 and got the same error initially. I commented the line out with no apparent issues. There were a couple other spots I needed to modify for CF8 also, but I didn't document them. My version is working for me, except for the change I am working on for the regEx noted in my previous post.


Nov 18, 2010 at 11:02 AM // reply »
11,246 Comments

@Gene,

How are your fields delimited? Tabs? Commas? I'll do some testing to see what I can come up with.

@Stacey,

That [[]] notation is an implicit array. I think you might need to be running 8.0.1 or higher for that to work. The nested arrays, I think, was an updater.


Nov 18, 2010 at 11:04 AM // reply »
7 Comments

@Gene,

I've just went back to the previous version. I had already hacked it to work, and am hacking it some more for some other cases I need to handle. Some of my data looks like:

"Sam Smith" <sam@smith.com>,
Jim Jones <jim@abc.com>,
"Jones, Bob" <bob@big.com>,
Sally Jo <sally@xyz.com>

And I need it to handle that kind of quoted case as well. I managed to do that by creating a new regex for that type of field value.


Nov 18, 2010 at 11:06 AM // reply »
7 Comments

@Ben,

Yep, running 8 here. :-( I did change how it creating the array, and got past that, but there were some other things too, so I just went back to my previous hacked version. Lie I told @gene, I had to make some other hacks too, so no big deal. I'll tell you though, this has been invaluable! Many kudos for your work on this!


Nov 18, 2010 at 11:10 AM // reply »
11,246 Comments

@Stacey,

No problem. Sorry that this one threw a curve-ball at you :)


Nov 18, 2010 at 11:14 AM // reply »
11,246 Comments

@Gene,

What are they using to produce the CSV data? If I remember correctly, this UDF was built on the CSV standard, which suggests that all fields with embedded special characters are themselves wrapped in double-quotes. Are they manually creating the data?

I can probably help you with the parsing, but I may be hesitant to put that kind of logic in the UDF as it appears to be non-standard.


Nov 18, 2010 at 12:11 PM // reply »
6 Comments

Ben,
My CSV file is tabbed and no, it doesn't double quote anything. I suspect I am working with an unusual circumstance and wouldn't suggest including it in the UDF.
I planned on dropping the test for double quoted fields to simplify it for my situation and then adjust the regEx to catch the example I listed in my previous post. If you have time to look at it that would be great! I haven't done much with regEx.
Also, I am putting the data in a SQL database. I have the data types and lengths of the originating data. When I try to insert a field that is integer into the table I get a data type mismatch until I change the table column's type to varchar. I suspect I need to pull the values that are integer out of the array and assign them to a variable param'd as integer type to make the insertion work. I'm looking at around 75 fields with maybe 10 that are integer. Does this sound correct to you?


Nov 18, 2010 at 12:23 PM // reply »
11,246 Comments

@Gene,

If your fields are tab-delimited, you might make your life easier by using something like:

listToArray( data, "#chr( 13 )##chr( 10 )#" )

... to break out rows and then:

listToArray( rowData, chr( 9 ) )

... to break out cells for each row. Assuming that none of the fields have embedded line breaks or tabs, treating the CSV data as "nested lists" is actually gonna be way faster than parsing with RegEx (I would guess).

Is that possible, do you think?


Nov 18, 2010 at 12:48 PM // reply »
6 Comments

Ben,
I had decided to cfloop over the file so I am dealing with just one row at a time. I do have many empty cell though. Wouldn't I loose the empty cells with your suggestion?

I just finished a discussion with a friend that suggested I look at CFHTTP and converting to a query instead. His thought was to put "[NULL]" (or something) in the empty fields before running CFHTTP. I will look into that option.

Of course putting something in the empty cells first would allow me to do listToArray( rowData, chr( 9 ) ) as well.


Nov 18, 2010 at 1:16 PM // reply »
11,246 Comments

@Gene,

Ah, good point. For something like that, it might be better to dip into the Java layer to do some sort of split-type approach, which works a bit better with empty values... I think.

Let me play around with some ideas.


Nov 19, 2010 at 10:33 AM // reply »
11,246 Comments

@Gene,

First off, I totally forgot that listToArray() *can* handle empty values - there's an optional third argument as of ColdFusion 8.

That said, I put this together this morning - reSplit(). It's a splitting function that works off of regular expression patterns that can be used to define list delimiters:

http://www.bennadel.com/blog/2058-RESplit-Splitting-Strings-With-Regular-Expressions-In-ColdFusion.htm

That might help.


Nov 19, 2010 at 12:56 PM // reply »
6 Comments

Ben,
Your comment on listToArray()'s ability to handle empty values is right on. I have to confess I was still running CF7 on my workstation when I tried this some time ago with no luck. I upgraded now and tried it. I think it will solve my problem without using any regEx.
Thanks for your advice!


Nov 19, 2010 at 4:41 PM // reply »
11,246 Comments

@Gene,

Sounds groovy.


Jan 5, 2011 at 1:54 PM // reply »
4 Comments

I might be going around this in a really strange way, just trying to connect the dots between the few things i actually know how to do....

I want to use google spreadsheets as an easy to use/share database. I can take google's CSV output and convert it to an array with your instructions (awesome!). The next step is figuring out how to run a query of that array and output the values.

I was planning to take the array and convert to a query object (in the fashion of your XML to Query post.) But am hung up on 2 things:
1) I have this nagging feeling that I'm rigging up a way overly-complicated system;
2) I can't figure out how to identify the proper array values in this structure anyway.

Wisdom?


Jan 5, 2011 at 2:02 PM // reply »
4 Comments

@Jared,

I guess sometimes asking is all I need to do to clarify to myself what I need to be doing. I've got it, I've got it.

Ben, your posts are amazing. All the more amazing is the fact that people like me with really limited education/experience are able to pull off really cool stuff thanks to you!


Feb 4, 2011 at 10:54 AM // reply »
10 Comments

Hi Ben, and everyone,

I released a new CFML tag for Railo today, based on the code you published here! It's called <cfcsv>.
Read more at http://www.railodeveloper.com/post.cfm/railo-custom-tag-cfcsv

I added some features to your original code, like returning the parsed data as a query, and determining your own text qualifier. You can see the new code here: http://www.railodeveloper.com/svn.cfm?repositorypath=CFCSV/cfcsv/cfcsv.cfc%3AHEAD

Thanks for writing this code, much appreciated :-)

Paul


Feb 9, 2011 at 4:05 PM // reply »
3 Comments

I had to perform a similar task on a project couple yrs back wherein users could upload a csv file (fields in any order) and map it to DB fields. I used Microsoft text driver to read the text file. Basically I created a datasource in CF Admin pointing to the folder containing the csv files.

and then used cfquery to read the csv files. There are some issues with how the MS text driver reads the columns - it reads the first few rows and makes assumptions about datatypes, length, etc.

You can specify a data-type/length using a schema.ini file.


Mar 14, 2011 at 10:25 PM // reply »
4 Comments

Ben, have you measured the impact on heap space? The speed is impressive, but I'm running into heap issues. You had mentioned some large datasets, but when I tried going over 10K records, I run out of heap when using CF8 Standard with a heap size of 768MB. When I monitor the heap as the test program runs, it just keeps climbing. Forcing a GC doesn't help. Subsetting the # of records sometimes helps. Any ideas?


Jul 8, 2011 at 5:56 AM // reply »
5 Comments

Thanks for this, Ben. Needed to import a CSV of browsers/user agents mappings. :-)


Jul 22, 2011 at 10:00 AM // reply »
11,246 Comments

@Jared,

Ha ha, glad you got it sorted out yourself :) Sometimes, you just gotta say the question "out load" and it makes a lot more sense.

@Paul,

Very cool stuff!

@Dharm,

I've used the database driver approach a couple of years ago. I remember it being pretty awesome. We just kept overwriting a file and running the same query that would extract it. We had some problems with it, but I can't even remember what they were (it was like 5 years ago).

@Tom,

I can't say I've actually looked at any memory consumption, only at perceived speed. I suppose you could build a SAX-style parser for CSV files where, rather than XML events, the parser could raise "field" and "record" events. Actually, that could be kind of a fun little project to tinker with. That way, you could, theoretically, read a file of infinite size since there would never be more than N number of characters in memory at any given time.

I'll see if I can play around with that idea, for funzies. I want to get more into anything that is event-driven.

@Michael,

Awesome :)


Jul 22, 2011 at 10:19 AM // reply »
4 Comments

@Ben,
I'll think about the event driven approach as well. I've found his problem has more to do with when CF releases memory which I think you've raised before. I've even tried manually forcing garbage collection (a practice I don't like or recommend). Some has suggest running sections of code in threads which is supposed to cause CF to recover memory used by the thread. If I take the same amount of data and retrieve it using a query from the database, there is no problem. But try to read the data into a query from a CSV file, and you can watch memory getting consumed quickly.


Aug 16, 2011 at 11:17 AM // reply »
11,246 Comments

@Thomas,

This morning, I tried to create an evented parser for a CSV file using a Buffered Input Stream.

http://www.bennadel.com/blog/2241-Parsing-CSV-Data-With-An-Input-Stream-And-A-Finite-State-Machine.htm

I know this is from a while ago, but I thought you might find it interesting. I think state machines are pretty powerful. I'm trying to learn more about them.


Feb 15, 2012 at 2:41 PM // reply »
1 Comments

Hey Ben, I've been using this custom tag for a little while with great success, but now I'm moving to a shared server environment that does not allow use of Java objects... I know, right?

Any suggestions on how I can modify this to get rid of the Java object calls and functions? I'm not really sure how to translate the regex calls.

Any help would be greatly appreciated!


Mar 22, 2012 at 7:01 AM // reply »
3 Comments

Dear Ben,

your CSV2Array UDF ist very nice. The only problem I have is that it stops parsing after 9999 rows. Is there any way to make it parsing the whole CSV? Mine has around 14000 rows.

Thanks in advance and thx for this nice UDF.
Greetings
Bernhard


Mar 22, 2012 at 7:07 AM // reply »
3 Comments

Sorry, found the problem. cfdump stops at 9999. Your UDF is absolutely working. Thx again.


Apr 11, 2012 at 11:27 AM // reply »
1 Comments

Cool stuff. Very good. Thanks


Apr 24, 2012 at 1:06 PM // reply »
2 Comments

This is so useful! But I've encountered an odd issue. The import bombs when there are quotes within the delimited data. This is okay:

|"Test data"|

but this bombs:

|This "is test data"|

Any suggestions on how to deal with it? I don't want to do a global find/replace for quotes or items with the delimiter within the quotes will get screwy.


Sep 27, 2012 at 3:52 AM // reply »
8 Comments

G'day mate:
This is the first CSV parser that I've encountered that will handle any CSV-curveball that I can throw at it. Good stuff. Most people's implementations - including Adobe's and Microsoft Excel's don't appear to be based on the CSV "spec", and fail with things like embedded CRLFs and the like.

There are a coupla UDFs on CFLib that fall into this category (or worse), and I'd like to add an updated function which uses your approach, but also adds a couple of different return-format options. Would it be OK if I lift your technique as the engine room for my UDF? I'll credit you accordingly, obviously.

I just think having the UDF in the central place CF people look for UDFs is better than relying on people finding it in a blog post (although you've probably done it for the traffic, so I get why you've done what you have ;-)

Anyway, lemme know what you think, mate.

Cheers for all your hard work!

--
Adam


Jan 28, 2013 at 5:03 AM // reply »
3 Comments

Hello together,

I really can't tell how I enjoy this amazing function.
Now received a CSV file from a customer. The file is Windows ANSI not UTF-8 but ISO-8859-1. Notepad++ shows me a CR and LF at the end of each row. Delimiters are Tabs (chr(9)).
When I try to parse the file I get an empty array.

I tried to fix it and got it running with

<cfset arguments.csv = Replace(arguments.csv, chr(9), ";", "ALL") />
<cfset arguments.csv = ListToArray(arguments.csv, "#chr(13)##chr(10)#") />
<cfset arguments.delimiter = ";" />
<cfset var i = "" />
<cfloop index="i" from="1" to="#ArrayLen(arguments.csv)#">
<cfset arguments.csv[i] = ListToArray(arguments.csv[i], arguments.delimiter) />
</cfloop>
<cfset local.csvData = arguments.csv />

but wonder why Bens function delivers an empty array. I think I missed something.

Thanks again for this still awesome function.

Yours
Bernhard


May 16, 2013 at 5:01 PM // reply »
3 Comments

Your code was the closest thing I've found to obtaining some direction for converting ISO fields to values that CF can translate properly. Thank you for posting!



Post A Comment

Comment Etiquette: Please do not post spam. Please keep the comments on-topic. Please do not post unrelated questions or large chunks of code. And, above all, please be nice to each other - we're trying to have a good conversation here.

Please review the following issues:

Author Name:


Author Email:

Author Website:

Comment:

Supported HTML tags for formatting: <strong>bold</strong>   <em>italic</em>   <code>code</code>







  • Help Wanted - Find Your Next ColdFusion Job
Ben Nadel's Company - Epicenter Consulting Recent Blog Comments
May 23, 2013 at 9:52 PM
Preventing Links In Standalone iPhone Applications From Opening In Mobile Safari
@Muhmmadibn Did you figure out a solution to launching PDFs? I am running into the same issues myself. There is no way to close the PDF or go back once you launch it. Thanks in advance! ... read »
May 23, 2013 at 6:06 PM
The Girl Who Broke My Heart, And Made Me A Better Person
Good day,ladies and gentle men, my name is Dr AMADI the great spell caster in Africa, i have help so many people for different kind of problems,who say there is no solution to problems on earth, that ... read »
May 23, 2013 at 4:26 PM
ColdFusion QueryAppend( qOne, qTwo )
@Heather, Glad people are still getting value out of this! ... read »
May 23, 2013 at 3:49 PM
Strange Interaction Between DeserializeJson(), ArrayContains(), And Database Values In ColdFusion
@WebManWalking, I meant the code at the bottom (not the video). I did try to experiment with an intermediary variable, like: value = users.id[ i ]; arrayContains( userIDs, value ); ... but t ... read »
May 23, 2013 at 11:06 AM
Strange Interaction Between DeserializeJson(), ArrayContains(), And Database Values In ColdFusion
@Ben, Are you talking about As Number: YES As String: YES As Java: YES? If so, that's with 3 different ways of referencing the constant 1, not users.id[1]. Query object references(*) are what seem ... read »
May 23, 2013 at 9:55 AM
Strange Interaction Between DeserializeJson(), ArrayContains(), And Database Values In ColdFusion
@Dan, According to the CF Admin, I'm running Java "1.6.0_45". As far as the DB column, in the database it's an INT. I'll see if I can dig into what CF sees it as. @WebManWalking, But h ... read »
May 23, 2013 at 9:49 AM
Strange Interaction Between DeserializeJson(), ArrayContains(), And Database Values In ColdFusion
@Ben, I think the problem is that we're used to loose typing in ColdFusion, like JavaScript. If a value is a number but it's needed in an expression to be a string, noooo problem. I've encountered ... read »
May 23, 2013 at 9:47 AM
ColdFusion QueryAppend( qOne, qTwo )
You rock! Thank you, thank you, thank you!!! ... read »
InVision App - Prototyping Made Beautiful With Prototyping Tools