Parsing CSV Values In ColdFusion While Handling Embedded Qualifiers And Delimiters

By Ben Nadel

Published 2007-01-30 in ColdFusion — Comments (10)

This is another attempt to parse CSV (comma separated values) values according to the CSV standard file format. What makes parsing CSV data so difficult is that the field and record delimiters may or may not be delimiters. If either delimiter is within a qualified field, then that character is just a character literal and NOT a functional delimiter. Along the same lines, qualifiers can be escaped and used as character literals within a qualified field.

This all makes for quite a headache. This ColdFusion user defined function (UDF), CSVToArray(), can handle all of that. It can handle embedded field (ex. comma) and record (ex. new line) delimiters as both delimiters and embedded characters. It does this be splitting the entire string on both delimiters and then looping over the tokens. This method has the advantage of not having to check individual characters, which should be faster, but it does have the downside of also having to keep track of the individual delimiters which can be slower.

Keeping track of the delimiters in their own array was a new approach for me. I basically stripped out all the non-delimiter characters from the original string and then got the character array of that string. This put each delimiter into its own array index. This was tremendously helpful because after each token value retrieval, I could easily check to see which delimiter came just before or just after it. While this adds overhead, without this information, I would NEVER have been able to gracefully handle embedded record delimiters.

Ok, so here it the ColdFusion UDF, CSVToArray(), in typical Kinky Solutions "robust commenting for teaching purposes" style:

  
          <cffunction
        
          	name="CSVToArray"
        
          	access="public"
        
          	returntype="array"
        
          	output="false"
        
          	hint="Converts the given CSV string to an array of arrays.">
        
          	<!--- Define arguments. --->
        
          	<cfargument
        
          		name="CSV"
        
          		type="string"
        
          		required="true"
        
          		hint="This is the CSV string that will be manipulated."
        
          		/>
        
          	<cfargument
        
          		name="Delimiter"
        
          		type="string"
        
          		required="false"
        
          		default=","
        
          		hint="This is the delimiter that will separate the fields within the CSV value."
        
          		/>
        
          	<cfargument
        
          		name="Qualifier"
        
          		type="string"
        
          		required="false"
        
          		default=""""
        
          		hint="This is the qualifier that will wrap around fields that have special characters embeded."
        
          		/>
        
          	<!--- Define the local scope. --->
        
          	<cfset var LOCAL = StructNew() />
        
          	<!---
        
          		When accepting delimiters, we only want to use the first
        
          		character that we were passed. This is different than
        
          		standard ColdFusion, but I am trying to make this as
        
          		easy as possible.
        
          	--->
        
          	<cfset ARGUMENTS.Delimiter = Left( ARGUMENTS.Delimiter, 1 ) />
        
          	<!---
        
          		When accepting the qualifier, we only want to accept the
        
          		first character returned. Is is possible that there is
        
          		no qualifier being used. In that case, we can just store
        
          		the empty string (leave as-is).
        
          	--->
        
          	<cfif Len( ARGUMENTS.Qualifier )>
        
          		<cfset ARGUMENTS.Qualifier = Left( ARGUMENTS.Qualifier, 1 ) />
        
          	</cfif>
        
          	<!---
        
          		Set a variable to handle the new line. This will be the
        
          		character that acts as the record delimiter.
        
          	--->
        
          	<cfset LOCAL.LineDelimiter = Chr( 10 ) />
        
          	<!---
        
          		We want to standardize the line breaks in our CSV value.
        
          		A "line break" might be a return followed by a feed or
        
          		just a line feed. We want to standardize it so that it
        
          		is just a line feed. That way, it is easy to check
        
          		for later (and it is a single character which makes our
        
          		life 1000 times nicer).
        
          	--->
        
          	<cfset ARGUMENTS.CSV = ARGUMENTS.CSV.ReplaceAll(
        
          		"\r?\n",
        
          		LOCAL.LineDelimiter
        
          		) />
        
          	<!---
        
          		Let's get an array of delimiters. We will need this when
        
          		we are going throuth the tokens and building up field
        
          		values. To do this, we are going to strip out all
        
          		characters that are NOT delimiters and then get the
        
          		character array of the string. This should put each
        
          		delimiter at it's own index.
        
          	--->
        
          	<cfset LOCAL.Delimiters = ARGUMENTS.CSV.ReplaceAll(
        
          		"[^\#ARGUMENTS.Delimiter#\#LOCAL.LineDelimiter#]+",
        
          		""
        
          		)
        
          		<!---
        
          			Get character array of delimiters. This will put
        
          			each found delimiter in its own index (that should
        
          			correspond to the tokens).
        
          		--->
        
          		.ToCharArray()
        
          		/>
        
          	<!---
        
          		Add a blank space to the beginning of every theoretical
        
          		field. This will help in make sure that ColdFusion /
        
          		Java does not skip over any fields simply because they
        
          		do not have a value. We just have to be sure to strip
        
          		out this space later on.
        
          		First, add a space to the beginning of the string.
        
          	--->
        
          	<cfset ARGUMENTS.CSV = (" " & ARGUMENTS.CSV) />
        
          	<!--- Now add the space to each field. --->
        
          	<cfset ARGUMENTS.CSV = ARGUMENTS.CSV.ReplaceAll(
        
          		"([\#ARGUMENTS.Delimiter#\#LOCAL.LineDelimiter#]{1})",
        
          		"$1 "
        
          		) />
        
          	<!---
        
          		Break the CSV value up into raw tokens. Going forward,
        
          		some of these tokens may be merged, but doing it this
        
          		way will help us iterate over them. When splitting the
        
          		string, add a space to each token first to ensure that
        
          		the split works properly.
        
          		BE CAREFUL! Splitting a string into an array using the
        
          		Java String::Split method does not create a COLDFUSION
        
          		ARRAY. You cannot alter this array once it has been
        
          		created. It can merely be referenced (read only).
        
          		We are splitting the CSV value based on the BOTH the
        
          		field delimiter and the line delimiter. We will handle
        
          		this later as we build values (this is why we created
        
          		the array of delimiters above).
        
          	--->
        
          	<cfset LOCAL.Tokens = ARGUMENTS.CSV.Split(
        
          		"[\#ARGUMENTS.Delimiter#\#LOCAL.LineDelimiter#]{1}"
        
          		) />
        
          	<!---
        
          		Set up the default return array. This will be a full
        
          		array of arrays, but for now, just create the parent
        
          		array with no indexes.
        
          	--->
        
          	<cfset LOCAL.Return = ArrayNew( 1 ) />
        
          	<!---
        
          		Create a new active row. Even if we don't end up adding
        
          		any values to this row, it is going to make our lives
        
          		more smiple to have it in existence.
        
          	--->
        
          	<cfset ArrayAppend(
        
          		LOCAL.Return,
        
          		ArrayNew( 1 )
        
          		) />
        
          	<!---
        
          		Set up the row index. THis is the row to which we are
        
          		actively adding value.
        
          	--->
        
          	<cfset LOCAL.RowIndex = 1 />
        
          	<!---
        
          		Set the default flag for wether or not we are in the
        
          		middle of building a value across raw tokens.
        
          	--->
        
          	<cfset LOCAL.IsInValue = false />
        
          	<!---
        
          		Loop over the raw tokens to start building values. We
        
          		have no sense of any row delimiters yet. Those will
        
          		have to be checked for as we are building up each value.
        
          	--->
        
          	<cfloop
        
          		index="LOCAL.TokenIndex"
        
          		from="1"
        
          		to="#ArrayLen( LOCAL.Tokens )#"
        
          		step="1">
        
          		<!---
        
          			Get the current field index. This is the current
        
          			index of the array to which we might be appending
        
          			values (for a multi-token value).
        
          		--->
        
          		<cfset LOCAL.FieldIndex = ArrayLen(
        
          			LOCAL.Return[ LOCAL.RowIndex ]
        
          			) />
        
          		<!---
        
          			Get the next token. Trim off the first character
        
          			which is the empty string that we added to ensure
        
          			proper splitting.
        
          		--->
        
          		<cfset LOCAL.Token = LOCAL.Tokens[ LOCAL.TokenIndex ].ReplaceFirst(
        
          			"^.{1}",
        
          			""
        
          			) />
        
          		<!---
        
          			Check to see if we have a field qualifier. If we do,
        
          			then we might have to build the value across
        
          			multiple fields. If we do not, then the raw tokens
        
          			should line up perfectly with the real tokens.
        
          		--->
        
          		<cfif Len( ARGUMENTS.Qualifier )>
        
          			<!---
        
          				Check to see if we are currently building a
        
          				field value that has been split up among
        
          				different delimiters.
        
          			--->
        
          			<cfif LOCAL.IsInValue>
        
          				<!---
        
          					ASSERT: Since we are in the middle of
        
          					building up a value across tokens, we can
        
          					assume that our parent FOR loop has already
        
          					executed at least once. Therefore, we can
        
          					assume that we have a previous token value
        
          					ALREADY in the row value array and that we
        
          					have access to a previous delimiter (in
        
          					our delimiter array).
        
          				--->
        
          				<!---
        
          					Since we are in the middle of building a
        
          					value, we replace out double qualifiers with
        
          					a constant. We don't care about the first
        
          					qualifier as it can ONLY be an escaped
        
          					qualifier (not a field qualifier).
        
          				--->
        
          				<cfset LOCAL.Token = LOCAL.Token.ReplaceAll(
        
          					"\#ARGUMENTS.Qualifier#{2}",
        
          					"{QUALIFIER}"
        
          					) />
        
          				<!---
        
          					Add the token to the value we are building.
        
          					While this is not easy to read, add it
        
          					directly to the results array as this will
        
          					allow us to forget about it later. Be sure
        
          					to add the PREVIOUS delimiter since it is
        
          					actually an embedded delimiter character
        
          					(part of the whole field value).
        
          				--->
        
          				<cfset LOCAL.Return[ LOCAL.RowIndex ][ LOCAL.FieldIndex ] = (
        
          					LOCAL.Return[ LOCAL.RowIndex ][ LOCAL.FieldIndex ] &
        
          					LOCAL.Delimiters[ LOCAL.TokenIndex - 1 ] &
        
          					LOCAL.Token
        
          					) />
        
          				<!---
        
          					Now that we have removed the possibly
        
          					escaped qualifiers, let's check to see if
        
          					this field is ending a multi-token
        
          					qualified value (its last character is a
        
          					field qualifier).
        
          				--->
        
          				<cfif (Right( LOCAL.Token, 1 ) EQ ARGUMENTS.Qualifier)>
        
          					<!---
        
          						Wooohoo! We have reached the end of a
        
          						qualified value. We can complete this
        
          						value and move onto the next field.
        
          						Remove the trailing quote.
        
          						Remember, we have already added to token
        
          						to the results array so we must now
        
          						manipulate the results array directly.
        
          						Any changes made to LOCAL.Token at this
        
          						point will not affect the results.
        
          					--->
        
          					<cfset LOCAL.Return[ LOCAL.RowIndex ][ LOCAL.FieldIndex ] = LOCAL.Return[ LOCAL.RowIndex ][ LOCAL.FieldIndex ].ReplaceFirst( ".{1}$", "" ) />
        
          					<!---
        
          						Set the flag to indicate that we are no
        
          						longer building a field value across
        
          						tokens.
        
          					--->
        
          					<cfset LOCAL.IsInValue = false />
        
          				</cfif>
        
          			<cfelse>
        
          				<!---
        
          					We are NOT in the middle of building a field
        
          					value which means that we have to be careful
        
          					of a few special token cases:
        
          					1. The field is qualified on both ends.
        
          					2. The field is qualified on the start end.
        
          				--->
        
          				<!---
        
          					Check to see if the beginning of the field
        
          					is qualified. If that is the case then either
        
          					this field is starting a multi-token value OR
        
          					this field has a completely qualified value.
        
          				--->
        
          				<cfif (Left( LOCAL.Token, 1 ) EQ ARGUMENTS.Qualifier)>
        
          					<!---
        
          						Delete the first character of the token.
        
          						This is the field qualifier and we do
        
          						NOT want to include it in the final value.
        
          					--->
        
          					<cfset LOCAL.Token = LOCAL.Token.ReplaceFirst(
        
          						"^.{1}",
        
          						""
        
          						) />
        
          					<!---
        
          						Remove all double qualifiers so that we
        
          						can test to see if the field has a
        
          						closing qualifier.
        
          					--->
        
          					<cfset LOCAL.Token = LOCAL.Token.ReplaceAll(
        
          						"\#ARGUMENTS.Qualifier#{2}",
        
          						"{QUALIFIER}"
        
          						) />
        
          					<!---
        
          						Check to see if this field is a
        
          						self-closer. If the first character is a
        
          						qualifier (already established) and the
        
          						last character is also a qualifier (what
        
          						we are about to test for), then this
        
          						token is a fully qualified value.
        
          					--->
        
          					<cfif (Right( LOCAL.Token, 1 ) EQ ARGUMENTS.Qualifier)>
        
          						<!---
        
          							This token is fully qualified.
        
          							Remove the end field qualifier and
        
          							append it to the row data.
        
          						--->
        
          						<cfset ArrayAppend(
        
          							LOCAL.Return[ LOCAL.RowIndex ],
        
          							LOCAL.Token.ReplaceFirst(
        
          								".{1}$",
        
          								""
        
          								)
        
          							) />
        
          					<cfelse>
        
          						<!---
        
          							This token is not fully qualified
        
          							(but the first character was a
        
          							qualifier). We are buildling a value
        
          							up across differen tokens. Set the
        
          							flag for building the value.
        
          						--->
        
          						<cfset LOCAL.IsInValue = true />
        
          						<!--- Add this token to the row. --->
        
          						<cfset ArrayAppend(
        
          							LOCAL.Return[ LOCAL.RowIndex ],
        
          							LOCAL.Token
        
          							) />
        
          					</cfif>
        
          				<cfelse>
        
          					<!---
        
          						We are not dealing with a qualified
        
          						field (even though we are using field
        
          						qualifiers). Just add this token value
        
          						as the next value in the row.
        
          					--->
        
          					<cfset ArrayAppend(
        
          						LOCAL.Return[ LOCAL.RowIndex ],
        
          						LOCAL.Token
        
          						) />
        
          				</cfif>
        
          			</cfif>
        
          			<!---
        
          				As a sort of catch-all, let's remove that
        
          				{QUALIFIER} constant that we may have thrown
        
          				into a field value. Do NOT use the FieldIndex
        
          				value as this might be a corrupt value at
        
          				this point in the token iteration.
        
          			--->
        
          			<cfset LOCAL.Return[ LOCAL.RowIndex ][ ArrayLen( LOCAL.Return[ LOCAL.RowIndex ] ) ] = Replace(
        
          				LOCAL.Return[ LOCAL.RowIndex ][ ArrayLen( LOCAL.Return[ LOCAL.RowIndex ] ) ],
        
          				"{QUALIFIER}",
        
          				ARGUMENTS.Qualifier,
        
          				"ALL"
        
          				) />
        
          		<cfelse>
        
          			<!---
        
          				Since we don't have a qualifier, just use the
        
          				current raw token as the actual value. We are
        
          				NOT going to have to worry about building values
        
          				across tokens.
        
          			--->
        
          			<cfset ArrayAppend(
        
          				LOCAL.Return[ LOCAL.RowIndex ],
        
          				LOCAL.Token
        
          				) />
        
          		</cfif>
        
          		<!---
        
          			Check to see if we have a next delimiter and if we
        
          			do, is it going to start a new row? Be cautious that
        
          			we are NOT in the middle of building a value. If we
        
          			are building a value then the line delimiter is an
        
          			embedded value and should not percipitate a new row.
        
          		--->
        
          		<cfif (
        
          			(NOT LOCAL.IsInValue) AND
        
          			(LOCAL.TokenIndex LT ArrayLen( LOCAL.Tokens )) AND
        
          			(LOCAL.Delimiters[ LOCAL.TokenIndex ] EQ LOCAL.LineDelimiter)
        
          			)>
        
          			<!---
        
          				The next token is indicating that we are about
        
          				start a new row. Add a new array to the parent
        
          				and increment the row counter.
        
          			--->
        
          			<cfset ArrayAppend(
        
          				LOCAL.Return,
        
          				ArrayNew( 1 )
        
          				) />
        
          			<!--- Increment row index to point to next row. --->
        
          			<cfset LOCAL.RowIndex = (LOCAL.RowIndex + 1) />
        
          		</cfif>
        
          	</cfloop>
        
          	<!--- Return the resultant array of arrays. --->
        
          	<cfreturn LOCAL.Return />
        
          </cffunction>

view raw code-1.cfm hosted with ❤ by GitHub

To demonstrate that this works with both simple values and with embedded delimiters (both field and record), let's build a CSV value that has it all:

  
          <!---
        
          	Build the CSV string value. This value should contain
        
          	both functional and character-literal delimiters.
        
          --->
        
          <cfsavecontent variable="strCSV">
        
          Name,Nickname,Best Asset
        
          Sarah,"""Stubbs""",Butt
        
          Ashley,"Value with
        
          embedded line break",Smile
        
          Heather
        
          "Kat",",",
        
          </cfsavecontent>

view raw code-2.cfm hosted with ❤ by GitHub

Things to notice in the above:

Not all fields need to be qualified.
Stubbs has embedded field qualifiers (quotes).
The third record has an embedded record delimiter (line break).
The fourth record only has one value.
The fifth record has an embedded field delimiter (comma).

Now, let's take that CSV and pass it to the ColdFusion UDF, CSVToArray():

  
          <!---
        
          	Send the CSV value to the UDF. Be sure to trim the value
        
          	(leading and trailing spaces). We are going to leave the
        
          	default delimiter (,) and qualifier (").
        
          --->
        
          <cfset arrResult = CSVToArray(
        
          	CSV = strCSV.Trim()
        
          	) />
        
          <!--- Dump out the resultant array. --->
        
          <cfdump
        
          	var="#arrResult#"
        
          	label="CSV Results Array"
        
          	/>

view raw code-3.cfm hosted with ❤ by GitHub

This gives us the following CFDump output:

Parse CSV Values With Embedded Delimiters And Qualifiers

Notice that all the embedded values where properly handled! How sweet-ass-sweet is that? Also notice that the results can handle variable-length records (just like an actual Excel file can). Who says all of our records have to have the same number of values?

I am not sure how well this will scale. Once we get the array of delimiters and break the CSV value up into the raw token array, the time to process should increase linearly with the size of the CSV data. I do know that this has GOT to be more preformant that dealing with individual character values. At the very least, this gives me something I can integrate with my POI Utility ColdFusion component. It can always be optimized later.

The only little "bug" that I will fix at one point is that if there are NO CSV values, the UDF still returns an array consisting of one empty array (which could be considered a single CSV value that is empty). Ideally, this should return an empty results array. Oh well (easy fix, but I don't have time to worry about that).

Want to use code from this post? Check out the license.

Short link: https://bennadel.com/498

Reader Comments

Christopher Vigliotti (hibiscusroto) Jul 27, 2007 at 8:55 PM

7 Comments

I used this code a while ago and neglected to thank you...thanks!

Ben Nadel Jul 28, 2007 at 5:33 PM

16,058 Comments

No worries dude. Glad to be able to help.

easterro Aug 8, 2007 at 3:09 PM

1 Comments

Cut, Paste, and it worked perfectly. A solid, COMPREHENSIVE solution to a relatively common task. Thank you!

Ben Nadel Aug 8, 2007 at 3:36 PM

16,058 Comments

@Easterro,

My pleasure brother!

Morgan Sep 26, 2007 at 4:28 PM

1 Comments

Great work!

I really love it when you find stuff like this :-)
I'm a newbie when it comes to programming (started with Actionscript in Flash...) and I really appriciate well documented solutions like this one.

By the way, what is "normal" in terms of parsing-time per row? Say we have 5000 rows with 10 columns each. I test ran the code with something like this and it took around 75s.

//Morgan

Ben Nadel Sep 26, 2007 at 4:32 PM

16,058 Comments

@Morgan,

Glad you like the solution, or at least the documentation. As far as what is "normal" for processing time, I can't really say. I have never really tested this on a large file. Certainly, to deal with the embedded qualifiers and delimiters, there is a lot more parsing going on since it has to has to go through almost character by character.

Maybe someone else can speak to this? My solution might not be the best for large scale operations.

Ben Nadel Sep 27, 2007 at 2:34 PM

16,058 Comments

@Morgan,

I just re-wrote the entire CSV parsing algorithm. It's a fraction of the code and lightening fast. I just parsed 50,000 records in 11 seconds. I just had a Eureka moment working on an unrelated problem. I will try to post a blog entry on it tonight or tomorrow morning and will keep you updated!

Ben Nadel Sep 28, 2007 at 7:26 AM

16,058 Comments

Here is a much faster, shorter algorithm for parsing CSV data:

www.bennadel.com/index.cfm?dax=blog:976.view

Wayne Gregori Feb 19, 2014 at 11:42 PM

1 Comments

Hi Ben,

When I use the function it does not recognize the LF character as a new record... it works fine with CF LF. Can you help?

Thanks,
Wayne

Philip Kaplan Aug 29, 2014 at 4:25 AM

1 Comments

I found Ben's script to be too slow when processing very large CSV files.

I ended up writing a CSV to TSV (tab-separated) script that preserves commas in text-qualified (double-quoted) strings.

It's super fast.

If anyone wants it, it's here:
https://gist.github.com/pud/84f660b321a350024466

Oh my chickens, this post is old!

Hit me up on Twitter if you want to discuss it further.

	<cffunction
	name="CSVToArray"
	access="public"
	returntype="array"
	output="false"
	hint="Converts the given CSV string to an array of arrays.">

	<!--- Define arguments. --->
	<cfargument
	name="CSV"
	type="string"
	required="true"
	hint="This is the CSV string that will be manipulated."
	/>

	<cfargument
	name="Delimiter"
	type="string"
	required="false"
	default=","
	hint="This is the delimiter that will separate the fields within the CSV value."
	/>

	<cfargument
	name="Qualifier"
	type="string"
	required="false"
	default=""""
	hint="This is the qualifier that will wrap around fields that have special characters embeded."
	/>


	<!--- Define the local scope. --->
	<cfset var LOCAL = StructNew() />


	<!---
	When accepting delimiters, we only want to use the first
	character that we were passed. This is different than
	standard ColdFusion, but I am trying to make this as
	easy as possible.
	--->
	<cfset ARGUMENTS.Delimiter = Left( ARGUMENTS.Delimiter, 1 ) />

	<!---
	When accepting the qualifier, we only want to accept the
	first character returned. Is is possible that there is
	no qualifier being used. In that case, we can just store
	the empty string (leave as-is).
	--->
	<cfif Len( ARGUMENTS.Qualifier )>

	<cfset ARGUMENTS.Qualifier = Left( ARGUMENTS.Qualifier, 1 ) />

	</cfif>


	<!---
	Set a variable to handle the new line. This will be the
	character that acts as the record delimiter.
	--->
	<cfset LOCAL.LineDelimiter = Chr( 10 ) />


	<!---
	We want to standardize the line breaks in our CSV value.
	A "line break" might be a return followed by a feed or
	just a line feed. We want to standardize it so that it
	is just a line feed. That way, it is easy to check
	for later (and it is a single character which makes our
	life 1000 times nicer).
	--->
	<cfset ARGUMENTS.CSV = ARGUMENTS.CSV.ReplaceAll(
	"\r?\n",
	LOCAL.LineDelimiter
	) />


	<!---
	Let's get an array of delimiters. We will need this when
	we are going throuth the tokens and building up field
	values. To do this, we are going to strip out all
	characters that are NOT delimiters and then get the
	character array of the string. This should put each
	delimiter at it's own index.
	--->
	<cfset LOCAL.Delimiters = ARGUMENTS.CSV.ReplaceAll(
	"[^\#ARGUMENTS.Delimiter#\#LOCAL.LineDelimiter#]+",
	""
	)

	<!---
	Get character array of delimiters. This will put
	each found delimiter in its own index (that should
	correspond to the tokens).
	--->
	.ToCharArray()
	/>


	<!---
	Add a blank space to the beginning of every theoretical
	field. This will help in make sure that ColdFusion /
	Java does not skip over any fields simply because they
	do not have a value. We just have to be sure to strip
	out this space later on.

	First, add a space to the beginning of the string.
	--->
	<cfset ARGUMENTS.CSV = (" " & ARGUMENTS.CSV) />

	<!--- Now add the space to each field. --->
	<cfset ARGUMENTS.CSV = ARGUMENTS.CSV.ReplaceAll(
	"([\#ARGUMENTS.Delimiter#\#LOCAL.LineDelimiter#]{1})",
	"$1 "
	) />


	<!---
	Break the CSV value up into raw tokens. Going forward,
	some of these tokens may be merged, but doing it this
	way will help us iterate over them. When splitting the
	string, add a space to each token first to ensure that
	the split works properly.

	BE CAREFUL! Splitting a string into an array using the
	Java String::Split method does not create a COLDFUSION
	ARRAY. You cannot alter this array once it has been
	created. It can merely be referenced (read only).

	We are splitting the CSV value based on the BOTH the
	field delimiter and the line delimiter. We will handle
	this later as we build values (this is why we created
	the array of delimiters above).
	--->
	<cfset LOCAL.Tokens = ARGUMENTS.CSV.Split(
	"[\#ARGUMENTS.Delimiter#\#LOCAL.LineDelimiter#]{1}"
	) />


	<!---
	Set up the default return array. This will be a full
	array of arrays, but for now, just create the parent
	array with no indexes.
	--->
	<cfset LOCAL.Return = ArrayNew( 1 ) />

	<!---
	Create a new active row. Even if we don't end up adding
	any values to this row, it is going to make our lives
	more smiple to have it in existence.
	--->
	<cfset ArrayAppend(
	LOCAL.Return,
	ArrayNew( 1 )
	) />

	<!---
	Set up the row index. THis is the row to which we are
	actively adding value.
	--->
	<cfset LOCAL.RowIndex = 1 />


	<!---
	Set the default flag for wether or not we are in the
	middle of building a value across raw tokens.
	--->
	<cfset LOCAL.IsInValue = false />


	<!---
	Loop over the raw tokens to start building values. We
	have no sense of any row delimiters yet. Those will
	have to be checked for as we are building up each value.
	--->
	<cfloop
	index="LOCAL.TokenIndex"
	from="1"
	to="#ArrayLen( LOCAL.Tokens )#"
	step="1">


	<!---
	Get the current field index. This is the current
	index of the array to which we might be appending
	values (for a multi-token value).
	--->
	<cfset LOCAL.FieldIndex = ArrayLen(
	LOCAL.Return[ LOCAL.RowIndex ]
	) />

	<!---
	Get the next token. Trim off the first character
	which is the empty string that we added to ensure
	proper splitting.
	--->
	<cfset LOCAL.Token = LOCAL.Tokens[ LOCAL.TokenIndex ].ReplaceFirst(
	"^.{1}",
	""
	) />


	<!---
	Check to see if we have a field qualifier. If we do,
	then we might have to build the value across
	multiple fields. If we do not, then the raw tokens
	should line up perfectly with the real tokens.
	--->
	<cfif Len( ARGUMENTS.Qualifier )>


	<!---
	Check to see if we are currently building a
	field value that has been split up among
	different delimiters.
	--->
	<cfif LOCAL.IsInValue>


	<!---
	ASSERT: Since we are in the middle of
	building up a value across tokens, we can
	assume that our parent FOR loop has already
	executed at least once. Therefore, we can
	assume that we have a previous token value
	ALREADY in the row value array and that we
	have access to a previous delimiter (in
	our delimiter array).
	--->

	<!---
	Since we are in the middle of building a
	value, we replace out double qualifiers with
	a constant. We don't care about the first
	qualifier as it can ONLY be an escaped
	qualifier (not a field qualifier).
	--->
	<cfset LOCAL.Token = LOCAL.Token.ReplaceAll(
	"\#ARGUMENTS.Qualifier#{2}",
	"{QUALIFIER}"
	) />


	<!---
	Add the token to the value we are building.
	While this is not easy to read, add it
	directly to the results array as this will
	allow us to forget about it later. Be sure
	to add the PREVIOUS delimiter since it is
	actually an embedded delimiter character
	(part of the whole field value).
	--->
	<cfset LOCAL.Return[ LOCAL.RowIndex ][ LOCAL.FieldIndex ] = (
	LOCAL.Return[ LOCAL.RowIndex ][ LOCAL.FieldIndex ] &
	LOCAL.Delimiters[ LOCAL.TokenIndex - 1 ] &
	LOCAL.Token
	) />


	<!---
	Now that we have removed the possibly
	escaped qualifiers, let's check to see if
	this field is ending a multi-token
	qualified value (its last character is a
	field qualifier).
	--->
	<cfif (Right( LOCAL.Token, 1 ) EQ ARGUMENTS.Qualifier)>

	<!---
	Wooohoo! We have reached the end of a
	qualified value. We can complete this
	value and move onto the next field.
	Remove the trailing quote.

	Remember, we have already added to token
	to the results array so we must now
	manipulate the results array directly.
	Any changes made to LOCAL.Token at this
	point will not affect the results.
	--->
	<cfset LOCAL.Return[ LOCAL.RowIndex ][ LOCAL.FieldIndex ] = LOCAL.Return[ LOCAL.RowIndex ][ LOCAL.FieldIndex ].ReplaceFirst( ".{1}$", "" ) />

	<!---
	Set the flag to indicate that we are no
	longer building a field value across
	tokens.
	--->
	<cfset LOCAL.IsInValue = false />

	</cfif>


	<cfelse>


	<!---
	We are NOT in the middle of building a field
	value which means that we have to be careful
	of a few special token cases:

	1. The field is qualified on both ends.
	2. The field is qualified on the start end.
	--->

	<!---
	Check to see if the beginning of the field
	is qualified. If that is the case then either
	this field is starting a multi-token value OR
	this field has a completely qualified value.
	--->
	<cfif (Left( LOCAL.Token, 1 ) EQ ARGUMENTS.Qualifier)>


	<!---
	Delete the first character of the token.
	This is the field qualifier and we do
	NOT want to include it in the final value.
	--->
	<cfset LOCAL.Token = LOCAL.Token.ReplaceFirst(
	"^.{1}",
	""
	) />

	<!---
	Remove all double qualifiers so that we
	can test to see if the field has a
	closing qualifier.
	--->
	<cfset LOCAL.Token = LOCAL.Token.ReplaceAll(
	"\#ARGUMENTS.Qualifier#{2}",
	"{QUALIFIER}"
	) />

	<!---
	Check to see if this field is a
	self-closer. If the first character is a
	qualifier (already established) and the
	last character is also a qualifier (what
	we are about to test for), then this
	token is a fully qualified value.
	--->
	<cfif (Right( LOCAL.Token, 1 ) EQ ARGUMENTS.Qualifier)>

	<!---
	This token is fully qualified.
	Remove the end field qualifier and
	append it to the row data.
	--->
	<cfset ArrayAppend(
	LOCAL.Return[ LOCAL.RowIndex ],
	LOCAL.Token.ReplaceFirst(
	".{1}$",
	""
	)
	) />

	<cfelse>

	<!---
	This token is not fully qualified
	(but the first character was a
	qualifier). We are buildling a value
	up across differen tokens. Set the
	flag for building the value.
	--->
	<cfset LOCAL.IsInValue = true />

	<!--- Add this token to the row. --->
	<cfset ArrayAppend(
	LOCAL.Return[ LOCAL.RowIndex ],
	LOCAL.Token
	) />

	</cfif>


	<cfelse>


	<!---
	We are not dealing with a qualified
	field (even though we are using field
	qualifiers). Just add this token value
	as the next value in the row.
	--->
	<cfset ArrayAppend(
	LOCAL.Return[ LOCAL.RowIndex ],
	LOCAL.Token
	) />


	</cfif>


	</cfif>


	<!---
	As a sort of catch-all, let's remove that
	{QUALIFIER} constant that we may have thrown
	into a field value. Do NOT use the FieldIndex
	value as this might be a corrupt value at
	this point in the token iteration.
	--->
	<cfset LOCAL.Return[ LOCAL.RowIndex ][ ArrayLen( LOCAL.Return[ LOCAL.RowIndex ] ) ] = Replace(
	LOCAL.Return[ LOCAL.RowIndex ][ ArrayLen( LOCAL.Return[ LOCAL.RowIndex ] ) ],
	"{QUALIFIER}",
	ARGUMENTS.Qualifier,
	"ALL"
	) />


	<cfelse>


	<!---
	Since we don't have a qualifier, just use the
	current raw token as the actual value. We are
	NOT going to have to worry about building values
	across tokens.
	--->
	<cfset ArrayAppend(
	LOCAL.Return[ LOCAL.RowIndex ],
	LOCAL.Token
	) />


	</cfif>



	<!---
	Check to see if we have a next delimiter and if we
	do, is it going to start a new row? Be cautious that
	we are NOT in the middle of building a value. If we
	are building a value then the line delimiter is an
	embedded value and should not percipitate a new row.
	--->
	<cfif (
	(NOT LOCAL.IsInValue) AND
	(LOCAL.TokenIndex LT ArrayLen( LOCAL.Tokens )) AND
	(LOCAL.Delimiters[ LOCAL.TokenIndex ] EQ LOCAL.LineDelimiter)
	)>

	<!---
	The next token is indicating that we are about
	start a new row. Add a new array to the parent
	and increment the row counter.
	--->
	<cfset ArrayAppend(
	LOCAL.Return,
	ArrayNew( 1 )
	) />

	<!--- Increment row index to point to next row. --->
	<cfset LOCAL.RowIndex = (LOCAL.RowIndex + 1) />

	</cfif>

	</cfloop>


	<!--- Return the resultant array of arrays. --->
	<cfreturn LOCAL.Return />

	</cffunction>

	<!---
	Build the CSV string value. This value should contain
	both functional and character-literal delimiters.
	--->
	<cfsavecontent variable="strCSV">
	Name,Nickname,Best Asset
	Sarah,"""Stubbs""",Butt
	Ashley,"Value with
	embedded line break",Smile
	Heather
	"Kat",",",
	</cfsavecontent>

	<!---
	Send the CSV value to the UDF. Be sure to trim the value
	(leading and trailing spaces). We are going to leave the
	default delimiter (,) and qualifier (").
	--->
	<cfset arrResult = CSVToArray(
	CSV = strCSV.Trim()
	) />

	<!--- Dump out the resultant array. --->
	<cfdump
	var="#arrResult#"
	label="CSV Results Array"
	/>