Ben Nadel
On User Experience (UX) Design, JavaScript, ColdFusion, Node.js, Life, and Love.
I am the chief technical officer at InVision App, Inc - a prototyping and collaboration platform for designers, built by designers. I also rock out in JavaScript and ColdFusion 24x7.
Meanwhile on Twitter
Loading latest tweet...
Ben Nadel at Scotch On The Rocks (SOTR) 2011 (Edinburgh) with:

How To Move XML Data Into A Database Using ColdFusion

By Ben Nadel on
Tags: ColdFusion

This blog post was inspired by an "Ask Ben" question that never made it to the blog. A reader had an XML document and understood how to read in XML files and how to run queries using CFQuery, but wasn't sure quite how to connect the two actions together. As with many problems, there are large number of solutions available. Some of them don't even involve ColdFusion (such as importing the XML document directly into the database). Even within ColdFusion, there are a number of ways to navigate and retrieve data from an XML document. As such, this is a question that requires a somewhat robust answer.

In the following demo, I am taking a manually constructed XML object and inserting it into the database using the three primary methods of XML interaction:

  1. XML searching using xmlSearch() and XPATH.
  2. XML traversal using native XML elements.
  3. XML traversal using ColdFusion's pseudo named-node collections.

The examples below might seem a bit odd because I am purposely sticking to a single method per example. In reality, you're probably going to get the best code by using a combination of the following approaches. That said, let's take a look at the code:

  • <!---
  • Create an XML document. This could be something that
  • is uploaded, read from a file, or created manually
  • (as we are doing).
  • --->
  • <cfxml variable="data">
  •  
  • <records>
  •  
  • <record>
  • <name>Tricia</name>
  • <hair>Brunette</hair>
  • </record>
  •  
  • <record>
  • <name>Joanna</name>
  • <hair>Brunette</hair>
  • </record>
  •  
  • <record>
  • <name>Eva</name>
  • <hair>Blonde</hair>
  • </record>
  •  
  • </records>
  •  
  • </cfxml>
  •  
  •  
  • <!---
  • When gathering data from an XML document, there are many ways
  • to do it. You can loop over them via ColdFusion's pseudo node
  • collections or, access the children using XML notation, or
  • search for them (or use a combination of the above). Since the
  • avilability of these methods depends in part on what version
  • of ColdFusion you have, let's look at each methodology.
  • --->
  •  
  •  
  • <!--- ----------------------------------------------------- --->
  • <!--- ----------------------------------------------------- --->
  •  
  •  
  • <!---
  • Method 1: XmlSearch().
  • In this method, I am going to search for the "record" nodes.
  • This will return an array of Record nodes.
  • --->
  •  
  • <!--- Get record nodes. --->
  • <cfset recordNodes = xmlSearch( data, "/records/record" ) />
  •  
  • <!---
  • Loop over the nodes so that we can insert each one into
  • the database.
  • --->
  • <cfloop
  • index="recordNode"
  • array="#recordNodes#">
  •  
  • <!---
  • Because XML notation is somewhat wordy, I sometimes like
  • to create a short-hand to the target value before the
  • query so that I can keep the query very readible.
  •  
  • Because we are using XmlSearch() in this demo, I'm going
  • to use it here to gather the text values from the nodes.
  • This isn't the "right" approach, but I'll use it here to
  • show all concepts.
  • --->
  • <cfset valueNodes = xmlSearch( recordNode, "./*/text()" ) />
  •  
  • <!---
  • Because xmlSearch() returns the nodes in the same order
  • that they appear in the XML document (which we assume is
  • uniform), we can get the values based on their order.
  • --->
  • <cfset girlName = valueNodes[ 1 ].xmlValue />
  •  
  • <!---
  • You can also use xmlSeach() to search by the value node
  • by its name. I will get the hair value in this way.
  •  
  • Because xmlSearch() returns an array, we have to store an
  • intermediary collection value.
  • --->
  • <cfset hairNodes = xmlSearch( recordNode, "./hair/text()" ) />
  • <cfset hairColor = hairNodes[ 1 ].xmlValue />
  •  
  • <!--- Insert the values into the database. --->
  • <cfquery name="insert" datasource="ben">
  • INSERT INTO xml_girl
  • (
  • name,
  • hair
  • ) VALUES (
  • <cfqueryparam value="#girlName#" cfsqltype="cf_sql_varchar" />,
  • <cfqueryparam value="#hairColor#" cfsqltype="cf_sql_varchar" />
  • );
  • </cfquery>
  •  
  • </cfloop>
  •  
  •  
  • <!--- ----------------------------------------------------- --->
  • <!--- ----------------------------------------------------- --->
  •  
  •  
  • <!---
  • Method 2: Native XML DOM.
  • In this method, I am going to walk over the XML DOM (Document
  • Object Model) using native XML navigation (specifically things
  • like XmlChildren and XmlText).
  • --->
  •  
  • <!---
  • We know the structure of the XML document so we know that the
  • record nodes are the direct children of the root node. There-
  • fore, we can loop over the first set of children.
  • --->
  • <cfloop
  • index="recordNode"
  • array="#data.xmlRoot.xmlChildren#">
  •  
  • <!---
  • Just as before, we are going to be getting short hand
  • references to your XML values. However, this time, rather
  • than using xmlSearch(), we are simply going to refer to
  • the child nodes by the xmlChildren collection and the
  • appropriate index.
  • --->
  • <cfset girlName = recordNode.xmlChildren[ 1 ].xmlText />
  • <cfset hairColor = recordNode.xmlChildren[ 2 ].xmlText />
  •  
  • <!--- Insert the values into the database. --->
  • <cfquery name="insert" datasource="ben">
  • INSERT INTO xml_girl
  • (
  • name,
  • hair
  • ) VALUES (
  • <cfqueryparam value="#girlName#" cfsqltype="cf_sql_varchar" />,
  • <cfqueryparam value="#hairColor#" cfsqltype="cf_sql_varchar" />
  • );
  • </cfquery>
  •  
  • </cfloop>
  •  
  •  
  • <!--- ----------------------------------------------------- --->
  • <!--- ----------------------------------------------------- --->
  •  
  •  
  • <!---
  • Method 3: ColdFusion Pseudo XML Collections.
  • As a convenience, ColdFusion provides pseudo collections of
  • XML nodes such that they can be referenced by name. In this
  • method, we are going to walk over these collections.
  • --->
  •  
  • <!---
  • We know the structure of the XML document so we know that
  • the record nodes are the direct children of the records node.
  • Therefore, we can use the named path to find the target
  • record nodes.
  • --->
  • <cfloop
  • index="recordIndex"
  • from="1"
  • to="#arrayLen( data.records.record )#"
  • step="1">
  •  
  • <!---
  • Since we can't loop over these ColdFusion psuedo XML
  • node collection using an array-loop, we have to use an
  • index loop. As such, I generally like to get a short-
  • hand reference to the node at the current index.
  • --->
  • <cfset recordNode = data.records.record[ recordIndex ] />
  •  
  • <!---
  • Now that we have the current record node, I am going to
  • grab the value of the given value nodes.
  •  
  • NOTE: Even when working with the pseudo collections, we
  • STILL HAVE TO use the xmlText attribute (otherwise
  • ColdFusion will convert the node to a string, giving it
  • a doctype).
  •  
  • NOTE: We can exclude the index of the value node (ex.
  • recordNode.name[ 1 ]) because we only want the first one
  • which is what not include the index defaults to. You can
  • use the [ 1 ], if you want, as I have done with hair.
  • --->
  • <cfset girlName = recordNode.name.xmlText />
  • <cfset hairColor = recordNode.hair[ 1 ].xmlText />
  •  
  • <!--- Insert the values into the database. --->
  • <cfquery name="insert" datasource="ben">
  • INSERT INTO xml_girl
  • (
  • name,
  • hair
  • ) VALUES (
  • <cfqueryparam value="#girlName#" cfsqltype="cf_sql_varchar" />,
  • <cfqueryparam value="#hairColor#" cfsqltype="cf_sql_varchar" />
  • );
  • </cfquery>
  •  
  • </cfloop>

I'm not really going to explain what's going on above because the code has been extensively commented. But please, feel free to ask me any questions in the comments below.

When we run the example above, we get the following database table configuration:

 
 
 
 
 
 
Moving XML Data From An XML File Into A Database Using ColdFusion. 
 
 
 

As you can see, each of the three different XML traversal and data retrieval methods successfully pulled the data out of the XML document and inserted it into the database. All of the examples assume that the column data is coming out of XML text values. I have found this to be the majority of use cases; however, sometimes part or all of the record data is delivered as attributes on an XML node. In those cases, you simply have to route your requests through the XmlAttributes collection:

  • <!--- Get the name attribute value. --->
  • <cfset girlName = recordNode.xmlAttributes.name />

This is not an exhaustive explanation of how you can move data from XML files into a database using ColdFusion - this is just meant to demonstrate the breadth of possibilities. And, like I said above, rarely will you ever stick exclusively to one method. The best, most readable code is generally created from the combination of these methods.




Reader Comments

I wish there were more girl names involved when I'm importing data into my databases.

It's neat to see these differing methods segregated this way. You're quite right that "In reality, you're probably going to get the best code by using a combination of the following approaches."

@Josh

Yes, anything is possible. That even falls within "reasonable". Though I don't know how practical.

You can pull the names of the final XML leaves, and write a create table from them. Though I'd want to have very well structures XML, and a well defined structure before attempting that.

In fact I think the CF toll Datamgr does that:
http://bryantwebconsulting.com/docs/datamgr/what-is-datamgr.cfm
(It has it's own flavor of XML to define a db, but it can use it to create a table, though I don't know if it imports data)

@Josh,

Absolutely. All you would have to do is add a CREATE TABLE query before you start inserting records. I generally use SQL manager (Navicat for MySQL) to do this, so the Syntax is not immediately available in my head.

Most of the xml navigation technique examples I have seen assume 'uniform xml structure'.

I would like to see examples where xml is assymetrical (eg, a UPS xml docs, etc.). In other words, derive accurate table data from an xml doc where the same parent node comprises different sets of child nodes in different parts of the xml doc.

Bill

Great post, i have followed these steps and its working, but the data is being imported with the xml declarations around the data. Is there an easy way to remove it? i.e.

<?xml version="1.0" encoding="UTF-8"?>
<StoreName>Data being imported</StoreName>.

I only need the "Data being imported" section. I have seen that you can use REReplace with Regex to remove the information but im importing quite alot of data and will take me a long time to write out all the REReplace statements one by one. Thanks.

@Chris,

Honestly, I don't know how to do it without generating a doc type at the top. I have only ever been able to remove that with RegEx. But, it's not as bad as you might think:

reReplace( data, "^<\?[^>]+>", "", "one" )

This will work with all the XML data values, so it's not like you need to custom create them for each batch of XML.

Using you examples I can get the teamnames into a database. But pulling my hair out trying to get the parent cat id and position so IE; I need

GrandChampion 1 Qua
ReserveChampion 1 Swamp Water Boys
Overall 1 Qua
Overall 2 Swamp Water Boys
Overall 3 Jack Old South.

<root>
<cat id="GrandChampion">
<teamname>Quau</teamname>
</cat>
<cat id="ReserveChampion">
<teamname>Swamp Water Boys</teamname>
</cat>
<cat id="Overall">
<teamname>Quau</teamname>
<teamname>Swamp Water Boys</teamname>
<teamname>Jacks Old South</teamname>
<teamname>Swamp Boys BBQ</teamname>
<teamname>Parrothead Smokers</teamname>
<teamname>Munchees Smokehouse</teamname>
<teamname>Wicked Que</teamname>
<teamname>B S Pitmeisters</teamname>
</cat>
</root>

Any help is greatly appreciated. I can't spare the hair.

@Michael,

To get attribute in general, you need to use the XmlAttributes collection. So, for example, to get the first CAT ID, you could use:

xmlVar.root.cat[ 1 ].xmlAttributes.id

That said, what you probably want to do is loop over the CAT nodes; then, as you loop over them, you'll want an inner-loop that loops over the teamates. If you are not sure what I mean, I can help you out.

Hi Ben, great post and exactly what I was looking for.

But I have a question that steps a little beyond this tutorial: What if I want to schedule an XML feed in to my database table nightly? Do I drop the existing table from the previous night and create a new one, or is there a way to append to the existing DB table?

My XML is very small, so I am guessing that dropping and then creating a new is the right answer.

@Dave,

If you want to remove all the data from the data table, you should be able to use the TRUNCATE command. This essentially deletes all the records and resets the auto-incrementers. It's basically like dropping and recreating, but with a bit less work.

@Ben, thanks. About an hour after I wrote that question I found my answer and my code is working great thanks to you. Also it is nice to have conformation from you that I did it right!

Thanks for this article! It totally helped me figure out how to deal with a USPS postal rate API return xml which was giving me grief today.

Once again, you da man! :)

Hi Ben
Im trying to insert big xml file data from wholesale server directly to my webshop database
Which way is best for this operation and how to do it? thanx in advance
example of xml file (little part of xml):
http://c-bit.hr/Laptopi.xml

Ben, I've taken your Method Three above and combined it with CFPOP to try and save (JULIE locating emailed XML attachments) to a MySQL database table.

- The code works great when run, and the "static" XML data file is saved as a new record in my table.

- But how can I alter the code to work with dynamic XML file name(s)

- Currently I have hard-coded the <cffile action="read" file="C:\locates\xmlLocateAttachments\JULIEA_JULIEV_20120213133643312.xml" because I'm not sure how to tell it to just look at the folder, rather than specific filenames!

- I need to make this a dynamic process, where the CFPOP code/section/page runs every 30 min, and if a new email has arrived, then save the XML attachment(s) to the folder, and finally have your Method Three code run that copies each XML attachment record to the database table?

ps: There is always just one attachment, always in XML, with identical fields. I plan to have CFPOP delete the email once it has been processed..

Thanks, jlig

Here is my code: http://cerberus.clearwave.com/jerry/Untitled-2.pdf

I've been struggling to translate the code above to something I can use to parse an xml file created by Adobe Captivate and insert the data in an access database. I can make the example above work, but I don't understand the structure well. Can any of you help? Currently the core is the main field I need to pull right now. My XML doc looks like this:

  • http://ies-pub.southeasterntech.edu/data.xml

Thanks for any help!

Ben, in your example, your XML doc looks like this:

<records>
<record>
<name>Tricia</name>
<hair>Brunette</hair>
</record>
</records>

My XML doc looks like this format:

<records>
<record>
<name value="Tricia"/>
<hair value="Brunette"/>
</record>
</records>

How can I work around this? Your examples won't read my file!

Found my solution! Here's what it looks like, reading an xml file and inserting the values into an access database. BTW, this is the resultdata of an Adobe Captivate Quiz.

<!--- Convert file to XML document object --->
<cffile action="read" file="C:\Inetpub_Public\trainlocal\Orientation\test.xml" variable="myxml">
<cfset mydoc = XmlParse(myxml)>

<!--- Begin Extract Course Data --->
<!--- Begin Extract Course Data --->
<!--- Begin Extract Course Data --->

<cfset CompanyName=#mydoc.Course.CompanyName.XmlAttributes.Value#>
<cfset CourseName=#mydoc.Course.CourseName.XmlAttributes.Value#>
<cfset LearnerName=#mydoc.Course.LearnerName.XmlAttributes.Value#>
<cfset LearnerID=#mydoc.Course.LearnerID.XmlAttributes.Value#>
<cfset LessonName=#mydoc.Course.LessonName.XmlAttributes.Value#>
<cfset QuizAttempts=#mydoc.Course.QuizAttempts.XmlAttributes.Value#>
<cfset TotalQuestions=#mydoc.Course.TotalQuestions.XmlAttributes.Value#>
<cfset DepartmentName=#mydoc.Course.DepartmentName.XmlAttributes.Value#>

<!--- End Extract CourseData --->
<!--- End Extract CourseData --->
<!--- End Extract CourseData --->

<cfquery datasource="myQuiz" name="Import_Course_Data">
insert into Course
(
CompanyName,
DepartmentName,
CourseName,
LearnerName,
LearnerID,
LessonName,
QuizAttempts,
TotalQuestions
)
VALUES
(
'#CompanyName#',
'#DepartmentName#',
'#CourseName#',
'#LearnerName#',
'#LearnerID#',
'#LessonName#',
#QuizAttempts#,
#TotalQuestions#
);
</cfquery>

Woo hoo!