How To Move XML Data Into A Database Using ColdFusion

Posted July 9, 2009 at 9:14 PM

Tags: ColdFusion

This blog post was inspired by an "Ask Ben" question that never made it to the blog. A reader had an XML document and understood how to read in XML files and how to run queries using CFQuery, but wasn't sure quite how to connect the two actions together. As with many problems, there are large number of solutions available. Some of them don't even involve ColdFusion (such as importing the XML document directly into the database). Even within ColdFusion, there are a number of ways to navigate and retrieve data from an XML document. As such, this is a question that requires a somewhat robust answer.

In the following demo, I am taking a manually constructed XML object and inserting it into the database using the three primary methods of XML interaction:

  1. XML searching using xmlSearch() and XPATH.
  2. XML traversal using native XML elements.
  3. XML traversal using ColdFusion's pseudo named-node collections.

The examples below might seem a bit odd because I am purposely sticking to a single method per example. In reality, you're probably going to get the best code by using a combination of the following approaches. That said, let's take a look at the code:

 Launch code in new window » Download code as text file »

  • <!---
  • Create an XML document. This could be something that
  • is uploaded, read from a file, or created manually
  • (as we are doing).
  • --->
  • <cfxml variable="data">
  •  
  • <records>
  •  
  • <record>
  • <name>Tricia</name>
  • <hair>Brunette</hair>
  • </record>
  •  
  • <record>
  • <name>Joanna</name>
  • <hair>Brunette</hair>
  • </record>
  •  
  • <record>
  • <name>Eva</name>
  • <hair>Blonde</hair>
  • </record>
  •  
  • </records>
  •  
  • </cfxml>
  •  
  •  
  • <!---
  • When gathering data from an XML document, there are many ways
  • to do it. You can loop over them via ColdFusion's pseudo node
  • collections or, access the children using XML notation, or
  • search for them (or use a combination of the above). Since the
  • avilability of these methods depends in part on what version
  • of ColdFusion you have, let's look at each methodology.
  • --->
  •  
  •  
  • <!--- ----------------------------------------------------- --->
  • <!--- ----------------------------------------------------- --->
  •  
  •  
  • <!---
  • Method 1: XmlSearch().
  • In this method, I am going to search for the "record" nodes.
  • This will return an array of Record nodes.
  • --->
  •  
  • <!--- Get record nodes. --->
  • <cfset recordNodes = xmlSearch( data, "/records/record" ) />
  •  
  • <!---
  • Loop over the nodes so that we can insert each one into
  • the database.
  • --->
  • <cfloop
  • index="recordNode"
  • array="#recordNodes#">
  •  
  • <!---
  • Because XML notation is somewhat wordy, I sometimes like
  • to create a short-hand to the target value before the
  • query so that I can keep the query very readible.
  •  
  • Because we are using XmlSearch() in this demo, I'm going
  • to use it here to gather the text values from the nodes.
  • This isn't the "right" approach, but I'll use it here to
  • show all concepts.
  • --->
  • <cfset valueNodes = xmlSearch( recordNode, "./*/text()" ) />
  •  
  • <!---
  • Because xmlSearch() returns the nodes in the same order
  • that they appear in the XML document (which we assume is
  • uniform), we can get the values based on their order.
  • --->
  • <cfset girlName = valueNodes[ 1 ].xmlValue />
  •  
  • <!---
  • You can also use xmlSeach() to search by the value node
  • by its name. I will get the hair value in this way.
  •  
  • Because xmlSearch() returns an array, we have to store an
  • intermediary collection value.
  • --->
  • <cfset hairNodes = xmlSearch( recordNode, "./hair/text()" ) />
  • <cfset hairColor = hairNodes[ 1 ].xmlValue />
  •  
  • <!--- Insert the values into the database. --->
  • <cfquery name="insert" datasource="ben">
  • INSERT INTO xml_girl
  • (
  • name,
  • hair
  • ) VALUES (
  • <cfqueryparam value="#girlName#" cfsqltype="cf_sql_varchar" />,
  • <cfqueryparam value="#hairColor#" cfsqltype="cf_sql_varchar" />
  • );
  • </cfquery>
  •  
  • </cfloop>
  •  
  •  
  • <!--- ----------------------------------------------------- --->
  • <!--- ----------------------------------------------------- --->
  •  
  •  
  • <!---
  • Method 2: Native XML DOM.
  • In this method, I am going to walk over the XML DOM (Document
  • Object Model) using native XML navigation (specifically things
  • like XmlChildren and XmlText).
  • --->
  •  
  • <!---
  • We know the structure of the XML document so we know that the
  • record nodes are the direct children of the root node. There-
  • fore, we can loop over the first set of children.
  • --->
  • <cfloop
  • index="recordNode"
  • array="#data.xmlRoot.xmlChildren#">
  •  
  • <!---
  • Just as before, we are going to be getting short hand
  • references to your XML values. However, this time, rather
  • than using xmlSearch(), we are simply going to refer to
  • the child nodes by the xmlChildren collection and the
  • appropriate index.
  • --->
  • <cfset girlName = recordNode.xmlChildren[ 1 ].xmlText />
  • <cfset hairColor = recordNode.xmlChildren[ 2 ].xmlText />
  •  
  • <!--- Insert the values into the database. --->
  • <cfquery name="insert" datasource="ben">
  • INSERT INTO xml_girl
  • (
  • name,
  • hair
  • ) VALUES (
  • <cfqueryparam value="#girlName#" cfsqltype="cf_sql_varchar" />,
  • <cfqueryparam value="#hairColor#" cfsqltype="cf_sql_varchar" />
  • );
  • </cfquery>
  •  
  • </cfloop>
  •  
  •  
  • <!--- ----------------------------------------------------- --->
  • <!--- ----------------------------------------------------- --->
  •  
  •  
  • <!---
  • Method 3: ColdFusion Pseudo XML Collections.
  • As a convenience, ColdFusion provides pseudo collections of
  • XML nodes such that they can be referenced by name. In this
  • method, we are going to walk over these collections.
  • --->
  •  
  • <!---
  • We know the structure of the XML document so we know that
  • the record nodes are the direct children of the records node.
  • Therefore, we can use the named path to find the target
  • record nodes.
  • --->
  • <cfloop
  • index="recordIndex"
  • from="1"
  • to="#arrayLen( data.records.record )#"
  • step="1">
  •  
  • <!---
  • Since we can't loop over these ColdFusion psuedo XML
  • node collection using an array-loop, we have to use an
  • index loop. As such, I generally like to get a short-
  • hand reference to the node at the current index.
  • --->
  • <cfset recordNode = data.records.record[ recordIndex ] />
  •  
  • <!---
  • Now that we have the current record node, I am going to
  • grab the value of the given value nodes.
  •  
  • NOTE: Even when working with the pseudo collections, we
  • STILL HAVE TO use the xmlText attribute (otherwise
  • ColdFusion will convert the node to a string, giving it
  • a doctype).
  •  
  • NOTE: We can exclude the index of the value node (ex.
  • recordNode.name[ 1 ]) because we only want the first one
  • which is what not include the index defaults to. You can
  • use the [ 1 ], if you want, as I have done with hair.
  • --->
  • <cfset girlName = recordNode.name.xmlText />
  • <cfset hairColor = recordNode.hair[ 1 ].xmlText />
  •  
  • <!--- Insert the values into the database. --->
  • <cfquery name="insert" datasource="ben">
  • INSERT INTO xml_girl
  • (
  • name,
  • hair
  • ) VALUES (
  • <cfqueryparam value="#girlName#" cfsqltype="cf_sql_varchar" />,
  • <cfqueryparam value="#hairColor#" cfsqltype="cf_sql_varchar" />
  • );
  • </cfquery>
  •  
  • </cfloop>

I'm not really going to explain what's going on above because the code has been extensively commented. But please, feel free to ask me any questions in the comments below.

When we run the example above, we get the following database table configuration:

 
 
 
 
 
 
Moving XML Data From An XML File Into A Database Using ColdFusion. 
 
 
 

As you can see, each of the three different XML traversal and data retrieval methods successfully pulled the data out of the XML document and inserted it into the database. All of the examples assume that the column data is coming out of XML text values. I have found this to be the majority of use cases; however, sometimes part or all of the record data is delivered as attributes on an XML node. In those cases, you simply have to route your requests through the XmlAttributes collection:

 Launch code in new window » Download code as text file »

  • <!--- Get the name attribute value. --->
  • <cfset girlName = recordNode.xmlAttributes.name />

This is not an exhaustive explanation of how you can move data from XML files into a database using ColdFusion - this is just meant to demonstrate the breadth of possibilities. And, like I said above, rarely will you ever stick exclusively to one method. The best, most readable code is generally created from the combination of these methods.

Download Code Snippet ZIP File

Post Comment  |  Ask Ben  |  Other Searches  |  Print Page





Reader Comments

Jul 9, 2009 at 10:24 PM // reply »
2 Comments

I wish there were more girl names involved when I'm importing data into my databases.

It's neat to see these differing methods segregated this way. You're quite right that "In reality, you're probably going to get the best code by using a combination of the following approaches."


Jul 10, 2009 at 9:34 AM // reply »
2 Comments

Is it possible to first create a database table from the unique xml nodes then populate the database table with the xml data?


Jul 10, 2009 at 10:02 AM // reply »
28 Comments

@Josh

Yes, anything is possible. That even falls within "reasonable". Though I don't know how practical.

You can pull the names of the final XML leaves, and write a create table from them. Though I'd want to have very well structures XML, and a well defined structure before attempting that.

In fact I think the CF toll Datamgr does that:
http://bryantwebconsulting.com/docs/datamgr/what-is-datamgr.cfm
(It has it's own flavor of XML to define a db, but it can use it to create a table, though I don't know if it imports data)


Jul 10, 2009 at 10:02 AM // reply »
7,572 Comments

@Josh,

Absolutely. All you would have to do is add a CREATE TABLE query before you start inserting records. I generally use SQL manager (Navicat for MySQL) to do this, so the Syntax is not immediately available in my head.


Jul 12, 2009 at 5:09 PM // reply »
2 Comments

Most of the xml navigation technique examples I have seen assume 'uniform xml structure'.

I would like to see examples where xml is assymetrical (eg, a UPS xml docs, etc.). In other words, derive accurate table data from an xml doc where the same parent node comprises different sets of child nodes in different parts of the xml doc.

Bill


Jul 21, 2009 at 7:59 AM // reply »
7,572 Comments

@Bill,

While there are number of ways to do this and so much of it depends on how the XML file differs and how that difference affects your processing, I have tried to put together some examples of dealing with inconsistent XML in ColdFusion:

http://www.bennadel.com/blog/1659-Working-With-Inconsistent-XML-Structures-In-ColdFusion.htm


tom
Sep 28, 2009 at 10:00 PM // reply »
2 Comments

Its very goood code, but i want get the data from DataBase and put into XMl using ColdFusion.


tom
Sep 28, 2009 at 10:00 PM // reply »
2 Comments

Its very goood code, but i want get the data from DataBase and put into XMl using ColdFusion.


Sep 29, 2009 at 8:09 AM // reply »
7,572 Comments

@Tom,

I am not sure I understand the question. Have you tried using the CFXML tag yet?


Sep 30, 2009 at 9:34 AM // reply »
1 Comments

Great post, i have followed these steps and its working, but the data is being imported with the xml declarations around the data. Is there an easy way to remove it? i.e.

<?xml version="1.0" encoding="UTF-8"?>
<StoreName>Data being imported</StoreName>.

I only need the "Data being imported" section. I have seen that you can use REReplace with Regex to remove the information but im importing quite alot of data and will take me a long time to write out all the REReplace statements one by one. Thanks.


Oct 1, 2009 at 8:06 AM // reply »
7,572 Comments

@Chris,

Honestly, I don't know how to do it without generating a doc type at the top. I have only ever been able to remove that with RegEx. But, it's not as bad as you might think:

reReplace( data, "^<\?[^>]+>", "", "one" )

This will work with all the XML data values, so it's not like you need to custom create them for each batch of XML.


Feb 6, 2010 at 3:19 AM // reply »
1 Comments

Using you examples I can get the teamnames into a database. But pulling my hair out trying to get the parent cat id and position so IE; I need

GrandChampion 1 Qua
ReserveChampion 1 Swamp Water Boys
Overall 1 Qua
Overall 2 Swamp Water Boys
Overall 3 Jack Old South.

<root>
<cat id="GrandChampion">
<teamname>Quau</teamname>
</cat>
<cat id="ReserveChampion">
<teamname>Swamp Water Boys</teamname>
</cat>
<cat id="Overall">
<teamname>Quau</teamname>
<teamname>Swamp Water Boys</teamname>
<teamname>Jacks Old South</teamname>
<teamname>Swamp Boys BBQ</teamname>
<teamname>Parrothead Smokers</teamname>
<teamname>Munchees Smokehouse</teamname>
<teamname>Wicked Que</teamname>
<teamname>B S Pitmeisters</teamname>
</cat>
</root>

Any help is greatly appreciated. I can't spare the hair.


Feb 6, 2010 at 5:42 PM // reply »
7,572 Comments

@Michael,

To get attribute in general, you need to use the XmlAttributes collection. So, for example, to get the first CAT ID, you could use:

xmlVar.root.cat[ 1 ].xmlAttributes.id

That said, what you probably want to do is loop over the CAT nodes; then, as you loop over them, you'll want an inner-loop that loops over the teamates. If you are not sure what I mean, I can help you out.


Post Comment  |  Ask Ben

Recent Blog Comments
Mar 21, 2010 at 6:32 AM
ColdFusion CFPOP - My First Look
Apologies... The field name in the db for C. is "BounceCode" It stores the code / message which is returned in the email. Sorry for the confusion. ... read »
Mar 21, 2010 at 6:29 AM
ColdFusion CFPOP - My First Look
@Jose Galdamez, Hi Ben and Jose 1st of all.. big thanks to Jose for his Skype chat a few weeks back. Your time was much appreciated. I have come up with a rather unelegant solution to my problem a ... read »
Mar 21, 2010 at 3:42 AM
A New Wrist Pain
Chiropractic treatment is one of the best methods for treating numerous health problems naturally. After years of experience being a chiropractor, I have found that it is a powerful way to solve many ... read »
Mar 20, 2010 at 12:07 PM
Drawing On The iPhone Canvas With jQuery And ColdFusion
Simply awesome. Saved my day. ... read »
Mar 20, 2010 at 9:00 AM
Building A Fixed-Position Bottom Menu Bar (ala FaceBook)
I would like to say thx for an easy way to create a bottom bar. I do have a ?. Is it possible to center the bar if i want to resize it to ex 85%. Regards Offenbach ... read »
Mar 19, 2010 at 7:26 PM
MySQL 3/4 - com.mysql.jdbc.Driver And allowMultiQueries=true
Thank you very much for this post. Adding allowMultiQueries="true" in context.xml didn't help until I added it to url as allowMultiQueries=true Good idea is to use prepared statements and it will he ... read »
Jim
Mar 19, 2010 at 4:49 PM
Nobody Puts Baby In The Corner!
Wow. This is like suddenly finding a support group for your secret shame. I'm not alone! I always liked this movie, even though it is extremely cheesy. I just wish Jennifer Grey hadn't gotten the ... read »
Mar 19, 2010 at 4:47 PM
Application.cfc OnRequest() Method Affects OnError() Arguments
@Jason and @Ben, I've been doing some CF9 refactoring on our systems and noticed an odd occurrence with onError as well. Found a way to work around my problem, but what I saw was... Background: Our ... read »