How To Move XML Data Into A Database Using ColdFusion

Posted July 9, 2009 at 9:14 PM

Tags: ColdFusion

This blog post was inspired by an "Ask Ben" question that never made it to the blog. A reader had an XML document and understood how to read in XML files and how to run queries using CFQuery, but wasn't sure quite how to connect the two actions together. As with many problems, there are large number of solutions available. Some of them don't even involve ColdFusion (such as importing the XML document directly into the database). Even within ColdFusion, there are a number of ways to navigate and retrieve data from an XML document. As such, this is a question that requires a somewhat robust answer.

In the following demo, I am taking a manually constructed XML object and inserting it into the database using the three primary methods of XML interaction:

  1. XML searching using xmlSearch() and XPATH.
  2. XML traversal using native XML elements.
  3. XML traversal using ColdFusion's pseudo named-node collections.

The examples below might seem a bit odd because I am purposely sticking to a single method per example. In reality, you're probably going to get the best code by using a combination of the following approaches. That said, let's take a look at the code:

 Launch code in new window » Download code as text file »

  • <!---
  • Create an XML document. This could be something that
  • is uploaded, read from a file, or created manually
  • (as we are doing).
  • --->
  • <cfxml variable="data">
  •  
  • <records>
  •  
  • <record>
  • <name>Tricia</name>
  • <hair>Brunette</hair>
  • </record>
  •  
  • <record>
  • <name>Joanna</name>
  • <hair>Brunette</hair>
  • </record>
  •  
  • <record>
  • <name>Eva</name>
  • <hair>Blonde</hair>
  • </record>
  •  
  • </records>
  •  
  • </cfxml>
  •  
  •  
  • <!---
  • When gathering data from an XML document, there are many ways
  • to do it. You can loop over them via ColdFusion's pseudo node
  • collections or, access the children using XML notation, or
  • search for them (or use a combination of the above). Since the
  • avilability of these methods depends in part on what version
  • of ColdFusion you have, let's look at each methodology.
  • --->
  •  
  •  
  • <!--- ----------------------------------------------------- --->
  • <!--- ----------------------------------------------------- --->
  •  
  •  
  • <!---
  • Method 1: XmlSearch().
  • In this method, I am going to search for the "record" nodes.
  • This will return an array of Record nodes.
  • --->
  •  
  • <!--- Get record nodes. --->
  • <cfset recordNodes = xmlSearch( data, "/records/record" ) />
  •  
  • <!---
  • Loop over the nodes so that we can insert each one into
  • the database.
  • --->
  • <cfloop
  • index="recordNode"
  • array="#recordNodes#">
  •  
  • <!---
  • Because XML notation is somewhat wordy, I sometimes like
  • to create a short-hand to the target value before the
  • query so that I can keep the query very readible.
  •  
  • Because we are using XmlSearch() in this demo, I'm going
  • to use it here to gather the text values from the nodes.
  • This isn't the "right" approach, but I'll use it here to
  • show all concepts.
  • --->
  • <cfset valueNodes = xmlSearch( recordNode, "./*/text()" ) />
  •  
  • <!---
  • Because xmlSearch() returns the nodes in the same order
  • that they appear in the XML document (which we assume is
  • uniform), we can get the values based on their order.
  • --->
  • <cfset girlName = valueNodes[ 1 ].xmlValue />
  •  
  • <!---
  • You can also use xmlSeach() to search by the value node
  • by its name. I will get the hair value in this way.
  •  
  • Because xmlSearch() returns an array, we have to store an
  • intermediary collection value.
  • --->
  • <cfset hairNodes = xmlSearch( recordNode, "./hair/text()" ) />
  • <cfset hairColor = hairNodes[ 1 ].xmlValue />
  •  
  • <!--- Insert the values into the database. --->
  • <cfquery name="insert" datasource="ben">
  • INSERT INTO xml_girl
  • (
  • name,
  • hair
  • ) VALUES (
  • <cfqueryparam value="#girlName#" cfsqltype="cf_sql_varchar" />,
  • <cfqueryparam value="#hairColor#" cfsqltype="cf_sql_varchar" />
  • );
  • </cfquery>
  •  
  • </cfloop>
  •  
  •  
  • <!--- ----------------------------------------------------- --->
  • <!--- ----------------------------------------------------- --->
  •  
  •  
  • <!---
  • Method 2: Native XML DOM.
  • In this method, I am going to walk over the XML DOM (Document
  • Object Model) using native XML navigation (specifically things
  • like XmlChildren and XmlText).
  • --->
  •  
  • <!---
  • We know the structure of the XML document so we know that the
  • record nodes are the direct children of the root node. There-
  • fore, we can loop over the first set of children.
  • --->
  • <cfloop
  • index="recordNode"
  • array="#data.xmlRoot.xmlChildren#">
  •  
  • <!---
  • Just as before, we are going to be getting short hand
  • references to your XML values. However, this time, rather
  • than using xmlSearch(), we are simply going to refer to
  • the child nodes by the xmlChildren collection and the
  • appropriate index.
  • --->
  • <cfset girlName = recordNode.xmlChildren[ 1 ].xmlText />
  • <cfset hairColor = recordNode.xmlChildren[ 2 ].xmlText />
  •  
  • <!--- Insert the values into the database. --->
  • <cfquery name="insert" datasource="ben">
  • INSERT INTO xml_girl
  • (
  • name,
  • hair
  • ) VALUES (
  • <cfqueryparam value="#girlName#" cfsqltype="cf_sql_varchar" />,
  • <cfqueryparam value="#hairColor#" cfsqltype="cf_sql_varchar" />
  • );
  • </cfquery>
  •  
  • </cfloop>
  •  
  •  
  • <!--- ----------------------------------------------------- --->
  • <!--- ----------------------------------------------------- --->
  •  
  •  
  • <!---
  • Method 3: ColdFusion Pseudo XML Collections.
  • As a convenience, ColdFusion provides pseudo collections of
  • XML nodes such that they can be referenced by name. In this
  • method, we are going to walk over these collections.
  • --->
  •  
  • <!---
  • We know the structure of the XML document so we know that
  • the record nodes are the direct children of the records node.
  • Therefore, we can use the named path to find the target
  • record nodes.
  • --->
  • <cfloop
  • index="recordIndex"
  • from="1"
  • to="#arrayLen( data.records.record )#"
  • step="1">
  •  
  • <!---
  • Since we can't loop over these ColdFusion psuedo XML
  • node collection using an array-loop, we have to use an
  • index loop. As such, I generally like to get a short-
  • hand reference to the node at the current index.
  • --->
  • <cfset recordNode = data.records.record[ recordIndex ] />
  •  
  • <!---
  • Now that we have the current record node, I am going to
  • grab the value of the given value nodes.
  •  
  • NOTE: Even when working with the pseudo collections, we
  • STILL HAVE TO use the xmlText attribute (otherwise
  • ColdFusion will convert the node to a string, giving it
  • a doctype).
  •  
  • NOTE: We can exclude the index of the value node (ex.
  • recordNode.name[ 1 ]) because we only want the first one
  • which is what not include the index defaults to. You can
  • use the [ 1 ], if you want, as I have done with hair.
  • --->
  • <cfset girlName = recordNode.name.xmlText />
  • <cfset hairColor = recordNode.hair[ 1 ].xmlText />
  •  
  • <!--- Insert the values into the database. --->
  • <cfquery name="insert" datasource="ben">
  • INSERT INTO xml_girl
  • (
  • name,
  • hair
  • ) VALUES (
  • <cfqueryparam value="#girlName#" cfsqltype="cf_sql_varchar" />,
  • <cfqueryparam value="#hairColor#" cfsqltype="cf_sql_varchar" />
  • );
  • </cfquery>
  •  
  • </cfloop>

I'm not really going to explain what's going on above because the code has been extensively commented. But please, feel free to ask me any questions in the comments below.

When we run the example above, we get the following database table configuration:

 
 
 
 
 
 
Moving XML Data From An XML File Into A Database Using ColdFusion. 
 
 
 

As you can see, each of the three different XML traversal and data retrieval methods successfully pulled the data out of the XML document and inserted it into the database. All of the examples assume that the column data is coming out of XML text values. I have found this to be the majority of use cases; however, sometimes part or all of the record data is delivered as attributes on an XML node. In those cases, you simply have to route your requests through the XmlAttributes collection:

 Launch code in new window » Download code as text file »

  • <!--- Get the name attribute value. --->
  • <cfset girlName = recordNode.xmlAttributes.name />

This is not an exhaustive explanation of how you can move data from XML files into a database using ColdFusion - this is just meant to demonstrate the breadth of possibilities. And, like I said above, rarely will you ever stick exclusively to one method. The best, most readable code is generally created from the combination of these methods.

Download Code Snippet ZIP File

Post Comment  |  Ask Ben  |  Permalink  |  Other Searches  |  Print Page




Learning ColdFusion 9 - ColdFusion 9 tutorials, samples, examples, demos

Reader Comments

Jul 9, 2009 at 10:24 PM // reply »
2 Comments

I wish there were more girl names involved when I'm importing data into my databases.

It's neat to see these differing methods segregated this way. You're quite right that "In reality, you're probably going to get the best code by using a combination of the following approaches."


Jul 10, 2009 at 9:34 AM // reply »
2 Comments

Is it possible to first create a database table from the unique xml nodes then populate the database table with the xml data?


Tim Leach
Jul 10, 2009 at 10:02 AM // reply »
27 Comments

@Josh

Yes, anything is possible. That even falls within "reasonable". Though I don't know how practical.

You can pull the names of the final XML leaves, and write a create table from them. Though I'd want to have very well structures XML, and a well defined structure before attempting that.

In fact I think the CF toll Datamgr does that:
http://bryantwebconsulting.com/docs/datamgr/what-is-datamgr.cfm
(It has it's own flavor of XML to define a db, but it can use it to create a table, though I don't know if it imports data)


Jul 10, 2009 at 10:02 AM // reply »
6,371 Comments

@Josh,

Absolutely. All you would have to do is add a CREATE TABLE query before you start inserting records. I generally use SQL manager (Navicat for MySQL) to do this, so the Syntax is not immediately available in my head.


bill
Jul 12, 2009 at 5:09 PM // reply »
2 Comments

Most of the xml navigation technique examples I have seen assume 'uniform xml structure'.

I would like to see examples where xml is assymetrical (eg, a UPS xml docs, etc.). In other words, derive accurate table data from an xml doc where the same parent node comprises different sets of child nodes in different parts of the xml doc.

Bill


Jul 21, 2009 at 7:59 AM // reply »
6,371 Comments

@Bill,

While there are number of ways to do this and so much of it depends on how the XML file differs and how that difference affects your processing, I have tried to put together some examples of dealing with inconsistent XML in ColdFusion:

http://www.bennadel.com/blog/1659-Working-With-Inconsistent-XML-Structures-In-ColdFusion.htm


tom
Sep 28, 2009 at 10:00 PM // reply »
2 Comments

Its very goood code, but i want get the data from DataBase and put into XMl using ColdFusion.


tom
Sep 28, 2009 at 10:00 PM // reply »
2 Comments

Its very goood code, but i want get the data from DataBase and put into XMl using ColdFusion.


Sep 29, 2009 at 8:09 AM // reply »
6,371 Comments

@Tom,

I am not sure I understand the question. Have you tried using the CFXML tag yet?


Chris
Sep 30, 2009 at 9:34 AM // reply »
1 Comments

Great post, i have followed these steps and its working, but the data is being imported with the xml declarations around the data. Is there an easy way to remove it? i.e.

<?xml version="1.0" encoding="UTF-8"?>
<StoreName>Data being imported</StoreName>.

I only need the "Data being imported" section. I have seen that you can use REReplace with Regex to remove the information but im importing quite alot of data and will take me a long time to write out all the REReplace statements one by one. Thanks.


Oct 1, 2009 at 8:06 AM // reply »
6,371 Comments

@Chris,

Honestly, I don't know how to do it without generating a doc type at the top. I have only ever been able to remove that with RegEx. But, it's not as bad as you might think:

reReplace( data, "^<\?[^>]+>", "", "one" )

This will work with all the XML data values, so it's not like you need to custom create them for each batch of XML.


Post Comment  |  Ask Ben

Recent Blog Comments
Nov 7, 2009 at 5:53 PM
Ask Ben: Javascript String Replace Method
You can find here an advanced function that prepared with javascript replace function. This can make the first letters of words, sentences, lines and whatever you define automatically: http://www.m ... read »
Andrew Neely
Nov 7, 2009 at 4:56 PM
A Moment That Touched Me - The Fountainhead
Ben, Glad you enjoyed the podcast. Yeah, the Tank Riot guys can get really chatty during the episodes, but that's part of the charm of it for me. They've covered everything from Nichola Tesla to Cha ... read »
Nov 7, 2009 at 4:43 PM
Building A Fixed-Position Bottom Menu Bar (ala FaceBook)
Is it possible to make some more MenĂ¼`s ? ... read »
Jill
Nov 7, 2009 at 11:40 AM
How To Unformat Your Code (Like A Pro)
Derek, I think you might be right - sweet! Thanks for the link :) ... read »
Nov 7, 2009 at 11:25 AM
How To Unformat Your Code (Like A Pro)
I think it would be way easier to just use this http://www.logichammer.com/html-formatter/ He just released v3 and it rocks. ... read »
Jill
Nov 7, 2009 at 7:58 AM
How To Unformat Your Code (Like A Pro)
LMAO - this was pretty funny! I have to admit - I also love to reformat code so I can read it. My boss used to tell me to leave my OCD at home. Now I don't feel so bad after reading everyone else' ... read »
Nov 6, 2009 at 10:10 PM
How To Unformat Your Code (Like A Pro)
The timing of this post is just uncanny. I spent the last 15-20 minutes manually un-formatting my "Ben Nadel" style code within a CFC of mine. I was really digging the readability a few weeks ago, bu ... read »
Roe
Nov 6, 2009 at 5:11 PM
Passing Arrays By Reference In ColdFusion - SWEEET!
ArraySort also reorders the results of these java obj's ... read »