Text Nodes Do Not Always Exist In A ColdFusion XML Document

Posted May 22, 2008 at 8:13 AM by Ben Nadel

Tags: ColdFusion

Yesterday, I was working on an audit tracker that stored updates in an XML document when I came across an XML behavior that I didn't know about. Apparently, not every element node in an XML document has a nested text node. When you think about it, this makes some sense; however, when you look at a ColdFusion XML document, it can certainly be confusing. To examine this, let's create a simple ColdFusion XML document:

  • <!--- Create a ColdFusion XML document. --->
  • <cfxml variable="xmlGirl">
  •  
  • <girl>
  • <name>Hayden Panettiere</name>
  • <age>18</age>
  • <height></height>
  • <weight></weight>
  • <description>
  • Hayden played Claire, the Cheerleader, on the hit
  • Fox television show, Heroes.
  • </description>
  • </girl>
  •  
  • </cfxml>
  •  
  • <!--- Dump out XML document. --->
  • <cfdump
  • var="#xmlGirl#"
  • label="Girl: Hayden Panettiere"
  • />

Here, our Hayden Panettiere girl XML object has a number of nested fields. Of these fields, some have nested text values and some don't. However, even though that is the case, here is what the CFDump of the XML looks like:


 
 
 

 
[Image: CFDump of a ColdFusion XML document that has some text nodes and some non-text nodes]
 
 
 

Notice that even though some element nodes have text and others don't, from a ColdFusion XML Document Object Model (DOM) standpoint, they all have XmlText values; some of them just happen to be empty strings. This kind of structure might lead you to believe that every element node has a text node inside of it and, in fact, this is what I used to think. As it turns out, though, this is not true: the XmlText attribute in the ColdFusion XML DOM has nothing to do with whether or not a text node actually exists.
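A quick way to spot-check this on a single element is to compare its XmlText value with the number of text nodes that XPath can actually find beneath it. The following is a minimal sketch, assuming the xmlGirl document created above and ColdFusion 8 or later, where XmlSearch() can return a scalar XPath result such as a count(); the variable names are only illustrative:

```cfml
<!--- Grab the empty height element from the xmlGirl document. --->
<cfset xmlHeight = xmlGirl.girl.height />

<!---
    Ask XPath how many text nodes sit directly under height.
    Assumes ColdFusion 8+, where XmlSearch() can return a number.
--->
<cfset textNodeCount = XmlSearch(
    xmlGirl,
    "count( /girl/height/text() )"
    ) />

<cfoutput>
    XmlText length: #Len( xmlHeight.XmlText )#<br />
    Text node count: #textNodeCount#<br />
</cfoutput>
```

If the element really has no text node, both values should come out as zero: an empty XmlText string alongside a text node count of zero.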

To prove this, let's use XmlSearch() and XPath to select all nodes in the ColdFusion XML document that have a nested text node. We can do this by using the predicate [ text() ]. This predicate merely checks for existence and is not concerned with actual value:

  • <!---
  • Select all nodes from anywhere in the ColdFusion XML
  • document that have a nested text node.
  • --->
  • <cfset arrNodes = XmlSearch(
  • xmlGirl,
  • "//*[ text() ]"
  • ) />
  •  
  • <!--- Output names of nodes. --->
  • <cfoutput>
  • <cfloop
  • index="xmlNode"
  • array="#arrNodes#">
  •  
  • #xmlNode.XmlName#<br />
  •  
  • </cfloop>
  • </cfoutput>

After selecting all nodes that have a nested text node and outputting those node names, we get the following list:

  • girl
  • name
  • age
  • description

Notice that the height and weight nodes are not selected. This is because they have no text node. If you look back at my original XML, you will see that the height and weight elements open and close with no text data in between. Because there is no text data, there is no text node; so, while the ColdFusion XML DOM has XmlText values for these nodes, realize that they are not actually parents of anything.
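The inverse query makes the same point from the other direction. As a rough sketch, again assuming the xmlGirl document from earlier (arrEmptyNodes is just an illustrative variable name), wrapping the predicate in not() selects the elements that have no text node at all:

```cfml
<!---
    Select all elements from anywhere in the ColdFusion XML
    document that do NOT have a nested text node.
--->
<cfset arrEmptyNodes = XmlSearch(
    xmlGirl,
    "//*[ not( text() ) ]"
    ) />

<!--- Output each node name with its (empty) XmlText value. --->
<cfoutput>
    <cfloop index="xmlNode" array="#arrEmptyNodes#">
        #xmlNode.XmlName#: XmlText is [#xmlNode.XmlText#]<br />
    </cfloop>
</cfoutput>
```

For this document, that should list only height and weight, each with an empty XmlText value between the brackets. Keep in mind that an element containing nothing but child elements, with no white space between them, would also match this predicate.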

While this might not seem like such a problem, it can make things a little tricky when you actually need to query an XML document based on text values. A non-existent text node is very much like a NULL value in SQL; it's an "unknown" value. And, because it's an unknown value, you can't compare data to it. This goes for both testing equality as well as inequality: in XPath, a comparison between a node set and a string is true only if at least one node in the set satisfies it, so a comparison against an empty node set is always false. To demonstrate this, let's get all nodes whose text value is either equal to or not equal to "Blam":

  • <!---
  • Select all nodes from anywhere in the ColdFusion XML
  • document if their text value equals "Blam" or does
  • NOT equal "Blam".
  • --->
  • <cfset arrNodes = XmlSearch(
  • xmlGirl,
  • "//*[ (text() = 'Blam') or (text() != 'Blam') ]"
  • ) />
  •  
  • <!--- Output names of nodes. --->
  • <cfoutput>
  • <cfloop
  • index="xmlNode"
  • array="#arrNodes#">
  •  
  • #xmlNode.XmlName#<br />
  •  
  • </cfloop>
  • </cfoutput>

Instinctively, you might think that this will return all nodes of the XML, right? I mean, anyone who's taken math knows that something OR NOT something should return the "universe". However, just as with SQL and NULL values, because some of our text nodes don't exist, they cannot produce a true comparison, whether that be a test of equality or inequality. And, in fact, when we output the selected node names:

  • girl
  • name
  • age
  • description

... we see that, again, neither the height nor the weight element node was selected. Imagine trying to select all nodes whose text value was NOT something. If you didn't realize how this worked, you might spend a heck of a lot of time banging your head against a wall trying to figure out why only some of the expected nodes were being selected.
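One way around this, sketched here under the assumption that "no text node" should count as "not equal", is to negate the equality test rather than use the != operator. In XPath 1.0, text() != 'Blam' asks whether some text node differs from "Blam", whereas not( text() = 'Blam' ) asks whether no text node equals "Blam", and the latter is true even when there are no text nodes at all:

```cfml
<!---
    Select all elements whose text nodes do NOT equal "Blam",
    including elements that have no text node at all.
--->
<cfset arrNodes = XmlSearch(
    xmlGirl,
    "//*[ not( text() = 'Blam' ) ]"
    ) />

<!--- Output names of nodes. --->
<cfoutput>
    <cfloop index="xmlNode" array="#arrNodes#">
        #xmlNode.XmlName#<br />
    </cfloop>
</cfoutput>
```

Against the xmlGirl document, this should select all six elements, height and weight included.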

If you work with a lot of XML, you probably already know this; but, if you have only worked with the ColdFusion XML Document Object Model (DOM), it may not be immediately obvious that the existence of an XmlText attribute does not mean that there is a corresponding text node in the DOM. I know that I didn't realize this, and it probably took me a good 30 minutes to figure out what the heck was going on.




Reader Comments

May 22, 2008 at 9:49 AM
79 Comments

Ben - Thanks for these posts on working with XML. I have just started doing more work with XML files and I know that I will be glad to have this type of information further down the road when I am troubleshooting issues with datasets that have tens of thousands of records or more.


May 22, 2008 at 9:53 AM
11,238 Comments

@Jason,

No problem my man. I can tell you right now, though, that XML documents that have enormous amounts of data are no fun to deal with :) Parsing is not that fast and people tell me that ColdFusion can crash if the XML parsing eats up all the memory.

That is what I hear from other people - I have never actually had to deal with such large files.


May 22, 2008 at 10:07 AM
9 Comments

Thanks Ben.
I'm refactoring an application and the Table of Contents is coming from an XML file. I've seen the code and I'm dreading starting work on it.

It's good to know this beforehand so I don't pull my hair out when working on the project.



May 22, 2008 at 10:10 AM
11,238 Comments

@Fernando,

No problem my man. I do love me some XML and XPath, even some XSLT. If you run into any problems, drop me a line.


May 22, 2008 at 10:58 AM
66 Comments

@Ben,
If you're trying to check for "empty" "Text Nodes", you could alter your XPath statement to something like this:

<cfset arrNodes = XmlSearch(xmlGirl, "//*[ boolean(text()) = false ]") />


May 22, 2008 at 11:01 AM
66 Comments

@Ben,
I forgot to mention my point ... my point is the XmlText Node *does* exist, but as you mentioned before, it's just empty.


May 22, 2008 at 11:12 AM
11,238 Comments

@Steve,

Thanks man. I have to do a more thorough exploration of the XPath functions that are actually supported in ColdFusion 8. I have tested a few of them and it seems to be really hit or miss.


May 22, 2008 at 12:34 PM
66 Comments

@Ben,
No problem. I blogged about this at:
http://www.stephenwithington.com/blog/index.cfm/2008/5/22/Text-Nodes-DO-Always-Exist-in-a-ColdFusion-XML-Document

And actually, I updated the code to check for empty text() to:

<cfset arrNodes = XmlSearch(xmlGirl, '//*[ (boolean(text()) = 0) ]') />

It seems this is the proper syntax for false.


May 24, 2008 at 4:23 AM
132 Comments

You seem to be confused. An empty string evaluates to false, so it doesn't match those nodes.

The XPath spec explains what is true and what is false in more detail:

http://www.w3.org/TR/xpath#function-boolean


May 24, 2008 at 4:36 AM
132 Comments

Ah ha. It is me that seems to be mistaken! CF is providing an xmlText, but there is no empty string as far as XPath is concerned. I should read the spec better next time before I comment.

text() actually returns a node set with all the text nodes of the element, and if there are no text nodes it returns an empty node set, which when compared to other things returns false.

So it looks like "//*[text() or not(node())]" would do what you want.

