Ask Ben: Working With Inconsistent XML

Posted May 23, 2008 at 10:49 AM by Ben Nadel

Tags: ColdFusion, Ask Ben

About your post: Testing For The Absence Of A Text Node Using XmlSearch() And XPath... I could not post any code so I am writing you here. Useful info. Here's another one for ya. Say you're looping through some XML you got via cfhttp that has, at times, *inconsistent data*. See below.

<girl>
    <name>Hayden Panettiere</name>
    <age>18</age>
    <height></height>
    <weight></weight>
    <description>
        Hayden played Claire, the Cheerleader,
        on the hit Fox television show, Heroes.
    </description>
</girl>
<girl>
    <name>Marisa Miller</name>
    <age>26</age>
    <description>
        Marisa played this girl of my dreams
        the last time I slept.
    </description>
</girl>

When looping through the girl(s), what do you do when the height or weight nodes simply do not exist like under Marisa Miller? I've used the xmlChildPos() function. I'd like to know what you might use.

XmlChildPos() is a good way to go; I actually only learned about that method when I was researching XML delete functionality. The documentation was really confusing on what it even did. That said, I think one of the tricks to working with XML in ColdFusion is to realize that the ColdFusion XML document is really flexible and diverse as to how it can be accessed and addressed. It really is pretty amazing when you think about it.

For starters, child nodes of a given node can be accessed as a single array using Node.XmlChildren. Or, they can be accessed as "pseudo" arrays using named addresses such as Node.ChildNode. Furthermore, you can think of these pseudo arrays as belonging to a set of sets that is like a pseudo structure. Because of this, we can actually access and delete values using ColdFusion array and struct methods. Also, if you want to access the text of a node, you can use Node.XmlText, or you can simply output the node, #Node#. Very easy!
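To make those access styles a bit more concrete, here is a minimal sketch. The tiny XML document and the variable names (xmlDoc, xmlGirl) are made up for illustration; the point is only to show the array-style, name-style, and struct-style views of the same node side by side.

```cfml
<!--- Hypothetical sample document, for illustration only. --->
<cfxml variable="xmlDoc">
    <girl>
        <name>Hayden Panettiere</name>
        <age>18</age>
    </girl>
</cfxml>

<cfset xmlGirl = xmlDoc.XmlRoot />

<cfoutput>

    <!--- All child nodes as a single array. --->
    Child count: #ArrayLen( xmlGirl.XmlChildren )#<br />

    <!--- Children with a given name as a "pseudo" array. --->
    Name nodes: #ArrayLen( xmlGirl.name )#<br />

    <!--- Child names treated as keys of a "pseudo" struct. --->
    Has age: #StructKeyExists( xmlGirl, "age" )#<br />
    Has height: #StructKeyExists( xmlGirl, "height" )#<br />

    <!--- Node text, spelled out explicitly. --->
    Name: #xmlGirl.name[ 1 ].XmlText#<br />

</cfoutput>
```

The struct-style check is the one that matters for inconsistent data: it asks whether a child with that name exists without touching the node itself, so it should not error when the node is missing.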

I'm probably not explaining it all well, and I am sure that I am unsure on some of the short-cuts. Actually, that would make a cool blog post in and of itself. That being said, let's take a quick look at how we might handle the inconsistent data above:

    <!---
        Create XML girls. If this were actually coming from a CFHTTP
        request as in the question above, you could just have used
        something like XmlParse( Trim( CFHTTP.FileContent ) ).
    --->
    <cfxml variable="xmlGirls">

        <girls>
            <girl>
                <name>Hayden Panettiere</name>
                <age>18</age>
                <height>5'2"</height>
                <weight>125 lbs.</weight>
                <description>
                    Hayden played Claire, the Cheerleader, on the
                    hit Fox television show, Heroes.
                </description>
            </girl>
            <girl>
                <name>Marisa Miller</name>
                <age>26</age>
                <description>
                    Marisa played this girl of my dreams the last
                    time I slept.
                </description>
            </girl>
        </girls>

    </cfxml>


    <!--- Loop over girls (wrapped in CFOutput so the values render). --->
    <cfoutput>
    <cfloop
        index="xmlGirl"
        array="#xmlGirls.XmlRoot.XmlChildren#">

        Name: #xmlGirl.name#<br />
        Age: #xmlGirl.age#<br />

        <!--- Check for height in "pseudo child struct". --->
        <cfif StructKeyExists( xmlGirl, "height" )>
            Height: #xmlGirl.height#<br />
        </cfif>

        <!--- Check for weight in "pseudo child struct". --->
        <cfif StructKeyExists( xmlGirl, "weight" )>
            Weight: #xmlGirl.weight#<br />
        </cfif>

        Description: #xmlGirl.description#<br />
        <br />

    </cfloop>
    </cfoutput>

When we run this, we get the following output:

Name: Hayden Panettiere
Age: 18
Height: 5'2"
Weight: 125 lbs.
Description: Hayden played Claire, the Cheerleader, on the hit Fox television show, Heroes.

Name: Marisa Miller
Age: 26
Description: Marisa played this girl of my dreams the last time I slept.

Notice that to see if the inconsistent nodes exist, we are merely checking their named existence in the parent node's "pseudo struct set," for lack of a better name. Then, to access all of these values, we are using the name-chain short-hand notation. In fact, we are using several short-hand notations together. Our line of text here:

    #xmlGirl.name#

... could be rewritten as such:

    #xmlGirl.name[ 1 ].XmlText#

... which, because name happens to be the first child of each girl node, could also be rewritten as such:

    #xmlGirl.XmlChildren[ 1 ].XmlText#

When you understand the short-hand notation that can be used, you can really find quick ways to check for node existence. Hope that helps.
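If the same optional nodes come up again and again, the existence check can be wrapped in a small helper. The following is only a sketch: getChildText() is a hypothetical user-defined function (not a built-in), and the sample XML is made up for illustration. It also includes the XmlChildPos() check from the question for comparison; my understanding is that it returns the child's position in XmlChildren when the node is found, so a GT 0 test works as a presence check.

```cfml
<!--- Hypothetical helper (not a built-in): text of first matching child, or a default. --->
<cffunction name="getChildText" returntype="string" output="false">
    <cfargument name="node" type="any" required="true" />
    <cfargument name="childName" type="string" required="true" />
    <cfargument name="defaultValue" type="string" required="false" default="" />

    <cfif StructKeyExists( arguments.node, arguments.childName )>
        <cfreturn Trim( arguments.node[ arguments.childName ][ 1 ].XmlText ) />
    </cfif>

    <cfreturn arguments.defaultValue />
</cffunction>

<!--- Made-up sample data: the second girl has no height node. --->
<cfxml variable="xmlSample">
    <girls>
        <girl><name>Hayden Panettiere</name><height>5'2"</height></girl>
        <girl><name>Marisa Miller</name></girl>
    </girls>
</cfxml>

<cfoutput>
<cfloop index="xmlGirl" array="#xmlSample.XmlRoot.XmlChildren#">
    Name: #getChildText( xmlGirl, "name" )#<br />
    Height: #getChildText( xmlGirl, "height", "unknown" )#<br />
    <!--- Same presence check, done with XmlChildPos() as in the question. --->
    <cfif XmlChildPos( xmlGirl, "height", 1 ) GT 0>
        (XmlChildPos() also found a height node.)<br />
    </cfif>
</cfloop>
</cfoutput>
```

Keeping the default value as an argument means the calling code decides what "missing" should look like, rather than sprinkling CFIF blocks through the output.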



Reader Comments

May 23, 2008 at 12:18 PM

This is exactly what I do when working with XML. Most of the times I use StructKeyExists(xml, node). We never know when someone is going to touch the xml, delete a node, break the application and make us the guilty ones :)


May 23, 2008 at 2:01 PM

Ben, using your example, how would you re-write this, a real-world example using the eBay.com api? Would you need to know the complete xml hierarchy in order for this to work? Sorry, I can't post full code directly to your blog...

cfhttp url="someEbayApiUrl" method="GET" result="xmlFeed"

cfset theArray = xmlSearch(xmlFeed, "//*[local-name() = 'Item']")

cfloop index="i" from="1" to="#arrayLen(theArray)#"

cfif xmlChildPos(theArray[i],"PostalCode", 1) GT 0
cfset postalcode = theArray[i].postalcode.xmlText
/cfif

/cfloop


May 25, 2008 at 10:27 AM

I know this isn't helpful in any way to the code above, but I just wanted to note that Heroes is shown on NBC, not FOX :)


May 25, 2008 at 11:44 AM

@Gareth,

Ooops :) Good catch.


May 26, 2008 at 2:32 PM

@Che,

I am not sure what you are asking. I would think that when you work with an API, you need to know the data structure that will be returned... otherwise, how can you possibly know what to do with it?


May 26, 2008 at 3:36 PM

I used to use
<cfif xmlsearch(xmlgirl,"/height") GT 0>
Height: #xmlGirl.height#<br />
</cfif>

Do you think StructKeyExists( xmlGirl, "height" ) is faster?


May 26, 2008 at 4:08 PM

@JusufD,

I would assume the StructKeyExists() would be faster, but I am sure both are fast enough.


May 28, 2008 at 6:02 AM

I'm not familiar with cold fusion. When I am dealing with inconsistent XML I use the XPath:

//cats/item[count > 0]


Nov 11, 2011 at 6:19 AM

Great post, as usual. If I have a CF related problem... usually I find a solution on your website!
Thanks Ben!

