Ask Ben: Handling Optional XML Nodes In WhitePages.com API Response

By Ben Nadel

Published 2008-12-29 in Ask Ben, ColdFusion — Comments (4)

I'm a CF / Flex developer who has been reading some of your XML entries. Thanks, in advance, for always posting great solutions. I have one for you that is driving me crazy, and it's probably minor. I'm developing an app using the whitepages.com api. If I want to collect specific data in each returned record, it breaks in my loop b/c the XML is missing a node. Example:

If I search a name and it returns 3 records, the first and last have phonenumbers, but the 2nd one doesn't and throws an error. How can I test if the XMLName exists, which is wp:phonenumbers, when looping through results?

<cfset requestURL = XMLparse(URL toget XML)> gets the results in XML format.
<cfset numOfLoops = XmlSearch(requestURL,"//@wp:totalavailable") />
<cfloop from="1" to="#numOfLoops[1].XMLvalue#" index="i">
<cfoutput>
#requestURL.wp.listings.listing[i].phonenumbers#<br /><br />
</cfoutput>
</cfloop>

Right now, you are dealing with the meta data about the results. By that, I mean you are using the "totalavailable" attribute to tell you how to loop over the results. This is a fine method; but, I prefer to let the physical XML document determine how the output is used. I am not saying that this method is any different - I am just saying I am more comfortable trusting the structure rather than the attributes.

First, I am going to build an abridged, fake WhitePages.com API result set based on their documentation:

<!--- Build a fake WhitePages.com API response. --->
<cfxml variable="xmlWhitePages">

	<?xml version="1.0" encoding="UTF-8" ?>
	<wp:wp xmlns:wp="http://api.whitepages.com/schema/">
		<wp:result wp:type="success" />
		<wp:meta />
		<wp:listings>

			<!--- Listing One. --->
			<wp:listing>
				<wp:people>
					<wp:person wp:rank="primary">
						<wp:firstname>Mike</wp:firstname>
						<wp:lastname>Smith</wp:lastname>
					</wp:person>
				</wp:people>
				<wp:phonenumbers>
					<wp:phone wp:type="home" wp:rank="primary">
						<wp:fullphone>(206) 973-0000</wp:fullphone>
						<wp:areacode>206</wp:areacode>
						<wp:exchange>973</wp:exchange>
						<wp:linenumber>0000</wp:linenumber>
					</wp:phone>
				</wp:phonenumbers>
				<wp:address />
				<wp:geodata />
				<wp:listingmeta />
			</wp:listing>

			<!--- Listing Two. --->
			<wp:listing>
				<wp:people>
					<wp:person wp:rank="primary">
						<wp:firstname>Sarah</wp:firstname>
						<wp:lastname>Vivenzio</wp:lastname>
					</wp:person>
				</wp:people>
				<wp:address />
				<wp:geodata />
				<wp:listingmeta />
			</wp:listing>

			<!--- Listing Three. --->
			<wp:listing>
				<wp:people>
					<wp:person wp:rank="primary">
						<wp:firstname>Julia</wp:firstname>
						<wp:lastname>Niles</wp:lastname>
					</wp:person>
				</wp:people>
				<wp:phonenumbers>
					<wp:phone wp:type="home" wp:rank="primary">
						<wp:fullphone>(555) 123-0300</wp:fullphone>
						<wp:areacode>555</wp:areacode>
						<wp:exchange>123</wp:exchange>
						<wp:linenumber>0300</wp:linenumber>
					</wp:phone>
				</wp:phonenumbers>
				<wp:address />
				<wp:geodata />
				<wp:listingmeta />
			</wp:listing>
		</wp:listings>
	</wp:wp>

</cfxml>

In this XML document, as in yours, the second wp:listing node does not contain the wp:phonenumbers data node.

Now, when it comes to unpredictable XML documents, there are two ways to handle the variation. One way is to treat the ColdFusion XML document as a collection of name-based pseudo-sets. The other is to treat it as a queryable document. I prefer the latter, but we will look at the former first.

In order to make XML documents a little bit more accessible, ColdFusion collects like-named XML siblings into pseudo-sets of nodes. This allows us to refer to XML documents as if they were collections of nested Structs. Because of this, we can use the struct method StructKeyExists() to see if a given pseudo-node set exists as the child of a given XML node:

<!---
	Query for all the listing nodes (the people who have been
	returned in our WhitePages.com API search).
--->
<cfset arrListingNodes = XmlSearch(
	xmlWhitePages,
	"//wp:listing"
	) />


<!---
	Loop over our listing nodes and output the name and phone
	numbers for each record.
--->
<cfoutput>

	<cfloop
		index="xmlListing"
		array="#arrListingNodes#">

		<!--- Output name (always exists). --->
		#xmlListing[ "wp:people" ][ "wp:person" ][ "wp:firstname" ].XmlText#
		#xmlListing[ "wp:people" ][ "wp:person" ][ "wp:lastname" ].XmlText#
		<br />

		<!--- Check to see if there is a phone numbers node. --->
		<cfif StructKeyExists( xmlListing, "wp:phonenumbers" )>

			#xmlListing[ "wp:phonenumbers" ][ "wp:phone" ][ "wp:fullphone" ].XmlText#

		<cfelse>

			<em>No number available</em>

		</cfif>

		<br />
		<br />

	</cfloop>

</cfoutput>

Notice that once we have our collection of listing nodes, we are using struct/array name-based notation to navigate down to the nodes we want to access. Then, just as we would with a standard ColdFusion struct, we test for the existence of the wp:phonenumbers node by seeing if the pseudo-node set key exists in the parent (the current listing node). If the node exists, we output its text value.

When we run the above code, we get the following output:

Mike Smith
(206) 973-0000

Sarah Vivenzio
No number available

Julia Niles
(555) 123-0300

This works just fine. But, for some reason, I feel more comfortable viewing a ColdFusion XML document as a queryable document rather than as a collection of nested structures. That is not to say that I totally disregard the above method; rather, I combine it with a more XmlSearch()-based approach, using XmlSearch() to locate optional child nodes. I find the code to be longer, but a bit more manageable:

<!---
	Query for all the listing nodes (the people who have been
	returned in our WhitePages.com API search).
--->
<cfset arrListingNodes = XmlSearch(
	xmlWhitePages,
	"//wp:listing"
	) />


<!---
	Loop over our listing nodes and output the name and phone
	numbers for each record.
--->
<cfoutput>

	<cfloop
		index="xmlListing"
		array="#arrListingNodes#">

		<!---
			Now that we are in the context of each listing node
			(as defined by the xmlListing reference), we are going
			to perform node-relative XML searches to find the points
			of data that we want to output.
		--->

		<!--- Search for the person node. --->
		<cfset arrPersonNodes = XmlSearch(
			xmlListing,
			"./wp:people/wp:person"
			) />

		<!--- Search for the full phone number node. --->
		<cfset arrPhoneNumberNodes = XmlSearch(
			xmlListing,
			"./wp:phonenumbers/wp:phone/wp:fullphone"
			) />


		<!--- Check to see if a name was found. --->
		<cfif ArrayLen( arrPersonNodes )>

			<!--- Output first and last names. --->
			#arrPersonNodes[ 1 ][ "wp:firstname" ].XmlText#
			#arrPersonNodes[ 1 ][ "wp:lastname" ].XmlText#
			<br />

		</cfif>


		<!--- Check to see if any phone numbers were found. --->
		<cfif ArrayLen( arrPhoneNumberNodes )>

			#arrPhoneNumberNodes[ 1 ].XmlText#

		<cfelse>

			<em>No number available</em>

		</cfif>

		<br />
		<br />

	</cfloop>

</cfoutput>

When we run this code, we get the exact same output. Again, I am not saying that this method is better than the first; it just so happens that I am personally a little more comfortable with it. Something about using XmlSearch() on a ColdFusion XML document makes me a little bit more comfortable than using StructKeyExists().

Now, notice that in all of my XML output, I am explicitly using the XmlText property of the referenced XML nodes:

arrPhoneNumberNodes[ 1 ].XmlText

... rather than:

arrPhoneNumberNodes[ 1 ]

From the user's point of view, it won't make much of a difference - both output the text of the XML node. However, when you output the XML node without referencing its XmlText property, what you are actually doing is asking ColdFusion to automagically convert the given XML node to a string:

ToString( arrPhoneNumberNodes[ 1 ] )

When you do this, ColdFusion actually returns the following:

<?xml version="1.0" encoding="UTF-8"?>
<wp:fullphone xmlns:wp="http://api.whitepages.com/schema/"
>(555) 123-0300</wp:fullphone>

As you can see, it doesn't just output the node value - it takes the node, treats it as a complete XML document, and generates the string-based equivalent. And, since you certainly don't want that hanging out in your source code, make sure to use the .XmlText property whenever outputting data.
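
To see the difference for yourself, a quick sketch like the following should do it. It assumes the xmlWhitePages document from earlier in the post; HtmlEditFormat() is used here only so that the browser displays the serialized markup instead of swallowing it.

<!--- Grab the full phone number nodes in the document. --->
<cfset arrPhoneNumberNodes = XmlSearch(
	xmlWhitePages,
	"//wp:fullphone"
	) />

<cfoutput>

	<!--- Just the text value of the node. --->
	#HtmlEditFormat( arrPhoneNumberNodes[ 1 ].XmlText )#
	<br />

	<!--- The node serialized as a complete XML document. --->
	#HtmlEditFormat( ToString( arrPhoneNumberNodes[ 1 ] ) )#

</cfoutput>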


Short link: https://bennadel.com/1443

Reader Comments

Eric Dec 29, 2008 at 12:15 PM

While looping, you could also use the xmlChildPos function. i.e.
<cfif xmlChildPos(someArray[i],"someNode", 1) GT 0></cfif>

Ben Nadel Dec 29, 2008 at 12:24 PM

@CV,

Good point. I don't believe I have ever used the XmlChildPos() method. I looked at it once when I was trying to get the sibling index based on a given node reference; but it turns out, that's not what it does at all :)

Mark Hosny Dec 29, 2008 at 2:06 PM

Hey Ben,

Great job and THANKS! I was using XMLSearch, which I prefer, but my problem was not using array option in cfloop; that's what was giving me the error.

I also agree with using the XML structure instead of the meta data, especially since some nodes do not even show up as we saw in this scenario.

Ben Nadel Dec 29, 2008 at 2:13 PM

@Mark,

Glad to help, my man. And, please don't get me wrong - no one technique is any better than another. My strategy will definitely change depending on the source of the XML data. If it is something that I create and use internally and it's small, I will definitely use the named pseudo-sets.

Just trying to demonstrate different options.
