Earlier today, I posted about how ColdFusion custom tags executed much faster than XML parsing. To make the example more general, I was using XmlSearch() with XPath to get at the XML nodes. I did this because that way, the nature of the XML document could be more variable, just like the nature of ColdFusion custom tags. Tony Petruzzi suggested that removing the XmlSearch() would help a bit. I assumed it would, but at lunch (just now) decided to give it a go.
Here is the updated tag that parses the XML and creates a comma separated values (CSV) file. Notice that rather than using XmlSearch(), I am using the pseudo-array that ColdFusion makes available in XML documents when you refer to XML nodes by tag name:
<!--- Check to see which tag mode we are executing. --->
<cfswitch expression="#THISTAG.ExecutionMode#">

    <cfcase value="Start">

        <!--- Set the path to our output file. --->
        <cfset THISTAG.FilePath = ExpandPath( "xml_data2.csv" ) />

    </cfcase>

    <cfcase value="End">

        <!--- Parse the XML that was generated in this tag. --->
        <cfset THISTAG.XmlData = XmlParse( Trim( THISTAG.GeneratedContent ) ) />

        <!---
            Create a string buffer to hold intermediary data so we
            don't have to write to the file just yet.
        --->
        <cfset THISTAG.Buffer = CreateObject( "java", "java.lang.StringBuffer" ).Init() />

        <!---
            Loop over rows using the pseudo-array that ColdFusion
            provides when referencing XML nodes by name.
        --->
        <cfloop index="THISTAG.RowIndex" from="1" to="#ArrayLen( THISTAG.XmlData.data.row )#" step="1">

            <!--- Get a reference to the current row. --->
            <cfset THISTAG.XmlRow = THISTAG.XmlData.data.row[ THISTAG.RowIndex ] />

            <!---
                Loop over values using the pseudo-array that ColdFusion
                provides when referencing XML nodes by name.
            --->
            <cfloop index="THISTAG.ValueIndex" from="1" to="#ArrayLen( THISTAG.XmlRow.value )#" step="1">

                <!--- Get a reference to the current value. --->
                <cfset THISTAG.XmlValue = THISTAG.XmlRow.value[ THISTAG.ValueIndex ] />

                <!---
                    Add value to string buffer. Add a tab after each
                    value (this will leave a tab at the end of every
                    line, but I am worried about speed, not extra
                    characters).
                --->
                <cfset THISTAG.Buffer.Append( JavaCast( "string", ( THISTAG.XmlValue.XmlText & Chr( 9 ) ) ) ) />

            </cfloop>

            <!--- Now that we added the values, add new line. --->
            <cfset THISTAG.Buffer.Append( JavaCast( "string", ( Chr( 13 ) & Chr( 10 ) ) ) ) />

        </cfloop>

        <!---
            Our string buffer should contain our CSV data. Now,
            let's write that to the output file.
        --->
        <cffile action="write" file="#THISTAG.FilePath#" output="#THISTAG.Buffer.ToString()#" />

        <!--- Reset the content. --->
        <cfset THISTAG.GeneratedContent = "" />

    </cfcase>

</cfswitch>
The previous version of this used to run at just over 13 seconds. This new version that uses pseudo-xml-arrays runs in about 800 milliseconds!
When I first saw this result, I just assumed something was going wrong. I renamed the CSV file (xml_data2.csv) and ran it again. But sure enough, it ran in a little over 700 milliseconds and the new file (xml_data2.csv) contained all 1,000 rows of data.
Holy Cow! As it turns out, XML parsing blows the pants off of ColdFusion custom tags when it comes to performance. Obviously, there is going to be an eventual tradeoff, since the whole XML document has to be parsed and held in memory, but for 1,000 rows, this was INSANELY fast. Two things:
I am shocked at how slow XmlSearch() is! This is good information to know. It was the XmlSearch() alone that added roughly 12 seconds to the processing time in the previous example.
I am a little surprised at how slow ColdFusion custom tags seem to be, comparatively. Over 5 seconds to do what XML parsing did in milliseconds? That's kind of whack.
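To make the XmlSearch() point concrete, here is a minimal sketch that times the two ways of getting at the same nodes. It is not the code from my earlier post: the sample XML, the variable names, and the per-row XPath expression are all made up for illustration, and the timings will depend on your server and on how much data you feed it.

```cfm
<!--- Hypothetical sample data; the test in this post used 1,000 rows. --->
<cfsavecontent variable="rawXml">
<data>
    <row><value>A1</value><value>B1</value></row>
    <row><value>A2</value><value>B2</value></row>
</data>
</cfsavecontent>

<cfset xmlData = XmlParse( Trim( rawXml ) ) />

<!--- Approach 1: run an XPath query for every row using XmlSearch(). --->
<cfset startTime = GetTickCount() />
<cfset rowCount = ArrayLen( XmlSearch( xmlData, "/data/row" ) ) />
<cfloop index="rowIndex" from="1" to="#rowCount#">
    <cfset values = XmlSearch( xmlData, "/data/row[#rowIndex#]/value" ) />
</cfloop>
<cfset searchTime = ( GetTickCount() - startTime ) />

<!--- Approach 2: use the pseudo-array access by tag name. --->
<cfset startTime = GetTickCount() />
<cfloop index="rowIndex" from="1" to="#ArrayLen( xmlData.data.row )#">
    <cfset values = xmlData.data.row[ rowIndex ].value />
</cfloop>
<cfset arrayTime = ( GetTickCount() - startTime ) />

<!--- Output the elapsed time (in milliseconds) for each approach. --->
<cfoutput>
    XmlSearch(): #searchTime# ms<br />
    Pseudo-array: #arrayTime# ms
</cfoutput>
```

With only two rows, both numbers will likely be too small to mean anything; the gap should only show up once the document has enough rows for the repeated XPath evaluation to add up.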
So anyway, sorry for misleading people in my last post. This makes me want to try an experiment where I recode my POI stuff using XML parsing rather than custom tags. I wonder if that would make it wicked fast.