Converting A Full CSS Selector To XPath Using ColdFusion
Now that we have a ColdFusion user-defined function (UDF) that converts a single-element CSS selector to XPath, we can build on that foundation to convert a full CSS selector to XPath. Really, this is a rather small jump; all we have to do is handle the element delimiters, and our previous UDF will take care of the heavy lifting. When it comes to descendant selection in CSS, I am only going to support two different kinds at this time:
- space = Any descendant selector
- > = Direct descendant selector (child)
I know that CSS can handle more than that (depending on the browser), but since we are keeping things simple for now, I am only going to think about these two common types. In terms of XPath syntax, these two relationships are quite easy to map:
- space ==> // (any descendant)
- > ==> / (direct descendant)
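To see the difference between these two XPath relationships in practice, here is a minimal sketch that runs both forms through ColdFusion's XmlSearch() against a small, made-up XML fragment. The markup and the variable names (doc, anyDescendant, directChild) are assumptions for illustration only; they are not part of the UDF.

```cfml
<!--- Build a tiny sample document (hypothetical markup). --->
<cfxml variable="doc"><div>
	<p>Direct child of div</p>
	<span><p>Nested one level deeper</p></span>
</div></cfxml>

<!--- "div p" maps to //div//p: any p below a div, at any depth. --->
<cfset anyDescendant = XmlSearch( doc, "//div//p" ) />

<!--- "div > p" maps to //div/p: only p elements that are children of a div. --->
<cfset directChild = XmlSearch( doc, "//div/p" ) />

<cfoutput>
	//div//p matched #ArrayLen( anyDescendant )# node(s)<br />
	//div/p matched #ArrayLen( directChild )# node(s)<br />
</cfoutput>
```

With this sample, the first query should match both paragraphs, while the second should match only the paragraph sitting directly under the div.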
OK, keeping in mind that I have already defined the CSSElementSelectorToXPath() UDF, I am now defining CSSSelectorToXPath(), which builds on top of it to convert a full CSS selector to an XPath selector:
<cffunction
	name="CSSSelectorToXPath"
	access="public"
	returntype="string"
	output="false"
	hint="I convert a full CSS selector to XPath (ex. div.header p span).">

	<!--- Define arguments. --->
	<cfargument
		name="Selector"
		type="string"
		required="true"
		hint="I am the full CSS selector."
		/>

	<!--- Define the local scope. --->
	<cfset var LOCAL = {} />

	<!--- Collapse runs of white space and trim the ends. --->
	<cfset LOCAL.Selector = Trim(
		REReplace(
			ARGUMENTS.Selector,
			"\s+",
			" ",
			"all"
			)
		) />

	<!---
		We are going to handle three different kinds of selection
		delimiters:
		[ ] = descendant
		[>] = child
		[,] = OR'ing two full selectors together.
		Because we have three delimiters that mean different
		things, we cannot treat this as a list. Rather, what we
		need to do is capture all elements of the selector. Match
		against the cleaned selector so that leading or trailing
		white space is not mistaken for a descendant delimiter.
	--->
	<cfset LOCAL.SelectorParts = REMatch(
		"(\s*>\s*)|(\s*,\s*)|(\s+)|([^\s,>]+)",
		LOCAL.Selector
		) />

	<!--- Create an array of XPath selection parts. --->
	<cfset LOCAL.XPathParts = [] />

	<!---
		Start off by adding an "anywhere" selector to the
		XPath parts. This is because our CSS selector might
		match anywhere within the XHTML document.
	--->
	<cfset LOCAL.XPathParts[ 1 ] = "//" />

	<!---
		Now, let's loop over the parts of the CSS selector and
		convert those to their XPath equivalent.
	--->
	<cfloop
		index="LOCAL.SelectorPart"
		array="#LOCAL.SelectorParts#">

		<!--- Trim this selection part. --->
		<cfset LOCAL.SelectorPart = Trim( LOCAL.SelectorPart ) />

		<!---
			Check to see if we have a direct descendant
			delimiter. If so, we simply need to add a slash
			to the XPath parts.
		--->
		<cfif (LOCAL.SelectorPart EQ ">")>

			<!--- Add child tag XPath selector. --->
			<cfset ArrayAppend(
				LOCAL.XPathParts,
				"/"
				) />

		<cfelseif (LOCAL.SelectorPart EQ "")>

			<!---
				A part that trims down to nothing was pure white
				space. Add descendant XPath selector.
			--->
			<cfset ArrayAppend(
				LOCAL.XPathParts,
				"//"
				) />

		<cfelseif (LOCAL.SelectorPart EQ ",")>

			<!---
				Add OR XPath selector. Because we are beginning a
				new selector, prepend the "anywhere" selector.
			--->
			<cfset ArrayAppend(
				LOCAL.XPathParts,
				"|//"
				) />

		<cfelse>

			<!---
				We have an actual element selector. Convert
				this to XPath syntax and add it to the XPath
				parts array.
			--->
			<cfset ArrayAppend(
				LOCAL.XPathParts,
				CSSElementSelectorToXPath( LOCAL.SelectorPart )
				) />

		</cfif>

	</cfloop>

	<!---
		Now that we have our XPath parts array, all we need to
		do is join it to form our full XPath selection query.
	--->
	<cfreturn ArrayToList( LOCAL.XPathParts, "" ) />
</cffunction>
As you can see, there is not much going on here: we are basically replacing the delimiters using the above rules and passing off the element translation to our previous UDF. Because CSS selectors don't have an initial context, I am prepending "//" to the final XPath selection. This allows our XPath selection to make its first match anywhere within the given XHTML document.
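The tokenizing step is the part that is easiest to misread, so here is a small standalone sketch of what the REMatch() call does with one of the selectors used in this post. The pattern is copied from the UDF; the variable names (pattern, tokens, token) are just for this illustration.

```cfml
<!--- Same pattern as in the UDF: child, OR, descendant, or element part. --->
<cfset pattern = "(\s*>\s*)|(\s*,\s*)|(\s+)|([^\s,>]+)" />

<!--- Break a sample selector into its delimiter and element tokens. --->
<cfset tokens = REMatch( pattern, "div p.stanza > strong" ) />

<!--- Wrap each token in brackets so white-space-only tokens stay visible. --->
<cfoutput>
	<cfloop index="token" array="#tokens#">
		[#token#]<br />
	</cfloop>
</cfoutput>
```

The selector should come back as five tokens: the three element parts, a white-space-only token between "div" and "p.stanza", and a token holding the ">" with its surrounding spaces. Once the loop in the UDF trims each token, the white-space token becomes an empty string, which is why the descendant branch compares against "".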
To test this, I set up the following code:
<cfoutput>
	div<br />
	#CSSSelectorToXPath( "div" )#<br />
	<br />
	div p<br />
	#CSSSelectorToXPath( "div p" )#<br />
	<br />
	div p strong<br />
	#CSSSelectorToXPath( "div p strong" )#<br />
	<br />
	##data-form label<br />
	#CSSSelectorToXPath( "##data-form label" )#<br />
	<br />
	div > p<br />
	#CSSSelectorToXPath( "div > p" )#<br />
	<br />
	div p.stanza > strong<br />
	#CSSSelectorToXPath( "div p.stanza > strong" )#<br />
</cfoutput>
And, when we run the above test code, we get the following output:
div
//div

div p
//div//p

div p strong
//div//p//strong

#data-form label
//*[ @id = "data-form" ) ]//label

div > p
//div/p

div p.stanza > strong
//div//p[ contains( @class, "stanza" ) ]/strong
The full CSS selectors are getting converted to proper XPath syntax. So far so good; now on to the next step.
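The tests above do not exercise the comma delimiter. Tracing the loop by hand, a selector such as "h1, h2" should come out as "//h1|//h2", since each comma is replaced with "|//". As a rough sketch of how that union behaves, the snippet below passes that XPath string directly to XmlSearch(); the sample markup and variable names (doc, headings, node) are assumptions for illustration.

```cfml
<!--- Hypothetical sample document. --->
<cfxml variable="doc"><body>
	<h1>Title</h1>
	<h2>Subtitle</h2>
	<p>Body copy</p>
</body></cfxml>

<!--- The | operator unions the node sets of the two paths. --->
<cfset headings = XmlSearch( doc, "//h1|//h2" ) />

<cfoutput>
	<cfloop index="node" array="#headings#">
		#node.XmlName#: #node.XmlText#<br />
	</cfloop>
</cfoutput>
```

Here the union should return the h1 and h2 elements and leave the paragraph out.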
Reader Comments
I would be remiss in my duties if I didn't point out that you should talk like a pirate and use → instead of ==> .
@Rick,
Ha ha, I actually know what you're referring to :)
NOTE: I have updated the UDF above to handle the "or" selector (,).