Converting A Full CSS Selector To XPath Using ColdFusion
Posted March 13, 2009 at 6:51 PM by Ben Nadel
Now that we have a ColdFusion user defined function (UDF) that converts a single-element CSS selector to XPath, we can build on that foundation to convert a full CSS selector to XPath. Really, this is a rather small jump; all we have to do is handle the element delimiters, and our previous UDF will take care of the heavy lifting. When it comes to descendant selection in CSS, I am only going to support two different kinds at this time:
- space = Any descendant selector
- > = Direct descendant selector (child)
I know that CSS can handle more than that (depending on the browser), but since we are keeping things simple for now, I am only going to think about these two common types. In terms of XPath syntax, these two relationships are quite easy to map:
- space ==> // (any descendant)
- > ==> / (direct descendant)
OK, so, keeping in mind that I have already defined the CSSElementSelectorToXPath() UDF, I am now defining the CSSSelectorToXPath() UDF, which builds on top of it to convert a full CSS selector to an XPath selector:
<cffunction
	name="CSSSelectorToXPath"
	access="public"
	returntype="string"
	output="false"
	hint="I convert a full CSS selector to XPath (ex. div.header p span).">

	<!--- Define arguments. --->
	<cfargument
		name="Selector"
		type="string"
		required="true"
		hint="I am the full CSS selector."
		/>

	<!--- Define the local scope. --->
	<cfset var LOCAL = {} />

	<!--- Remove all extra white space. --->
	<cfset LOCAL.Selector = Trim(
		REReplace(
			ARGUMENTS.Selector,
			"\s+",
			" ",
			"all"
			)
		) />

	<!---
		We are going to handle three different kinds of selection
		delimiters:

		[ ] = descendant
		[>] = child
		[,] = OR'ing two full selectors together.

		Because we have three delimiters that mean different
		things, we cannot treat this as a list. Rather, what we
		need to do is capture all elements of the (cleaned up)
		selector.
	--->
	<cfset LOCAL.SelectorParts = REMatch(
		"(\s*>\s*)|(\s*,\s*)|(\s+)|([^\s,>]+)",
		LOCAL.Selector
		) />


	<!--- Create an array of XPath selection parts. --->
	<cfset LOCAL.XPathParts = [] />

	<!---
		Start off by adding an "anywhere" selector to the
		XPath parts. This is because our CSS selector might
		match anywhere within the XHTML document.
	--->
	<cfset LOCAL.XPathParts[ 1 ] = "//" />


	<!---
		Now, let's loop over the parts of the CSS selector and
		convert those to their XPath equivalent.
	--->
	<cfloop
		index="LOCAL.SelectorPart"
		array="#LOCAL.SelectorParts#">

		<!--- Trim this selection part. --->
		<cfset LOCAL.SelectorPart = Trim( LOCAL.SelectorPart ) />

		<!---
			Check to see if we have a direct descendant
			delimiter. If so, we simply need to add a slash
			to the XPath parts.
		--->
		<cfif (LOCAL.SelectorPart EQ ">")>

			<!--- Add child tag XPath selector. --->
			<cfset ArrayAppend(
				LOCAL.XPathParts,
				"/"
				) />

		<cfelseif (LOCAL.SelectorPart EQ "")>

			<!---
				Add descendant XPath selector. A white space
				delimiter is an empty string once trimmed.
			--->
			<cfset ArrayAppend(
				LOCAL.XPathParts,
				"//"
				) />

		<cfelseif (LOCAL.SelectorPart EQ ",")>

			<!---
				Add OR XPath selector. Because we are beginning a
				new selector, prepend the "anywhere" selector.
			--->
			<cfset ArrayAppend(
				LOCAL.XPathParts,
				"|//"
				) />

		<cfelse>

			<!---
				We have an actual element selector. Convert
				this to XPath syntax and add it to the XPath
				parts array.
			--->
			<cfset ArrayAppend(
				LOCAL.XPathParts,
				CSSElementSelectorToXPath( LOCAL.SelectorPart )
				) />

		</cfif>

	</cfloop>


	<!---
		Now that we have our XPath parts array, all we need to
		do is join it to form our full XPath selection query.
	--->
	<cfreturn ArrayToList( LOCAL.XPathParts, "" ) />
</cffunction>
As you can see, there is not much going on here: we are basically replacing the delimiters using the above rules and passing off the element translation to our previous UDF. Because CSS selectors don't have an initial context, I am prepending "//" to the final XPath selection. This allows our XPath selection to make its first match anywhere within the given XHTML document.
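To make the tokenizing step a bit more concrete, here is a small sketch that runs the same REMatch() pattern used in the UDF against a sample selector and prints each captured token in brackets. The variable names (selector, tokens, token) and the sample selector are made up for illustration; only the pattern itself comes from the function above.

```cfml
<!--- Made-up sample selector that uses all three delimiters. --->
<cfset selector = "div p.stanza > strong, ul li" />

<!--- Same pattern that CSSSelectorToXPath() uses. --->
<cfset tokens = REMatch(
	"(\s*>\s*)|(\s*,\s*)|(\s+)|([^\s,>]+)",
	selector
	) />

<!--- Print each token in brackets so the white space is visible. --->
<cfoutput>
	<cfloop index="token" array="#tokens#">
		[#HtmlEditFormat( token )#]<br />
	</cfloop>
</cfoutput>
```

The delimiters come back with their surrounding white space still attached (for example " > " and ", "), which is why the UDF trims each part before comparing it. After trimming, a plain space delimiter becomes an empty string, and that is what the EQ "" branch is checking for.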
To test this, I set up the following code:
<cfoutput>

	div<br />
	#CSSSelectorToXPath( "div" )#<br />
	<br />

	div p<br />
	#CSSSelectorToXPath( "div p" )#<br />
	<br />

	div p strong<br />
	#CSSSelectorToXPath( "div p strong" )#<br />
	<br />

	##data-form label<br />
	#CSSSelectorToXPath( "##data-form label" )#<br />
	<br />

	div > p<br />
	#CSSSelectorToXPath( "div > p" )#<br />
	<br />

	div p.stanza > strong<br />
	#CSSSelectorToXPath( "div p.stanza > strong" )#<br />

</cfoutput>
And, when we run the above test code, we get the following output:
div
//div
div p
//div//p
div p strong
//div//p//strong
#data-form label
//*[ @id = "data-form" ]//label
div > p
//div/p
div p.stanza > strong
//div//p[ contains( @class, "stanza" ) ]/strong
The full CSS selectors are getting converted to proper XPath syntax. So far, so good. Now, on to the next step.
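As a quick sanity check on that output, one option is to feed one of the generated XPath strings to ColdFusion's XmlSearch(). This is only a sketch under a few assumptions: the sample markup and the variable names (doc, xpath, nodes) are made up, and the markup deliberately has no default XHTML namespace, since a namespaced document would need namespace-aware XPath.

```cfml
<!--- Made-up sample markup (no default namespace). --->
<cfxml variable="doc">
	<html>
		<body>
			<div>
				<p class="stanza">Roses are <strong>red</strong></p>
				<p>Violets are <strong>blue</strong></p>
			</div>
		</body>
	</html>
</cfxml>

<!--- XPath generated above for "div p.stanza > strong". --->
<cfset xpath = '//div//p[ contains( @class, "stanza" ) ]/strong' />

<!--- XmlSearch() returns an array of the matching XML nodes. --->
<cfset nodes = XmlSearch( doc, xpath ) />

<cfoutput>
	Matches: #ArrayLen( nodes )#<br />
	First match: #nodes[ 1 ].XmlText#
</cfoutput>
```

With this sample markup, only the strong element inside the "stanza" paragraph should come back, since the second paragraph has no class attribute at all.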
Reader Comments
I would be remiss in my duties if I didn't point out that you should talk like a pirate and use → instead of ==> .
@Rick,
Ha ha, I actually know what you're referring to :)
NOTE: I have updated the UDF above to handle the "or" selector (,).



