Finding Shallow HTML Comment Nodes In The DOM Using TreeWalker
The other day, I started playing around with the TreeWalker API as a way to iterate over HTML comment nodes contained within a given DOM (Document Object Model) node. When I first started tinkering, it didn't look like there was any way to perform a "shallow" search (i.e., only look at child nodes). However, I now realize that if I widen my net of node types, I can perform a shallow search for comment nodes.
In my first approach, I was only looking at comment nodes. This made it difficult to constrain the search to the immediate children of the root node. Sure, I could use the filter function (acceptNode) to reject any comment node whose parent was not the root node; but, that would still require iterating over a deep search of the comments, which felt sub-optimal.
The key to a shallow search for comment nodes does require filtering; but, it also requires a looser search. Instead of just searching for comment nodes, I have to search for both comment nodes and element nodes. Then, I need to use the filter method to skip over any element nodes (and thereby prevent the TreeWalker from following deep tree branches).
As you're filtering nodes in the TreeWalker, there are actually two different forms of "skip":
- FILTER_SKIP - Value to be returned by NodeFilter.acceptNode() for nodes to be skipped by the NodeIterator or TreeWalker object. The children of skipped nodes are still considered. This is treated as "skip this node but not its children".
- FILTER_REJECT - Value to be returned by the NodeFilter.acceptNode() method when a node should be rejected. The children of rejected nodes are not visited by the NodeIterator or TreeWalker object; this value is treated as "skip this node and all its children".
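To make the difference concrete, here is a minimal sketch of a filter that accepts comments and either skips or rejects everything else. The makeCommentFilter name and the isShallow flag are made up for illustration. The numeric constants are the values the DOM standard assigns; in a browser you would normally read them from NodeFilter and Node instead of redefining them.

```javascript
// Values defined by the DOM standard. In a browser these are available as
// NodeFilter.FILTER_ACCEPT, NodeFilter.FILTER_REJECT, NodeFilter.FILTER_SKIP
// and Node.COMMENT_NODE.
const FILTER_ACCEPT = 1;
const FILTER_REJECT = 2;
const FILTER_SKIP = 3;
const COMMENT_NODE = 8;

// Hypothetical helper: builds a NodeFilter-style object for a TreeWalker.
function makeCommentFilter(isShallow) {
  return {
    acceptNode(node) {
      // Comments are always what we are looking for.
      if (node.nodeType === COMMENT_NODE) {
        return FILTER_ACCEPT;
      }
      // Anything else: REJECT prunes the whole subtree (shallow search),
      // SKIP ignores the node but still descends into its children (deep search).
      return isShallow ? FILTER_REJECT : FILTER_SKIP;
    }
  };
}
```

The only difference between the deep and shallow variants is which "skip" value gets returned for non-comment nodes.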
Notice that "Reject" will prevent the TreeWalker from going down into a given element node. We can use this to our benefit; if we start searching for comment nodes and element nodes, but reject all elements, it will keep the search shallow and will only find comments.
To see this in action, take a look at the following code:
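The listing is not reproduced here, so what follows is a minimal sketch of the approach described in this post rather than the original code. The findComments name and the isShallow parameter are assumptions for illustration; createTreeWalker(), nextNode(), and the bitmask and filter values are standard DOM.

```javascript
// Values defined by the DOM standard. In a browser these are available as
// NodeFilter.SHOW_ELEMENT, NodeFilter.SHOW_COMMENT, NodeFilter.FILTER_*
// and Node.COMMENT_NODE.
const SHOW_ELEMENT = 0x1;
const SHOW_COMMENT = 0x80;
const FILTER_ACCEPT = 1;
const FILTER_REJECT = 2;
const FILTER_SKIP = 3;
const COMMENT_NODE = 8;

// Hypothetical helper: collects the comment nodes under root. When isShallow
// is true, only comments that are direct children of root are returned.
function findComments(root, isShallow = false) {
  // If root is itself a Document, ownerDocument is null, so fall back to root.
  const doc = root.ownerDocument || root;

  // Widen the net: show both elements and comments to the filter.
  const walker = doc.createTreeWalker(root, SHOW_ELEMENT | SHOW_COMMENT, {
    acceptNode(node) {
      if (node.nodeType === COMMENT_NODE) {
        return FILTER_ACCEPT;
      }
      // Rejecting an element stops the walker from descending into it,
      // which is what keeps the search shallow.
      return isShallow ? FILTER_REJECT : FILTER_SKIP;
    }
  });

  const comments = [];
  let node;
  while ((node = walker.nextNode())) {
    comments.push(node);
  }
  return comments;
}

// Possible usage in a browser:
//   findComments(document.body);        // deep search
//   findComments(document.body, true);  // shallow search
```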
As you can see, we configure the TreeWalker to find both comment nodes and element nodes. By default, we accept all comment nodes and skip all element nodes; but, if we're doing a shallow search, we full-on "reject" element nodes to prevent a deep walk of the DOM.
So, performing a shallow search for comment nodes is possible with the TreeWalker. But, I'm not sure I would ever use it for this [shallow searching]. There's a lot of cruft here, a lot of logic, and the overhead of calling a function on each encountered node. And, when compared to simply grabbing the child nodes and plucking the comments (what jQuery.fn.comments() does), I'm not sure the TreeWalker represents a "win" in this context.
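For comparison, the "grab the child nodes and pluck the comments" alternative can be sketched in a few lines. This is an illustration of the idea only, not the actual source of the jQuery.fn.comments() plugin; the getChildComments name is made up.

```javascript
// Same value as Node.COMMENT_NODE in a browser.
const COMMENT_NODE = 8;

// Hypothetical helper: returns the comment nodes that are direct children of root.
function getChildComments(root) {
  // childNodes is a NodeList, so convert it to an array before filtering.
  return Array.from(root.childNodes).filter(
    (node) => node.nodeType === COMMENT_NODE
  );
}
```

There is no walker to configure and no skip-versus-reject decision to make, which is why this feels like the simpler choice for a purely shallow search.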
Now this is some great stuff .... Pure JS DOM traversal feels a bit cleaner than using a library... (a lot more boilerplate... but cool to do ...)
Moreover... TreeWalker is very fast... Here's a jsperf comparing jQuery's remove to a native implementation... The TreeWalker implementation is much faster...