Ben Nadel at cf.Objective() 2013 (Bloomington, MN) with: Nathan Deneau

Finding HTML Comment Nodes In The DOM Using TreeWalker


The other day, I blogged about my jQuery plugin that facilitates searching the DOM (Document Object Model) for Comment nodes based on pseudo attributes. While I was writing that plugin, I came across a relatively new JavaScript API, the TreeWalker. The TreeWalker provides an iteration interface for easily traversing specific components of the DOM tree. Given the fact that comments cannot be found with normal CSS selectors, I wanted to see if I could use the TreeWalker as a comment-query optimization.

To be clear, the TreeWalker can iterate over any kind of (or combination of) DOM nodes. But, Comment nodes seem like the most interesting topic since they cannot be easily targeted through other means (hence my jQuery plugin).
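As a rough sketch of what "combination" means here: the second argument to createTreeWalker() is a bitmask, so node types can be combined with a bitwise OR. The snippet below copies three of the standard NodeFilter constants so that it can run outside of a browser. The matchesWhatToShow() helper is a hypothetical name, not part of the DOM API; it only illustrates how the mask relates to a node's nodeType.

```javascript
// Standard NodeFilter "whatToShow" values, copied here so the sketch can run
// outside of a browser (in a browser, use NodeFilter.SHOW_TEXT, etc.).
var SHOW_ELEMENT = 0x1;
var SHOW_TEXT = 0x4;
var SHOW_COMMENT = 0x80;

// Combine node types with a bitwise OR: comments AND text nodes.
var whatToShow = ( SHOW_COMMENT | SHOW_TEXT );

// Hypothetical helper (not part of the DOM API): each node type maps to the
// bit at position ( nodeType - 1 ) in the whatToShow mask.
function matchesWhatToShow( nodeType, mask ) {

	return( ( mask & ( 1 << ( nodeType - 1 ) ) ) !== 0 );

}

var showsComments = matchesWhatToShow( 8, whatToShow ); // 8 = COMMENT_NODE
var showsText = matchesWhatToShow( 3, whatToShow ); // 3 = TEXT_NODE
var showsElements = matchesWhatToShow( 1, whatToShow ); // 1 = ELEMENT_NODE
```

In a browser, the same mask would be passed as the second argument to document.createTreeWalker() in order to visit both comments and text nodes in a single traversal.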

When you instantiate a TreeWalker, you give it the root node (within which to search) and the type of nodes that you want to find - in this case, comments. You can also pass in an optional filter that will allow you to skip (or reject) nodes in the iteration process. There is some cross-browser funkiness in the way the filter is defined; but, it's rather easy to overcome.

IE (Internet Explorer) expects a filter function to be defined. Other browsers expect an object that contains an "acceptNode" method. To overcome this, we can simply define the filter function as a member of itself. Remember, in JavaScript, Functions are just objects that can be invoked. And, like all objects in JavaScript, Functions can have properties.

So, for cross browser support, we just define a reference to the filter function on itself:

filter.acceptNode = filter;

In this way, "filter" is both an object and a function.
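To make the "both an object and a function" idea concrete, here is a small sketch that can run without a browser. The plain object standing in for a comment node and the two local constants are assumptions for illustration; the constants mirror the values of NodeFilter.FILTER_ACCEPT (1) and NodeFilter.FILTER_SKIP (3).

```javascript
// Local copies of the NodeFilter result values so that the sketch runs
// outside of a browser.
var FILTER_ACCEPT = 1;
var FILTER_SKIP = 3;

// Accept every node except the one with a specific value.
function filter( node ) {

	if ( node.nodeValue === " Load scripts. " ) {

		return( FILTER_SKIP );

	}

	return( FILTER_ACCEPT );

}

// Make the function a property of itself.
filter.acceptNode = filter;

// Stand-in for a comment node (a plain object, not a real DOM node).
var fakeComment = { nodeType: 8, nodeValue: " Load scripts. " };

// Style 1: the caller invokes the filter directly, as a function.
var calledAsFunction = filter( fakeComment );

// Style 2: the caller treats the filter as an object with acceptNode().
var calledAsObject = filter.acceptNode( fakeComment );
```

Both calls run the same code and return the same value, so it doesn't matter which calling convention a given browser uses.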

To see this in action, I'm going to search the DOM for comments; then, for each comment that I find, I'm going to insert a Paragraph element with the same text value. Notice that I'm skipping the last comment in the DOM:

<!doctype html>
<html>
<head>
	<meta charset="utf-8" />

	<title>
		Finding Comment Nodes In The DOM Using TreeWalker
	</title>
</head>
<body>

	<h1>
		Finding Comment Nodes In The DOM Using TreeWalker
	</h1>

	<!-- Comment 1: In the Body. -->

	<div>

		<!-- Comment 2: In a nested Div. -->

		<div>

			<!-- Comment 3: In a double-nested Div. -->

		</div>

	</div>

	<!-- Comment 4: Back up in that Body. -->


	<!-- Load scripts. -->
	<script type="text/javascript" src="../../vendor/jquery/jquery-2.0.3.min.js"></script>
	<script type="text/javascript">

		if ( ! document.createTreeWalker ) {

			throw( new Error( "Browser does not support createTreeWalker()." ) );

		}


		// By default, the TreeWalker will show all of the matching DOM nodes that it
		// finds. However, we can use an optional "filter" method that will inform the
		// DOM traversal.
		function filter( node ) {

			if ( node.nodeValue === " Load scripts. " ) {

				return( NodeFilter.FILTER_SKIP );

			}

			return( NodeFilter.FILTER_ACCEPT );

		}

		// IE and other browsers differ in how the filter method is passed into the
		// TreeWalker. Mozilla takes an object with an "acceptNode" key. IE takes the
		// filter method directly. To work around this difference, we will define the
		// acceptNode function as a property of itself.
		filter.acceptNode = filter;

		// NOTE: The last argument (false) is a deprecated, optional parameter
		// (entityReferenceExpansion). However, in IE, the argument is not optional
		// and therefore must be included.
		var treeWalker = document.createTreeWalker(
			document.body,
			NodeFilter.SHOW_COMMENT,
			filter,
			false
		);

		// For each comment node, insert a paragraph with the same text after it.
		while ( treeWalker.nextNode() ) {

			$( "<p></p>" )
				.text( treeWalker.currentNode.nodeValue )
				.insertAfter( treeWalker.currentNode )
			;

		}

	</script>

</body>
</html>

When we run this code, we get the following page output:

[Screenshot: Using the TreeWalker to find comments in a DOM.]

As you can see, we were able to locate each Comment node (other than the one we skipped) using the TreeWalker; and, we were able to insert a P-tag after each comment. A subtle but important fact here is that the TreeWalker can iterate over a dynamic DOM tree. Meaning, it had no trouble iterating over the DOM even as we were changing it (i.e., inserting P-tags).

After finding this, I added some branching logic to my jQuery.fn.comments() plugin that will use the TreeWalker if it's available.
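The plugin's actual branching code isn't shown in this post, so the following is only a sketch of what that kind of feature detection might look like. The names findComments() and walkChildNodes() are hypothetical, and the fallback is assumed to be a plain recursive walk over childNodes.

```javascript
// Hypothetical sketch: collect the comment nodes under "root", using the
// TreeWalker when the given document supports it and a recursive walk otherwise.
function findComments( root, doc ) {

	var comments = [];

	if ( doc && ( typeof doc.createTreeWalker === "function" ) ) {

		// 0x80 is the value of NodeFilter.SHOW_COMMENT. No filter is used here,
		// so every comment under the root is collected.
		var treeWalker = doc.createTreeWalker( root, 0x80, null, false );

		while ( treeWalker.nextNode() ) {

			comments.push( treeWalker.currentNode );

		}

		return( comments );

	}

	walkChildNodes( root, comments );

	return( comments );

}

// Fallback: depth-first walk that gathers nodes whose nodeType is 8
// (COMMENT_NODE), in document order.
function walkChildNodes( node, comments ) {

	var children = ( node.childNodes || [] );

	for ( var i = 0 ; i < children.length ; i++ ) {

		if ( children[ i ].nodeType === 8 ) {

			comments.push( children[ i ] );

		} else {

			walkChildNodes( children[ i ], comments );

		}

	}

}
```

In a browser, this would be called as findComments( document.body, document ). Passing the document in as an argument, rather than reading the global, keeps the fallback branch easy to exercise with plain objects.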


Reader Comments

15,902 Comments

@Ray,

Good point; I'm generally bad about that kind of stuff. On a related note, the MDN stuff is so great. Whenever I look up stuff in JS, I start with the MDN prefix, as in Googling for:

"mdn TreeWalker"

Such a great resource!

290 Comments

@Ben,

Instead of "treeWalker" (var treeWalker = document.createTreeWalker), I'd like to suggest calling your variables "triffid", "birnamWood", "Ent", "treeBeard", "Fangorn" or "fangornForest".

Odd that literature has given us so many tree walkers.

290 Comments

@Ben,

Day of the Triffids (novel by John Wyndham), Macbeth and Lord of the Rings. Apparently, trees uprooting themselves and walking around is a thing. Also an army disguising themselves as a forest (Macbeth).

Based on your book and movie choices in the past, I really think you would like the Day of the Triffids. You like things that make you think and see things in a new way. John Wyndham likes to explore themes of survival, what traits does it take to survive when the world radically changes. Much of his stuff gets made into movies, though the subtext about survival traits doesn't always make it onto the screen. Village of the Damned was also based on one of his novels. It's been made into a movie, like, 5 times.

15,902 Comments

@WebManWalking,

I just watched trailers for the movie versions of said items. The Day of the Triffids looks very interesting. The Village of the Damned... not sure how I feel about the John Carpenter version of that one :)

290 Comments

Oh you definitely need to see the 1960 version. It got nominated for 2 Hugo awards:

http://www.imdb.com/title/tt0054443/

The book was called The Midwich Cuckoos. The cuckoo bird lays its eggs in another bird's nest, to get the duped bird to feed/raise the cuckoo's offspring. I think Stephen King semi-stole that idea for "Under the Dome": alien creature(s) with telepathy and ability to invade humans' minds, dome around a town, etc. John Wyndham's estate should demand royalties.

What was this all about? Oh yeah, tree walkers. Sorry. I digressed.

15,902 Comments

@WebManWalking,

It's funny to see how movie trailers have changed so much over time (link on IMDB). The super-dramatic voice-overs that we have now make the ones back then seem comical. It's hard not to imagine this being watched as some part of a Mystery Science Theater 3000 :D
