Ben Nadel at TechCrunch Disrupt (New York, NY) with: Danielle Morrill ( @DanielleMORRILL )

Javascript Multiline Regular Expressions Don't Include Carriage Returns In IE


In a Regular Expression (RegEx) pattern, the ^ and $ characters typically match the start and end of an entire string. However, if you run a regular expression pattern in "Multiline" mode, the ^ and $ characters should match the start and end of each individual line, respectively. This is a pattern construct that I typically use on the server-side for data file parsing. On the client side, however, I very rarely use it. And, because I use it so seldom, I tend to forget that client-side support for multiline patterns is not consistent across browsers.
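
As a minimal sketch of that difference, assuming an input string that uses plain new line (\n) characters (the sample text and variable names are made up for illustration), the same pattern can be run with and without the "m" flag:

```javascript
// Sample input with plain "\n" line breaks (an assumption for illustration).
var text = "first line\nsecond line\nthird line";

// Without the "m" flag, ^ and $ anchor to the entire string. Since "."
// does not match "\n", the pattern cannot span the whole input, so
// nothing matches and match() returns null.
var wholeString = text.match( /^.*$/g );

// With the "m" flag, ^ and $ anchor to the start and end of each line.
var perLine = text.match( /^.*$/gm );

console.log( wholeString ); // null
console.log( perLine ); // [ "first line", "second line", "third line" ]
```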

Case in point, last week I discovered some buggy behavior in my jQuery Template Markup Language (JTML) project. In the underlying rendering engine, JTML compiles down to an executable Javascript function in which each line of the JTML template is written to an output buffer in order to reduce string concatenation costs. The individual template lines were extracted using a multiline regular expression. This worked perfectly in Firefox, but created unterminated string constant Javascript errors in IE.
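
To illustrate why a stray line break character produces this kind of error, here is a rough sketch. It is not the actual JTML rendering code; the line variable and the generated source strings are hypothetical stand-ins, and JSON.stringify() is assumed to be available (it is not in older browsers).

```javascript
// Hypothetical stand-in for one extracted template line that still
// carries a trailing carriage return.
var line = "Hello world\r";

// Naive code generation: the raw \r lands inside the string literal.
var naiveSource = "return( \"" + line + "\" );";

// Escaped code generation: JSON.stringify() turns the \r into the
// two-character escape sequence, which is legal inside a literal.
var escapedSource = "return( " + JSON.stringify( line ) + " );";

// Compiling the naive source fails because a string literal cannot
// contain a raw line terminator.
var naiveFailed = false;
try {
	new Function( naiveSource );
} catch ( error ) {
	naiveFailed = ( error instanceof SyntaxError );
}

// The escaped source compiles and gives back the original line.
var render = new Function( escapedSource );

console.log( naiveFailed ); // true
console.log( render() === line ); // true
```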

At first, debugging this problem was very frustrating because it appeared that both Firefox and IE supported multiline regular expressions. And, in fact, they do. But, they do not support these pattern constructs in the same capacity. After much alert()'ing and console.log()'ing, I finally figured out what the difference was - Internet Explorer (IE) does not include carriage returns (\r) in its multiline match delimiters. As such, those \r characters were being compiled down into mid-string line breaks, which is what was causing the unterminated string errors.
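
Since \r and \n are invisible in an alert() box, a small helper that spells them out can make this kind of debugging less painful. This is only a sketch; revealLineBreaks() is a hypothetical name, not something from JTML.

```javascript
// Replace invisible whitespace characters with their visible escape
// sequences so that they show up in alert() or console.log() output.
function revealLineBreaks( value ){
	return value
		.replace( /\r/g, "\\r" )
		.replace( /\n/g, "\\n" )
		.replace( /\t/g, "\\t" );
}

// A line that ends with a carriage return followed by a new line.
var sample = "This data\r\n\tis spread";

console.log( revealLineBreaks( sample ) ); // This data\r\n\tis spread
```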

To see this in action, I am going to loop over the lines of a given Script tag using a multiline regular expression:

<!DOCTYPE HTML>
<html>
<head>
	<title>Javascript Multiline Regular Expression</title>
</head>
<body>

	<h1>
		Javascript Multiline Regular Expression
	</h1>

	<!-- This is our input data. -->
	<script id="template" type="text/jtml">
		This data
		is spread across
		multiple lines.
	</script>

	<!-- This is our output element. -->
	<form>
		<textarea
			id="output"
			style="width: 500px ; height: 100px ;">
		</textarea>
	</form>


	<script type="text/javascript">

		// Grab the HTML of the template node.
		var jtml = document.getElementById( "template" ).innerHTML;

		// Grab the FORM output.
		var output = document.getElementById( "output" );

		// Create a counter for the number of lines found.
		var lineCount = 0;

		// Iterate over the JTML content in MULTILINE mode; this
		// should match each individual line of the template.
		jtml.replace(
			new RegExp( "^(.*)$", "gm" ),
			function( $0 ){
				// Append matched line to output.
				output.value += $0;

				// Increment line count.
				lineCount++;
			}
		);

		// Append line count to output.
		output.value += lineCount;

	</script>

</body>
</html>

As I match the individual lines in the Script tag, I append each one to the Textarea output and increment my line count. When I run this in Firefox, I get the following page output:

Javascript Multiline Regular Expression Support In Firefox uses Carriage Return And New Line Characters.

As you can see, Firefox found 5 individual lines in the Script tag. And, since it used both the carriage return and the new line characters as multiline delimiters, the resultant textarea has no hard line breaks.

On the other hand, when we run the above code in Internet Explorer (IE), we get the following page output:

Javascript Multiline Regular Expression Support In Internet Explorer (IE) Does Not Include Carriage Return Characters.

This is a very different story. As you can see, Internet Explorer also found multiple, individual lines; but, it found 11 lines rather than just 5. This is because it did not include the carriage return (\r) character in the multiline pattern delimiter. As such, the resultant textarea does contain hard line breaks as well as lines consisting of just the \r character (hence the additional line count).
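
One possible way to sidestep the difference, sketched here under the assumption that the content may contain \r\n, \r, or \n line breaks, is to normalize every line break to \n and then split on that character rather than relying on multiline anchors. The getLines() helper is a hypothetical name and is not part of JTML.

```javascript
// Normalize all line breaks to "\n", then split on a plain string
// delimiter so that no multiline regular expression is involved.
function getLines( value ){
	return value.replace( /\r\n?/g, "\n" ).split( "\n" );
}

// Content that mixes Windows (\r\n), old Mac (\r), and Unix (\n) breaks.
var mixed = "This data\r\nis spread across\rmultiple\nlines.";

var lines = getLines( mixed );

console.log( lines.length ); // 4
console.log( lines ); // [ "This data", "is spread across", "multiple", "lines." ]
```

Splitting on a string rather than a pattern is a deliberate choice here, since it keeps the regular expression's only job to rewriting \r characters.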

NOTE: Some of the line count in IE can be reduced by using the (+) quantifier rather than the (*) quantifier in the matching regular expression.
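
As a small sketch of what that change does, assuming content with plain \n line breaks and one blank line (the sample string is made up for illustration), compare the two patterns:

```javascript
// Sample content with an empty line between two lines of text.
var content = "first\n\nsecond";

// The (*) version also matches the empty line as a zero-length line.
var withStar = content.match( /^.*$/gm );

// The (+) version requires at least one character, so the empty
// line is skipped.
var withPlus = content.match( /^.+$/gm );

console.log( withStar.length ); // 3
console.log( withPlus.length ); // 2
```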

I've had multiline problems before. But, as I was saying, I don't use multiline regular expressions very often in Javascript. Hopefully, this time, I'll remember that even in the most modern browsers, they are not quite supported consistently enough for use.


Reader Comments

29 Comments

That's quite alarming. I've used regular expressions to separate lines before.
Gotta take a deeper look into that...

Thanks for pointing it out Ben!

15,798 Comments

@Martin,

Yeah, this is frustrating stuff. There are some other odd Javascript RegExp differences in the other browsers, specifically with looping and exec(). This seems like the kind of thing that should be pretty universal.

Ben Nadel