
JavaScript Multiline Regular Expressions Don't Include Carriage Returns In IE

By Ben Nadel

In a Regular Expression (RegEx) pattern, the ^ and $ characters typically match the start and end of an entire string. However, if you run a regular expression pattern in "Multiline" mode, the ^ and $ characters should match the start and end of each individual line, respectively. This is a pattern construct that I typically use on the server side for data file parsing. On the client side, however, I very rarely use it. And, because I use it so seldom, I tend to forget that client-side support for multiline patterns is not universally consistent.
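As a quick illustration of the difference, here is a minimal sketch that runs the same pattern with and without the "m" flag. The sample string and variable names are made up for this example, and the results noted in the comments assume a standards-compliant engine:

```javascript
var sampleText = "first\nsecond\nthird";

// Without the "m" flag, ^ and $ anchor to the entire string. Since the
// dot does not match a line break, nothing matches and we get null.
var wholeStringMatches = sampleText.match( /^.*$/g );

// With the "m" flag, ^ and $ anchor to each individual line, so we get
// one match per line.
var perLineMatches = sampleText.match( /^.*$/gm );
```

In a compliant engine, the first call returns null and the second returns the three lines as separate matches.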

Case in point, last week I discovered some buggy behavior in my jQuery Template Markup Language (JTML) project. In the underlying rendering engine, JTML compiles down to an executable JavaScript function in which each line of the JTML template is written to an output buffer in order to reduce string concatenation costs. The individual template lines were extracted using a multiline regular expression. This worked perfectly in Firefox, but caused "unterminated string constant" JavaScript errors in IE.

At first, debugging this problem was very frustrating because it appeared that both Firefox and IE supported multiline regular expressions. And, in fact, they do. But, they do not support these pattern constructs in the same way. After much alert()'ing and console.log()'ing, I finally figured out what the difference was: Internet Explorer (IE) does not include carriage returns (\r) in its multiline match delimiters. As such, those \r characters were left inside the matched lines and compiled down into mid-string line breaks, which is what was causing the unterminated string errors.
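To make the failure mode concrete, here is a rough sketch of how a stray \r can break generated code. The compileLine() and compileLineSafely() functions are hypothetical stand-ins written for this example; they are not JTML's actual compiler. The sketch also assumes an engine that provides JSON.stringify(), which older browsers may lack:

```javascript
// Hypothetical stand-in for a template compiler: naively wrap the raw
// line in a double-quoted string literal inside generated source code.
function compileLine( line ){
	return new Function( "return \"" + line + "\";" );
}

// A raw carriage return inside a string literal ends the line early,
// so the generated source fails to parse.
var compileFailed = false;

try {
	compileLine( "This data\r" );
} catch ( error ){
	compileFailed = ( error instanceof SyntaxError );
}

// Escaping the line first (here with JSON.stringify) keeps the \r
// inside the literal as an escape sequence.
function compileLineSafely( line ){
	return new Function( "return " + JSON.stringify( line ) + ";" );
}

var roundTripValue = compileLineSafely( "This data\r" )();
```

The naive version throws a syntax error as soon as a line carries a trailing \r, while the escaped version hands the original string back unchanged.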

To see this in action, I am going to loop over the lines of a given Script tag using a multiline regular expression:

<!DOCTYPE HTML>
<html>
<head>
	<title>Javascript Multiline Regular Expression</title>
</head>
<body>

	<h1>
		Javascript Multiline Regular Expression
	</h1>

	<!-- This is our input data. -->
	<script id="template" type="text/jtml">
		This data
		is spread across
		multiple lines.
	</script>

	<!-- This is our output element. -->
	<form>
		<textarea
			id="output"
			style="width: 500px ; height: 100px ;">
		</textarea>
	</form>


	<script type="text/javascript">

		// Grab the HTML of the template node.
		var jtml = document.getElementById( "template" ).innerHTML;

		// Grab the FORM output.
		var output = document.getElementById( "output" );

		// Create a counter for the number of lines found.
		var lineCount = 0;

		// Iterate over the JTML content in MULTILINE mode; this
		// should match each individual line.
		jtml.replace(
			new RegExp( "^(.*)$", "gm" ),
			function( $0 ){
				// Append matched line to output.
				output.value += $0;

				// Increment line count.
				lineCount++;
			}
		);

		// Append line count to output.
		output.value += lineCount;

	</script>

</body>
</html>

As you can see, as I am matching the individual lines in the Script tag, I am outputting them to the Textarea output and incrementing my line count. When I run this in Firefox, I get the following page output:

 
 
 
 
 
 
[Figure: Javascript Multiline Regular Expression Support In Firefox Uses Carriage Return And New Line Characters.]
 
 
 

As you can see, Firefox found 5 individual lines in the Script tag. And, since it used both the carriage return and the new line characters as multiline delimiters, the resultant textarea has no hard line breaks.

On the other hand, when we run the above code in Internet Explorer (IE), we get the following page output:

 
 
 
 
 
 
[Figure: Javascript Multiline Regular Expression Support In Internet Explorer (IE) Does Not Include Carriage Return Characters.]
 
 
 

This is a very different story. As you can see, Internet Explorer also found multiple, individual lines; but, it found 11 lines rather than just 5. This is because it did not include the carriage return (\r) character in the multiline pattern delimiter. As such, the resultant textarea does contain hard line breaks as well as lines consisting of just the \r character (hence the additional line count).

NOTE: Some of the line count in IE can be reduced by using the (+) quantifier rather than the (*) quantifier in the matching regular expression.
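To show what that swap changes, here is a small sketch that counts matches with each quantifier. It uses a made-up string with \n-only line breaks, so it demonstrates the behavior of a standards-compliant engine rather than reproducing the IE result described above:

```javascript
// Sample content with a leading and trailing line break, similar in
// shape to the contents of the Script tag.
var content = "\nThis data\nis spread across\nmultiple lines.\n";

// The (*) quantifier also matches the empty lines at the start and end.
var starMatches = content.match( /^.*$/gm );

// The (+) quantifier requires at least one character, so the empty
// lines are skipped.
var plusMatches = content.match( /^.+$/gm );
```

Here, the (*) version finds 5 lines (two of them empty) while the (+) version finds only the 3 lines that contain text.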

I've had multiline problems before. But, as I was saying, I don't use multiline regular expressions very often in JavaScript. Hopefully, this time, I'll remember that even in the most modern browsers, they are not supported consistently enough to rely on.
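One possible way to sidestep the inconsistency is to normalize the line breaks before running the multiline pattern, so that the engine only ever sees \n characters. This is just a sketch under that assumption; normalizeLineBreaks() is a hypothetical helper written for this example and is not part of JTML:

```javascript
// Hypothetical helper: collapse Windows (\r\n) and lone carriage
// return (\r) line breaks into a single new line (\n) character.
function normalizeLineBreaks( value ){
	return value.replace( /\r\n?/g, "\n" );
}

var rawTemplate = "This data\r\nis spread across\r\nmultiple lines.";

// With the carriage returns gone, the matched lines no longer depend
// on how the engine treats \r in multiline mode.
var templateLines = normalizeLineBreaks( rawTemplate ).match( /^.+$/gm );
```

Since the pattern never encounters a \r character, the way a given browser handles carriage returns in multiline mode stops mattering for the extracted lines.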




Reader Comments

That's quite alarming. I've used regular expressions to separate lines before.
Gotta take a deeper look into that...

Thanks for pointing it out Ben!


@Martin,

Yeah, this is frustrating stuff. There are some other odd Javascript RegExp differences in the other browsers, specifically with looping and exec(). This seems like the kind of thing that should be pretty universal.

