Skip to main content
Ben Nadel
On User Experience (UX) Design, JavaScript, ColdFusion, Node.js, Life, and Love.

Application Setting "useJavaAsRegexEngine" Tells CFML To Use Java's RegEx Engine For Built-In Re-Functions In Adobe ColdFusion 2018

By Ben Nadel on
Tags: ColdFusion

So, yesterday, when I was looking something up in the Adobe ColdFusion RegEx documentation, I saw something that blew my mind! Apparently, as of Update 5, there is now a ColdFusion 2018 Application setting - useJavaAsRegexEngine - that tells the CFML runtime to use Java's Regular Expression engine when executing built-in functions like reFind(), reMatch(), and reReplace(). While the default POSIX pattern matching engine is pretty great, it's less robust and less powerful than the Java pattern matching engine. As such, this will be a much welcome change to the Adobe ColdFusion community.

To see an example of this change, let's look at using a positive look-behind. In a Regular Expression pattern, a positive look-behind allows us to capture a value, but only if it immediately follows another pattern. And, it does so without us having to capture the "preceding part" of said pattern.

So, in the following code, we're going to try and capture the term pajamas; but, only if it immediately follows the term cat's:

<cfscript>

	value = "You are the cat's pajamas!";

	// Normally, the POSIX-based Regular Expression engine that Adobe ColdFusion uses
	// under the hood doesn't support things like LOOK-BEHINDS. As such, the following
	// RegEx pattern - which is looking for the term "pajamas", but only if it comes
	// right after the term "cat's " - will throw an error:
	writeDump( value.reMatch( "(?<=cat's )pajamas" ) );

</cfscript>

Here, we're using the built-in Adobe ColdFusion function reMatch(). And, if we run this in Adobe ColdFusion 2018 - without any custom settings - we get the following ColdFusion error:

Adobe ColdFusion 2018 is throwing an error about malformed RegEx patterns that contain positive look-behinds.

Malformed regular expression "(?<=cat's )pajamas".

Reason: Sequence (?<...) not recognized.

As you can see, the default POSIX RegEx engine that Adobe ColdFusion uses does not support look-behinds. But, if we create an Application.cfc ColdFusion framework component, we can switch our RegEx engine:

component
	output = false
	hint = "I define the application settings and event-handlers."
	{

	this.name = "RegExTesting";

	// Tell Adobe ColdFusion 2018, Update 5, to use the Java Regular Expression engine
	// for all of its built-in re-Functions (like, reFind() and reMatch()).
	this.useJavaAsRegexEngine = true;

}

Now, with the useJavaAsRegexEngine application setting enabled, if we re-run the same code from above, we get the following output:

Adobe ColdFusion 2018 supports positive look-behinds in RegEx patterns if the useJavaAsRegexEngine Application setting is enabled.

Oh sweet chickens! As you can see, with the useJavaAsRegexEngine ColdFusion application setting enabled, the look-behind works; and, we were able to locate and extract the phrase pajamas!

At the time of this writing, Lucee CFML does not yet support this feature. However, I did locate a JIRA ticket - LDEV-2495 - which has this feature listed as a compatibility issue. As such, I fully expect this to become available in a future Lucee CFML release.



Reader Comments

@Zac,

You guys are so fast. It's definitely a fascinating feature. It's not without it's issues; but, I suspect that anything that can be set globally to an application has some degree of trade-offs. Personally, I just love the Java RegEx engine!

Reply to this Comment

Any resources for the differences between the RegEx engines? I assume there are things that won't work with the Java RegEx that would work with the default engine.

Reply to this Comment

@Tyler,

Great question! Yes and No. Yes in that the Adobe site appears to have a comparison of the Java and Perl pattern engines in their documentation. However, No, in that the list on the aforementioned docs doesn't appear to be valid. For example, the list in those docs mention that both engines support a "look-behind" of fixed length. However, as I demonstrated in this post, that is clearly not the case (as it can't even parse such a pattern in POSIX). So, I am not sure what else is wrong in those docs.

Reply to this Comment

Post A Comment

You — Get Out Of My Dreams, Get Into My Blog
Live in the Now
Oops!
NEW: Some basic markdown formatting is now supported: bold, italic, blockquotes, lists, fenced code-blocks. Read more about markdown syntax »
Comment Etiquette: Please do not post spam. Please keep the comments on-topic. Please do not post unrelated questions or large chunks of code. And, above all, please be nice to each other - we're trying to have a good conversation here.