Issues with Back Slashes (\) And Java's Matcher::AppendReplacement() Method
Posted September 10, 2006 at 11:23 AM by Ben Nadel
I just talked about the problems with using $ signs in AppendReplacement(), but ironically, that post itself broke my code. Since my code had a lot of back slash literals in it, the AppendReplacement() method was trying to evaluate them as escapING characters, NOT escapED characters.
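A minimal sketch of the problem in plain Java, for illustration only. The class name BackslashProblem and the helper replaceRaw() are hypothetical and not part of the original code; only Pattern, Matcher, appendReplacement() and appendTail() are real JDK APIs. The replacement string is handed to appendReplacement() unescaped, so the back slash is consumed as an escape character:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BackslashProblem {

    // Hypothetical helper: replaces every match of "regex" in "input"
    // with "replacement", passing it to appendReplacement() unescaped.
    static String replaceRaw(String input, String regex, String replacement) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        StringBuffer results = new StringBuffer();
        while (matcher.find()) {
            matcher.appendReplacement(results, replacement);
        }
        matcher.appendTail(results);
        return results.toString();
    }

    public static void main(String[] args) {
        // The Java literal "C:\\temp" is the 7-character string C:\temp.
        String replacement = "C:\\temp";

        // appendReplacement() treats the back slash as an escape for the
        // character that follows it, so the back slash itself is dropped.
        System.out.println(replaceRaw("path=X", "X", replacement));
        // prints "path=C:temp"
    }
}
```

The back slash silently disappears from the output. A replacement string that ends in a lone back slash is worse: appendReplacement() throws an IllegalArgumentException because there is nothing left to escape.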
To quickly fix this, I updated the code:
// Loop over the matcher while we still have matches.
while ( LOCAL.Matcher.Find() ){

    // Get the sample.
    LOCAL.Sample = LOCAL.Matcher.Group();

    ... MANIPULATION of Group Data ...

    // Add the sample to the buffer. When appending the
    // replacement, be sure to escape any $ signs with
    // literal $ signs so that they are not evaluated as
    // regular expression groups. Also, escape the \ so
    // that they are not evaluated as escaping characters.
    LOCAL.Matcher.AppendReplacement(
        LOCAL.Results,
        LOCAL.Sample.ReplaceAll( "\\", "\\\\" ).ReplaceAll( "\$", "\\\$" )
    );

}
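This escapes all back slash literals in the string before it escapes the dollar sign literals. The order matters: escaping the dollar signs first would add back slashes that the back slash pass would then double.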
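For comparison, here is a hedged sketch of the same fix in plain Java. The class name EscapeReplacement and the helpers escapeManually() and replaceWithLiteral() are hypothetical names chosen for this example. Note that Java string literals need every back slash doubled again compared to the ColdFusion version above. The JDK also ships Matcher.quoteReplacement(), which performs the same escaping in one call:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EscapeReplacement {

    // Hypothetical helper: escapes back slashes first, then dollar signs,
    // so appendReplacement() treats the whole string as literal text.
    static String escapeManually(String sample) {
        return sample
            .replaceAll("\\\\", "\\\\\\\\")  // each \ becomes \\
            .replaceAll("\\$", "\\\\\\$");   // each $ becomes \$
    }

    // Hypothetical helper: replaces every match of "regex" in "input"
    // with the literal text "sample".
    static String replaceWithLiteral(String input, String regex, String sample) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        StringBuffer results = new StringBuffer();
        while (matcher.find()) {
            matcher.appendReplacement(results, escapeManually(sample));
        }
        matcher.appendTail(results);
        return results.toString();
    }

    public static void main(String[] args) {
        // This literal is the string: C:\temp costs $5
        String sample = "C:\\temp costs $5";

        // The back slash and the dollar sign both survive.
        System.out.println(replaceWithLiteral("value=X", "X", sample));
        // prints "value=C:\temp costs $5"

        // Matcher.quoteReplacement() produces the same escaped string.
        System.out.println(
            escapeManually(sample).equals(Matcher.quoteReplacement(sample)));
        // prints "true"
    }
}
```

In Java code, Matcher.quoteReplacement() is usually the simpler choice, since it avoids hand-counting back slashes in the literals.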
Reader Comments
It should be, in fact, replaceAll( "\\\\", "\\\\\\\\" )
@Vlad,
I am not sure that I understand what you mean. I have tested my code and it works fine.
Maybe I'm missing something, but it can't be compiled as replaceAll("\\","\\\\"); it would give:
<code>
Exception in thread "main" java.util.regex.PatternSyntaxException: Unexpected internal error near index 1
\
 ^
at java.util.regex.Pattern.error(Unknown Source)
at java.util.regex.Pattern.compile(Unknown Source)
at java.util.regex.Pattern.<init>(Unknown Source)
at java.util.regex.Pattern.compile(Unknown Source)
at java.lang.String.replaceAll(Unknown Source)
</code>
It makes sense to me, as you have to escape it twice: once for the string and once for the regex.
Not sure how it works for you.
@Vlad,
Ahh, I see what's going on here. This is not Java code. This is ColdFusion code (which automatically compiles down to Java). In ColdFusion, there are no special string characters; as such, I only have to escape the value for the RegularExpression, not for the string itself.
Simple miscommunication :)
My bad, I missed that.
I see I'm not the first one to make that mistake in your blog :)
Your pages come up nicely when querying Google for Java issues.
Thanks.
@Vlad,
No worries my friend. Heck, I didn't even mention the language in the blog post. That's pretty exciting for me, though, that this kind of stuff comes up for Java searches as well :)
Not to nitpick, but wouldn't '#' be considered a special string character in ColdFusion?
@KingErroneous,
Yeah, # is definitely a special character in ColdFusion, but only in a string that is going to be evaluated. When a # appears in a piece of string data (such as that read from a file or a FORM post), there is nothing special about it. And, of course, to escape it in ColdFusion, you need to use a double-pound, ##, rather than a back slash as you might in other languages.



