Using POSIX Character Classes In Java Regular Expressions With ColdFusion

Posted February 6, 2009 at 3:31 PM by Ben Nadel

Tags: ColdFusion

When I first started to learn regular expressions in ColdFusion, I used things called POSIX Character Classes. These were pre-defined groups of characters that looked like:

  • [:digit:]
  • [:alnum:]
  • [:punct:]

When I started working in Java Regular Expressions, I could no longer use those character classes. Or rather, I couldn't use them with the same notation - POSIX character classes in Java regular expressions have a different notation:

  • \p{Digit}
  • \p{Alnum}
  • \p{Punct}
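As a rough sketch of that notation in plain Java (the class and method names below are hypothetical, made up just for illustration; only the java.util.regex calls are real API), the \p{...} classes can be used anywhere a Java pattern is accepted. Note that in Java source code the backslash has to be doubled, which is not needed inside a ColdFusion string:

```java
import java.util.regex.Pattern;

// Hypothetical demo class; only the java.util.regex / String calls are real API.
final class PosixNotationDemo {

    // True if the whole input is one or more ASCII digits.
    static boolean isAllDigits(String input) {
        return Pattern.matches("\\p{Digit}+", input);
    }

    // Removes everything that is NOT an ASCII letter or digit
    // (the upper-case \P negates the class).
    static String keepAlnum(String input) {
        return input.replaceAll("\\P{Alnum}+", "");
    }

    // Replaces each run of ASCII punctuation with a single space.
    static String punctToSpace(String input) {
        return input.replaceAll("\\p{Punct}+", " ");
    }

    public static void main(String[] args) {
        System.out.println(isAllDigits("2009"));        // true
        System.out.println(keepAlnum("Feb 6, 2009"));   // Feb62009
        System.out.println(punctToSpace("Hank-Hill"));  // Hank Hill
    }
}
```

By default these classes only cover US-ASCII characters, so accented letters, for example, are not matched by \p{Alnum}.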

I was just looking up some regular expression stuff when I saw these POSIX character classes again. I've never actually used them in Java, so I figured I would take a moment to try them out:

    <!--- Save some sample text. --->
    <cfsavecontent variable="strSample">

    "You can't just pick and choose which laws to follow. Sure
    I'd like to tape a baseball game without the express written
    consent of major league baseball, but that's just not the
    way it works." - Hank Hill

    </cfsavecontent>


    <!--- The hash-delimited expressions are only evaluated inside CFOutput. --->
    <cfoutput>

        <!--- Replace graphical characters. --->
        #strSample.ReplaceAll(
            JavaCast( "string", "\p{Graph}+" ),
            JavaCast( "string", "X" )
        )#

        <br />
        <br />

        <!--- Replace the punctuation. --->
        #strSample.ReplaceAll(
            JavaCast( "string", "\p{Punct}+" ),
            JavaCast( "string", "_" )
        )#

        <br />
        <br />

        <!--- Replace all punctuation except apostrophes. --->
        #strSample.ReplaceAll(
            JavaCast( "string", "[\p{Punct}&&[^']]+" ),
            JavaCast( "string", "_" )
        )#

    </cfoutput>

When we run this code, we get the following output:

X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X

_You can_t just pick and choose which laws to follow_ Sure I_d like to tape a baseball game without the express written consent of major league baseball_ but that_s just not the way it works_ _ Hank Hill

_You can't just pick and choose which laws to follow_ Sure I'd like to tape a baseball game without the express written consent of major league baseball_ but that's just not the way it works_ _ Hank Hill
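For comparison, here is a minimal sketch of the same three replacements written directly in Java. The class name, method names, and the shortened sample string are assumptions for illustration; the patterns are the ones used in the ColdFusion code above, with the backslashes doubled for Java source:

```java
// Hypothetical demo class mirroring the three ColdFusion replacements.
final class HankHillDemo {

    // Shortened version of the sample quote (an assumption, for brevity).
    static final String SAMPLE =
        "\"Sure I'd like to tape a baseball game, but that's just not the way it works.\" - Hank Hill";

    // Each run of visible ASCII characters (letters, digits, punctuation) becomes "X".
    static String maskGraph(String s) {
        return s.replaceAll("\\p{Graph}+", "X");
    }

    // Each run of ASCII punctuation becomes "_".
    static String maskPunct(String s) {
        return s.replaceAll("\\p{Punct}+", "_");
    }

    // The && operator intersects \p{Punct} with "anything but an apostrophe",
    // so apostrophes are left alone.
    static String maskPunctExceptApostrophe(String s) {
        return s.replaceAll("[\\p{Punct}&&[^']]+", "_");
    }

    public static void main(String[] args) {
        System.out.println(maskGraph(SAMPLE));
        System.out.println(maskPunct(SAMPLE));
        System.out.println(maskPunctExceptApostrophe(SAMPLE));
    }
}
```

The third pattern relies on Java's character class intersection, which is why the apostrophes in "can't" and "that's" survive in the last line of output.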

The casing of the POSIX character class name is important: \p{Digit} works fine, but \p{DIGIT} will throw an error. All of the POSIX classes can be replaced with shorter, more standard character classes, so I don't really see much of a need for these. The one I do see as having some value is the Punctuation character class - there are just too many of those darn characters to type out!
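A small sketch of both points in plain Java, assuming default pattern flags (the class and the compiles() helper are hypothetical names used only for this example). The error surfaces as a PatternSyntaxException when the pattern is compiled, and \d behaves as the shorter stand-in for \p{Digit}:

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical demo class; compiles() is a helper made up for this example.
final class PosixCasingDemo {

    // Returns true if Java accepts the regex, false if compilation fails.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    // Masks digits using the POSIX-style class.
    static String maskWithPosix(String s) {
        return s.replaceAll("\\p{Digit}", "#");
    }

    // Masks digits using the shorthand class.
    static String maskWithShorthand(String s) {
        return s.replaceAll("\\d", "#");
    }

    public static void main(String[] args) {
        System.out.println(compiles("\\p{Digit}")); // accepted
        System.out.println(compiles("\\p{DIGIT}")); // rejected: unknown property name

        String sample = "Posted February 6, 2009 at 3:31 PM";
        System.out.println(maskWithPosix(sample));
        System.out.println(maskWithShorthand(sample));
    }
}
```

There is no single shorthand for \p{Punct}, which is where the POSIX-style class earns its keep.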



Reader Comments

Mar 2, 2011 at 5:18 PM

Shoot, since nobody else has commented...

How do you suggest I remove '&nbsp;' from a string without removing actual spaces, ' '?

This does not seem to work...
#replace(qSelect.location1, "&nbsp;", "", "ALL")#

Thank you!!!

