This isn't necessarily an "Ask Ben" question; Michael Appenzellar had brought up the concept of breaking up an SMS text message into multiple parts that were, at most, 120 characters each. He was having a bit of trouble breaking it up, so I thought I would throw together a quick little demo. To start with, let's create the text message that you might want to send to someone around this time of the year:
Don't pay attention to that REReplace() - that just takes the string stored using ColdFusion's CFSaveContent tag and strips out the extra tabbing and line breaks. I just like using CFSaveContent for formatting / display reasons.
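For anyone following along outside ColdFusion, here is a minimal Python sketch of the same setup. The message text is taken from the output segments shown in this post. The names `raw_message` and `message` are assumptions for illustration, as is the use of a triple-quoted string with `re.sub()` to stand in for CFSaveContent and REReplace().

```python
import re

# Write the message across several indented lines for readability,
# much as CFSaveContent allows in ColdFusion. (Hypothetical variable names.)
raw_message = """
    Deborah, thank you so much for coming over for Christmas
    celebrations. I had quite a fabulous time. I hope that the
    present I got for you was not offensive; I just fancy you
    rather attractive and I could only imagine that that kind
    of outfit would have looked insanely delicious on you.
    Happy Holidays.
"""

# Collapse every run of whitespace (indentation and line breaks)
# into a single space, then trim the ends.
message = re.sub(r"\s+", " ", raw_message).strip()

print(message)
```

The clean-up step only affects formatting: the resulting string is a single line with single spaces between words.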
Ok, now that we have our message, we want to break it up into SMS text messages of at most 120 characters each. Initially, you might just try to use ColdFusion's Mid() function to grab each successive 120-character substring of the message:
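The fixed-width idea can be sketched in Python as follows. This is an illustration rather than the ColdFusion listing: slicing stands in for Mid(), and `MAX_LENGTH` and `naive_segments` are hypothetical names. The message is reconstructed from the segments shown in this post.

```python
# Assumed sample text, reconstructed from the segments shown in this post.
message = (
    "Deborah, thank you so much for coming over for Christmas "
    "celebrations. I had quite a fabulous time. I hope that the "
    "present I got for you was not offensive; I just fancy you "
    "rather attractive and I could only imagine that that kind "
    "of outfit would have looked insanely delicious on you. "
    "Happy Holidays."
)

MAX_LENGTH = 120

# Take consecutive 120-character slices. Python slices are 0-based,
# whereas ColdFusion's Mid() counts positions from 1.
naive_segments = [
    message[start:start + MAX_LENGTH]
    for start in range(0, len(message), MAX_LENGTH)
]

for segment in naive_segments:
    print(segment)
```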
On paper, this looks good, but when you run it, you see that it's not quite ideal. We end up splitting the message up into these three segments:
Deborah, thank you so much for coming over for Christmas celebrations. I had quite a fabulous time. I hope that the pres
ent I got for you was not offensive; I just fancy you rather attractive and I could only imagine that that kind of outfi
t would have looked insanely delicious on you. Happy Holidays.
As you can see, the word "present" in the first line and the word "outfit" in the second line are split between two SMS text messages. The problem here is that Mid() has no context; it has no understanding of the problem in which it is being used. As such, it doesn't care about splitting words.
Now, you could take that and start adding a bunch of logic to backtrack characters until you hit a space and then adjust your start offset and so on. That can all get sticky. The easier approach is to leverage the robust rules that can be applied using Regular Expressions. We can think of our SMS message segments as following a pattern: each captured match must be at most 120 characters long and must end on an appropriate character (meaning, it cannot end in the middle of a word).
I am going to arbitrarily say that a word is considered "split in half" if the character immediately following the match is NOT whitespace, a dash, a colon, or the end of the string. Anything that does not follow this rule must remain grouped together. To apply this kind of pattern rule, we are going to use a positive lookahead, which checks the next character without including it in the match:
.{1,120}(?=([\s\-:]|$))
Now, using that pattern in conjunction with ColdFusion 8's new REMatch() function makes this almost too easy:
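As a rough equivalent outside ColdFusion, the same pattern can be applied in Python. This is a sketch under assumptions, not the original listing: `re.finditer()` plays the role of REMatch() here, and `SEGMENT_PATTERN` and `segments` are hypothetical names. The message is reconstructed from the segments shown in this post.

```python
import re

# Assumed sample text, reconstructed from the segments shown in this post.
message = (
    "Deborah, thank you so much for coming over for Christmas "
    "celebrations. I had quite a fabulous time. I hope that the "
    "present I got for you was not offensive; I just fancy you "
    "rather attractive and I could only imagine that that kind "
    "of outfit would have looked insanely delicious on you. "
    "Happy Holidays."
)

# The pattern from this post: 1 to 120 characters that must be followed
# by whitespace, a dash, a colon, or the end of the string.
SEGMENT_PATTERN = re.compile(r".{1,120}(?=([\s\-:]|$))")

# finditer() with group(0) yields each whole match. re.findall() would
# instead return only the capture group inside the lookahead.
segments = [match.group(0) for match in SEGMENT_PATTERN.finditer(message)]

for segment in segments:
    # Every match after the first begins with the space that the
    # previous match stopped in front of, so trim before displaying.
    print(segment.strip())
```

Because the lookahead does not consume the delimiter, each match resumes exactly where the previous one stopped, and joining the untrimmed segments gives back the original message.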
Running this code, we get the following, more appropriate output:
Deborah, thank you so much for coming over for Christmas celebrations. I had quite a fabulous time. I hope that the
present I got for you was not offensive; I just fancy you rather attractive and I could only imagine that that kind of
outfit would have looked insanely delicious on you. Happy Holidays.
Notice that this time, both the words "present" and "outfit" remain intact, moved completely to the next SMS text message. Works like a charm. And, since regular expression pattern matching always picks up where it left off, you never have to worry about word wrapping conflicting with the next segment match.
I hope that helps in some way.
Nice! I'd been looking for something like this a while back. I'll keep this code handy :)
Posted by Gareth on Dec 27, 2007 at 3:04 PM
@Gareth,
No problem.
Posted by Ben Nadel on Dec 27, 2007 at 4:18 PM