Ray Camden's Friday Puzzler - I Finally Did It!
Posted April 1, 2007 at 9:37 PM by Ben Nadel
Ok, this has wasted waaay too much of my Sunday. Now, I am starving and need to go eat a whole chicken. But, at least it's done and I can sleep knowing that once again, all is well in the world.
As I described before, the rules here are the same: 10 to 15 random words (not contained within tag definitions) and for each word, 1 to 3 characters replaced out. Here is my initial HTML:
- <html>
- <head>
- <title>Ray's Puzzler - 03/30/2007</title>
- </head>
- <body>
-
- <p>
- My name is Bob. I am the coolest dude... perhaps ever?
- I am without a doubt the best programmer that has ever
- existed. No one will ever mess with my text cause I
- am feared in the office.
- </p>
-
- <p>
- I really like that new secretary, Niki. I should ask
- her out. Man, she would love that. But is she worthy?
- I mean, not just any old bird should get the privledge
- of going with me.
- </p>
-
- <p>
- Note to self: Fire Jen over in accounting. I don't like
- the way she dresses. Tell her the firing is over the
- payroll mess-up she made last month.
- </p>
-
- </body>
- </html>
-
-
- <!--- Get the updated content. --->
- <cfset strUpdatedContent = MessWithBob() />
-
- <!--- Clear the buffer. --->
- <cfset GetPageContext().GetOut().ClearBuffer() />
-
- <!--- Write the new output. --->
- <cfset WriteOutput( strUpdatedContent ) />
... and running the above results in:
- <html>
- <head>
- <title>Ray's Puzzler - 03/30/2007</title>
- </head>
- <body>
-
- <p>
- My name is Bob. I ar the qoo46st dude... perhaps eoer?
- I am without 2 doubt the best programmer that has ever
- existed. No one will ever mess with my text 1ause l
- am feared in the office.
- </p>
-
- <p>
- m really like that new se5r0tgry, Niki. I should ask
- her out. Man, she would love that. But is she worthy?
- I mean, not just any old bird should hnt jot privledge
- of going with me.
- </p>
-
- <p>
- Note to 40lf: Fire Jen over rn accounting. I don'a like
- the way she dresses. Tell her the firing is over the
- payroll mess-up fdj made last month.
- </p>
-
- </body>
- </html>
Notice that 10-15 random words have been altered. Phew! Here's how it's done:
- <cffunction
- name="MessWithBob"
- access="public"
- returntype="string"
- output="false"
- hint="Ha ha ha ha ha... you are soo in trouble Bob!">
-
- <!--- Define arguments. --->
- <cfargument
- name="Text"
- type="string"
- required="false"
- default="#GetPageContext().GetOut().GetBuffer()#"
- hint="The text that we are going to mess with. Defaults to the current page's unflushed content buffer."
- />
-
-
- <!--- Define the local scope. --->
- <cfset var LOCAL = StructNew() />
-
-
- <!---
- Store a list of characters that are going to be used
- when replacing out random characters.
- --->
- <cfset LOCAL.CharSet = "abcdefghijklmnopqrstuvwxyz0123456789" />
-
-
- <!---
- Create an array reflection utility class. We need to
- do this because we are going to be messing with Java
- arrays which cannot be updated directly (not sure why,
- but it's some sort of casting issue I think).
- --->
- <cfset LOCAL.ReflectArray = CreateObject(
- "java",
- "java.lang.reflect.Array"
- ) />
-
-
- <!---
- Create a pattern to match the tags. This regular
- expression allows closing brackets to be contained
- within attributes of the tag (I think).
- --->
- <cfset LOCAL.Pattern = CreateObject(
- "java",
- "java.util.regex.Pattern"
- ).Compile(
- "(?i)</?[a-z](""[^""]*""|[^>])*>"
- ) />
-
- <!---
- Get the matcher for our passed in text data and
- compiled pattern.
- --->
- <cfset LOCAL.Matcher = LOCAL.Pattern.Matcher(
- JavaCast( "string", ARGUMENTS.Text )
- ) />
-
-
- <!--- Create an array to hold our words. --->
- <cfset LOCAL.Words = ArrayNew( 1 ) />
-
- <!--- Create an array to hold our tags. --->
- <cfset LOCAL.Tags = ArrayNew( 1 ) />
-
-
- <!---
- Create a buffer to hold the modified text. We
- are going to use this buffer as we loop through
- our pattern matcher.
- --->
- <cfset LOCAL.Buffer = CreateObject(
- "java",
- "java.lang.StringBuffer"
- ).Init() />
-
-
- <!--- Loop over the tags in our text. --->
- <cfloop condition="#LOCAL.Matcher.Find()#">
-
- <!---
- Create a structure to hold information
- about this tag.
- --->
- <cfset LOCAL.Tag = StructNew() />
-
- <!--- Store the tag HTML and indexes. --->
- <cfset LOCAL.Tag.HTML = LOCAL.Matcher.Group() />
- <cfset LOCAL.Tag.Start = LOCAL.Matcher.Start() />
- <cfset LOCAL.Tag.End = LOCAL.Matcher.End() />
-
- <!--- Add this tag to the array. --->
- <cfset ArrayAppend(
- LOCAL.Tags,
- LOCAL.Tag
- ) />
-
- <!---
- Replace out the tag and add the content to the
- buffer. We are replacing the entire tag with
- "periods". This is done to ensure that later on,
- when we split the string on word boundaries, we
- know that we will never split it in the middle
- of a replaced out tag. Furthermore, we will never
- have to do any additional "word" checking as
- there are no word characters in this.
- --->
- <cfset LOCAL.Matcher.AppendReplacement(
- LOCAL.Buffer,
- RepeatString(
- ".",
- Len( LOCAL.Matcher.Group() )
- )
- ) />
-
- </cfloop>
-
-
- <!---
- Add the remaining text to our modification buffer. This
- will catch any words that came after the final tag.
- --->
- <cfset LOCAL.Matcher.AppendTail(
- LOCAL.Buffer
- ) />
-
-
- <!---
- ASSERT: At this point, our buffer should contain all
- the same passed in content, except the tags have been
- stripped out and replaced with periods.
- --->
-
-
- <!---
- Now that we have the tags stripped out and separated,
- let's break the remaining content based on the word
- boundary. This will give us all the text tokens.
- --->
- <cfset LOCAL.Tokens = LOCAL.Buffer.ToString().Split( "\b" ) />
-
-
- <!---
- Our tokens array is going to have a lot of junk in it
- that is not words. Let's loop through the text and
- find the word information (which we will store in
- our words array).
- --->
- <cfloop
- index="LOCAL.Index"
- from="1"
- to="#ArrayLen( LOCAL.Tokens )#"
- step="1">
-
-
- <!---
- Check to see if this token contains valid
- characters (word characters).
- --->
- <cfif REFind( "\w", LOCAL.Tokens[ LOCAL.Index ] )>
-
- <!---
- Create a word object to store the word's
- information.
- --->
- <cfset LOCAL.Word = StructNew() />
-
- <!--- Set the word data and the indexes. --->
- <cfset LOCAL.Word.HTML = LOCAL.Tokens[ LOCAL.Index ] />
- <cfset LOCAL.Word.Index = LOCAL.Index />
-
- <!--- Add this word to our word array. --->
- <cfset ArrayAppend(
- LOCAL.Words,
- LOCAL.Word
- ) />
-
- </cfif>
-
- </cfloop>
-
-
- <!---
- ASSERT: At this point, all the words that we can
- possibly alter are stored in the Words array. Now,
- we have to figure out which ones to alter.
- --->
-
-
- <!---
- Get the number of words that we are going to
- alter. This is going to be a random number from
- 10-15. However, we cannot pick more words than
- we have available.
- --->
- <cfset LOCAL.WordAlterCount = Min(
- RandRange( 10, 15 ),
- LOCAL.Words.Size()
- ) />
-
-
- <!---
- Create a struct to hold the indexes of the words
- that we are going to alter. We are creating a
- struct to help ensure that we don't pick
- duplicate words.
- --->
- <cfset LOCAL.WordIndexes = StructNew() />
-
-
- <!---
- Keep selecting a random word until our struct is
- as big as our words-to-alter count.
- --->
- <cfloop
- condition="(StructCount( LOCAL.WordIndexes ) LT LOCAL.WordAlterCount)">
-
- <!---
- Add a random index to the index struct. We don't
- care what the value is that we assign - we are
- only interested in the index-key.
- --->
- <cfset LOCAL.WordIndexes[ RandRange( 1, LOCAL.Words.Size() ) ] = true />
-
- </cfloop>
-
-
- <!---
- Now that we know the indexes of the words that we
- are going to alter, let's loop over them and randomly
- alter the words. Remember, since we stored the indexes
- as the keys in the index-struct, we can perform a
- collection loop (not an index one).
- --->
- <cfloop
- item="LOCAL.Index"
- collection="#LOCAL.WordIndexes#">
-
- <!--- Get the word that we are going to alter. --->
- <cfset LOCAL.Word = LOCAL.Words[ LOCAL.Index ].HTML />
-
- <!---
- Get the number of letters that we are going to
- randomly alter. This is between 1 and 3, but of
- course, no more than we actually have in the word.
- --->
- <cfset LOCAL.LetterAlterCount = Min(
- RandRange( 1, 3 ),
- LOCAL.Word.Length()
- ) />
-
-
- <!---
- Convert the word to a character array so that we
- can more easily pick and alter a random character.
- --->
- <cfset LOCAL.Word = LOCAL.Word.ToCharArray() />
-
-
- <!---
- Loop over the alter letter count to randomly
- alter the letter. NOTE: The way we are doing this,
- we may alter the same letter more than once. While
- this might not exactly do what we want it to do,
- it will still satisfy the 1-3 alter count. I figure
- since it cuts down on the work we have to do, then
- it is an acceptable *bug*.
- --->
- <cfloop
- index="LOCAL.CharIndex"
- from="1"
- to="#LOCAL.LetterAlterCount#"
- step="1">
-
- <!--- Get a random character to use. --->
- <cfset LOCAL.NewChar = LOCAL.CharSet.CharAt(
- JavaCast(
- "int",
- RandRange( 0, LOCAL.CharSet.Length() - 1)
- )
- ) />
-
-
- <!--- Get a random index to alter. --->
- <cfset LOCAL.IndexToAlter = RandRange(
- 0,
- (ArrayLen( LOCAL.Word ) - 1)
- ) />
-
-
- <!--- Use reflection to alter the character array. --->
- <cfset LOCAL.ReflectArray.Set(
- LOCAL.Word,
- LOCAL.IndexToAlter,
- ToString( LOCAL.NewChar ).CharAt( 0 )
- ) />
-
- </cfloop>
-
-
- <!---
- Store the altered word back into the word value at
- the appropriate index. In order to do this, we are
- going to convert the character array back into a
- "list" with no delimiters. This will make ColdFusion
- convert the char objects back to strings.
-
- Instead of storing the value back into our word
- array, let's skip the middle man and just store it
- back into our tokens array.
- --->
- <cfset LOCAL.ReflectArray.Set(
- LOCAL.Tokens,
- JavaCast(
- "int",
- (LOCAL.Words[ LOCAL.Index ].Index - 1 )
- ),
- ArrayToList( LOCAL.Word, "" )
- ) />
-
- </cfloop>
-
-
- <!---
- ASSERT: At this point, we have now randomly altered
- all of our words. Those words have been stored back
- into the token array.
- --->
-
-
- <!---
- Now that we have randomly altered our words, let's merge
- all of the words back into a buffer so that we can
- then merge the tags back in.
- --->
- <cfset LOCAL.Buffer = CreateObject( "java", "java.lang.StringBuffer" ).Init() />
-
-
- <!--- Loop over the tokens. --->
- <cfloop
- index="LOCAL.Index"
- from="1"
- to="#ArrayLen( LOCAL.Tokens )#"
- step="1">
-
- <!---
- Add token to the buffer. Since the original buffer
- was split on word boundaries, we don't have to
- worry about putting anything additional back into
- the buffer (word boundaries are meta-data, not
- character data).
- --->
- <cfset LOCAL.Buffer.Append(
- LOCAL.Tokens[ LOCAL.Index ]
- ) />
-
- </cfloop>
-
-
- <!---
- Now, loop over the tags and replace those back into
- the buffer.
- --->
- <cfloop
- index="LOCAL.Index"
- from="1"
- to="#ArrayLen( LOCAL.Tags )#"
- step="1">
-
- <!--- Get a shorthand for the tag. --->
- <cfset LOCAL.Tag = LOCAL.Tags[ LOCAL.Index ] />
-
- <!--- Replace the tag back into the buffer. --->
- <cfset LOCAL.Buffer.Replace(
- JavaCast( "int", LOCAL.Tag.Start ),
- JavaCast( "int", LOCAL.Tag.End ),
- LOCAL.Tag.HTML
- ) />
-
- </cfloop>
-
-
- <!---
- ASSERT: At this point, our buffer contains the updated
- content that we are using to mess with Bob. Now, all
- we have to do is return the updated text.
- --->
-
-
- <!---
- Although we don't *need* to do this, as we have already
- updated the content buffer, return the buffer string.
- --->
- <cfreturn LOCAL.Buffer.ToString() />
- </cffunction>
This solution is CRAZY complicated. It uses Java array reflection, string buffers, and lots of other neat stuff. If you can come up with a smaller, more elegant solution, I would love to know. My guess is that this is way more complicated than it has to be. But, as far as I can tell, it will ONLY alter the given words and absolutely nothing else in the content.
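The "only the words change" part comes from the masking step: every tag is swapped for an equal-length run of periods, so the offsets recorded for each tag are still valid when the tags are written back. Here is a minimal plain-Java sketch of just that step, using the same java.util.regex calls the function makes. The class name TagMaskSketch and its mask/restore helpers are made up for illustration, and the "Bob" to "B0b" swap is only a stand-in for the random alteration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TagMaskSketch {

    // Same tag pattern the function compiles (Java escapes the quotes that CFML doubles).
    static final Pattern TAG = Pattern.compile("(?i)</?[a-z](\"[^\"]*\"|[^>])*>");

    // One matched tag plus its offsets in the original text.
    static final class Tag {
        final String html;
        final int start;
        final int end;

        Tag(String html, int start, int end) {
            this.html = html;
            this.start = start;
            this.end = end;
        }
    }

    // Swap every tag for an equal-length run of periods, recording the tags as we go.
    static String mask(String text, List<Tag> tags) {
        Matcher matcher = TAG.matcher(text);
        StringBuffer buffer = new StringBuffer();
        while (matcher.find()) {
            tags.add(new Tag(matcher.group(), matcher.start(), matcher.end()));
            char[] dots = new char[matcher.group().length()];
            Arrays.fill(dots, '.');
            matcher.appendReplacement(buffer, new String(dots));
        }
        matcher.appendTail(buffer);
        return buffer.toString();
    }

    // Write each recorded tag back over its placeholder. This only works if the
    // text between the tags kept its original length.
    static String restore(String masked, List<Tag> tags) {
        StringBuilder buffer = new StringBuilder(masked);
        for (Tag tag : tags) {
            buffer.replace(tag.start, tag.end, tag.html);
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        List<Tag> tags = new ArrayList<>();
        String masked = mask("<p>My name is Bob.</p>", tags);
        // Stand-in for the random alteration: same length, different characters.
        String altered = masked.replace("Bob", "B0b");
        System.out.println(restore(altered, tags)); // <p>My name is B0b.</p>
    }
}
```

The catch is that the alteration has to be length-preserving. Character-for-character replacement is, which is why the recorded Start and End values can be reused as-is.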
For extra brownie points, notice that the Function defaults the passed-in text to the current content buffer ;)
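As for the Java array reflection in the function, here is roughly what those ReflectArray.Set() calls look like in plain Java. This is only a sketch. In Java itself you would just write chars[index] = c; the java.lang.reflect.Array call is shown because it mirrors what the CFML does, and it does not explain why ColdFusion can't assign to the array directly. The class name ReflectArraySketch and its two helpers are hypothetical.

```java
import java.lang.reflect.Array;

final class ReflectArraySketch {

    // Overwrite one character of a word through java.lang.reflect.Array.
    static String replaceCharAt(String word, int index, char replacement) {
        char[] chars = word.toCharArray();
        // For a primitive char[] the value must be a Character; Array.set unwraps it.
        Array.set(chars, index, Character.valueOf(replacement));
        return new String(chars);
    }

    // Overwrite one slot of a String[] (like the tokens array that split() returns).
    static void replaceToken(String[] tokens, int index, String value) {
        Array.set(tokens, index, value);
    }

    public static void main(String[] args) {
        System.out.println(replaceCharAt("coolest", 0, 'q')); // qoolest

        String[] tokens = { "I", " ", "am" };
        replaceToken(tokens, 2, "ar");
        System.out.println(String.join("", tokens)); // I ar
    }
}
```

Note that both indexes are zero-based, which is why the function subtracts 1 from the ColdFusion index before writing back into the tokens array.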
Reader Comments
Jesus. Do you accept my apology for this being WAY more than 5 minutes?? ;)
@Ray,
Hey man, it was a blast. And just cause it took me a long time doesn't mean that it is the correct solution by any stretch of the imagination. Plus, it was super excellent practice with some Java methods.
Plus, I know for a fact that my solution doesn't work for mid-word punctuation. For instance, since I am splitting on "\b", it treats "Ray's" as three words: "Ray" and "'" and "s". I kept trying to make the RegEx for the split better, but nothing seemed to work.... but then, the more I thought about it, I came to peace with it. I felt that it would be too hard to tell which punctuations were which. For instance, Ray's might be one word, but is "hard-hitting" one word or two? Because punctuation feels like such an "English" language construct, I didn't feel like tearing my hair out trying to parse things appropriately.
But regardless, it was a lot of Fun :D
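To make the "\b" issue concrete, here is a small plain-Java sketch that shows the split and one possible alternative: scan for whole words with a pattern that lets a single apostrophe or hyphen sit between word characters. That pattern is my assumption about what counts as a word (it treats "hard-hitting" as one word, which is exactly the judgment call described above), and the class name WordSplitSketch and its methods are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class WordSplitSketch {

    // Hypothetical policy: runs of word characters, optionally joined by
    // single apostrophes or hyphens.
    static final Pattern WORD = Pattern.compile("\\w+(?:['-]\\w+)*");

    // What the function does today: split on every word boundary.
    static String[] splitOnBoundaries(String text) {
        return text.split("\\b");
    }

    // Alternative: find whole words instead of splitting between them.
    static List<String> findWords(String text) {
        List<String> words = new ArrayList<>();
        Matcher matcher = WORD.matcher(text);
        while (matcher.find()) {
            words.add(matcher.group());
        }
        return words;
    }

    public static void main(String[] args) {
        // On Java 8 or later this prints: Ray|'|s
        System.out.println(String.join("|", splitOnBoundaries("Ray's")));
        // Prints: [Ray's, hard-hitting, post]
        System.out.println(findWords("Ray's hard-hitting post"));
    }
}
```

If something like this were used, matcher.start() and matcher.end() would give the offsets of each word, so the altered word could be written back into the buffer in place rather than by re-joining tokens.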



