
Which ASCII Characters Does urlEncodedFormat() Escape In ColdFusion

By Ben Nadel
Tags: ColdFusion

urlEncodedFormat() is one of those functions that I've been using forever; but, when I stop and think about it, I'm not 100% sure what it actually does. I mean, I know that it prepares a value to be used in a URL; but I don't think I've ever actually read the documentation on it. And, I've definitely never experimented with it. As such, I thought I would do a little "note to self" blog post and see what actually happens when I apply urlEncodedFormat() to individual characters.

This experiment is simple - loop over each character, apply urlEncodedFormat(), and see if the resultant value is different. If so, it means that urlEncodedFormat() encoded the value.

<cfscript>

	// NOTE: Only going between 32 and 126 because urlEncodedFormat() appears to
	// encode all control characters as well as anything at or above 127.
	for ( i = 32 ; i <= 126 ; i++ ) {

		charValue = chr( i );
		escapedValue = urlEncodedFormat( charValue, "utf-8" );

		// compare() returns a non-zero value when the two strings differ. If
		// they don't match, it means that urlEncodedFormat() is escaping the
		// value.
		if ( compare( charValue, escapedValue ) ) {

			writeOutput( "#i# ... #charValue# ... #escapedValue#<br />" );

		}

	}

</cfscript>

I'm only looping from 32 to 126 because urlEncodedFormat() seems to encode all control characters (most of which are 0-31) and all characters on or above 127. So, for the sake of the demo, I've limited it to the area of the basic ASCII set where things are interesting.

When we run the above code, we get the following output:

32 ... ... %20
33 ... ! ... %21
34 ... " ... %22
35 ... # ... %23
36 ... $ ... %24
37 ... % ... %25
38 ... & ... %26
39 ... ' ... %27
40 ... ( ... %28
41 ... ) ... %29
42 ... * ... %2A
43 ... + ... %2B
44 ... , ... %2C
45 ... - ... %2D
46 ... . ... %2E
47 ... / ... %2F
58 ... : ... %3A
59 ... ; ... %3B
60 ... < ... %3C
61 ... = ... %3D
62 ... > ... %3E
63 ... ? ... %3F
64 ... @ ... %40
91 ... [ ... %5B
92 ... \ ... %5C
93 ... ] ... %5D
94 ... ^ ... %5E
95 ... _ ... %5F
96 ... ` ... %60
123 ... { ... %7B
124 ... | ... %7C
125 ... } ... %7D
126 ... ~ ... %7E

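As you can see, urlEncodedFormat() escaped every non-alphanumeric character in the 32-126 range; the only values it left alone were the digits and the upper and lower case letters. Which is, ironically, exactly what the documentation says: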

Generates a URL-encoded string. For example, it replaces spaces with %20, and non-alphanumeric characters with equivalent hexadecimal escape sequences. Passes arbitrary strings within a URL (ColdFusion automatically decodes URL parameters that are passed to a page).
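
To make the "passes arbitrary strings within a URL" part a bit more concrete, here is a minimal sketch of encoding a value before appending it to a query string. The search term and the ./search.cfm path are hypothetical, made up for illustration, and urlDecode() is only standing in for the decoding that ColdFusion performs automatically on incoming URL parameters.

```cfml
<cfscript>

	// Hypothetical value, chosen to include a few characters from the table.
	searchTerm = "rock & roll?";

	// Encode the value before appending it to the query string.
	encodedTerm = urlEncodedFormat( searchTerm, "utf-8" );

	// Hypothetical target page, for illustration only.
	writeOutput( "./search.cfm?q=#encodedTerm#<br />" );

	// Decode the value again (simulating what happens on the receiving page)
	// and check that it matches the original string.
	decodedTerm = urlDecode( encodedTerm, "utf-8" );
	roundTripMatches = ( compare( searchTerm, decodedTerm ) == 0 );

	writeOutput( "Round trip matches: #roundTripMatches#" );

</cfscript>
```

Going by the table above, the space, the ampersand, and the question mark should come out as %20, %26, and %3F, so none of them can be mistaken for query string delimiters.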

Ok - this all makes sense now. My mental model has been updated.
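
One loose end: the loop skips everything outside of 32-126 on the assumption that urlEncodedFormat() always encodes those characters. A rough way to spot-check that assumption is sketched below. The isEscaped() helper and the handful of code points are hypothetical, picked for illustration and not part of the original experiment.

```cfml
<cfscript>

	// Hypothetical helper (not part of the original post): reports whether
	// urlEncodedFormat() changes the single character at the given code point.
	function isEscaped( required numeric codePoint ) {

		var charValue = chr( codePoint );

		return( compare( charValue, urlEncodedFormat( charValue, "utf-8" ) ) != 0 );

	}

	// A few arbitrary code points outside of the 32-126 range: tab, line feed,
	// carriage return, delete, e-acute, and the euro sign.
	codePoints = [ 9, 10, 13, 127, 233, 8364 ];

	for ( codePoint in codePoints ) {

		writeOutput( "#codePoint# ... #isEscaped( codePoint )#<br />" );

	}

</cfscript>
```

If the assumption holds, every line should report a truthy value. This only samples a few code points, so it is a sanity check and not a proof.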




Reader Comments

@Sean,

Oooh, most excellent suggestion. I actually haven't played around with any of the new encoding methods. I think those are all based on OWASP standards; but, not sure. I'll take a look, thanks!

Out of curiosity I used the above code and swapped out urlEncodedFormat( charValue, "utf-8" ) for encodeForUrl( charValue ) and the results were the same except for char(32). . .

@Tony,

Awesome - thanks for doing that. I'm surprised that it doesn't replace the space with a "+".

There's an extensive character comparison of old vs. new HTML, XML, URL and JS encoders here:

http://damonmiller.github.io/esapi4cf/tutorials/Encoding.html
