Ben Nadel at CFUNITED 2010 (Landsdown, VA) with: Dutch Rapley

Which ASCII Characters Does urlEncodedFormat() Escape In ColdFusion

By Ben Nadel

urlEncodedFormat() is one of those functions that I've been using forever; but, when I stop and think about it, I'm not 100% sure what it actually does. I mean, I know that it prepares a value to be used in a URL; but I don't think I've ever actually read the documentation on it. And, I've definitely never experimented with it. As such, I thought I would do a little "note to self" blog post and see what actually happens when I apply urlEncodedFormat() to individual characters.

This experiment is simple - loop over each character, apply urlEncodedFormat(), and see if the resultant value is different. If so, it means that urlEncodedFormat() encoded the value.

<cfscript>

	// NOTE: Only going between 32 and 126 because urlEncodedFormat() appears to
	// encode all control characters as well as anything above 127 (inclusive).
	for ( i = 32 ; i <= 126 ; i++ ) {

		charValue = chr( i );
		escapedValue = urlEncodedFormat( charValue, "utf-8" );

		// If the two values don't match, it means that urlEncodedFormat() is
		// escaping the value.
		if ( compare( charValue, escapedValue ) ) {

			writeOutput( "#i# ... #charValue# ... #escapedValue#<br />" );

		}

	}

</cfscript>

I'm only looping from 32 to 126 because urlEncodedFormat() seems to encode all control characters (0-31, plus 127) and every character above 127. So, for the sake of the demo, I've limited it to the printable part of the basic ASCII set, where things are interesting.

When we run the above code, we get the following output:

32 ... ... %20
33 ... ! ... %21
34 ... " ... %22
35 ... # ... %23
36 ... $ ... %24
37 ... % ... %25
38 ... & ... %26
39 ... ' ... %27
40 ... ( ... %28
41 ... ) ... %29
42 ... * ... %2A
43 ... + ... %2B
44 ... , ... %2C
45 ... - ... %2D
46 ... . ... %2E
47 ... / ... %2F
58 ... : ... %3A
59 ... ; ... %3B
60 ... < ... %3C
61 ... = ... %3D
62 ... > ... %3E
63 ... ? ... %3F
64 ... @ ... %40
91 ... [ ... %5B
92 ... \ ... %5C
93 ... ] ... %5D
94 ... ^ ... %5E
95 ... _ ... %5F
96 ... ` ... %60
123 ... { ... %7B
124 ... | ... %7C
125 ... } ... %7D
126 ... ~ ... %7E

As you can see, urlEncodedFormat() escaped every non-alphanumeric character in the tested range, including punctuation like "-", "_", ".", and "~". Which is, ironically, exactly what the documentation says:

Generates a URL-encoded string. For example, it replaces spaces with %20, and non-alphanumeric characters with equivalent hexadecimal escape sequences. Passes arbitrary strings within a URL (ColdFusion automatically decodes URL parameters that are passed to a page).
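
To double-check that reading of the output, here is a minimal sketch (not part of the original experiment) that tests the "non-alphanumeric" rule directly instead of eyeballing the list. The variable names isEscaped and isAlphaNumeric are made up for illustration; the sketch assumes the same 32-126 range and the same "utf-8" argument as the loop above.

<cfscript>

	// Sketch only: flag any character in the 32-126 range that breaks the rule
	// "alphanumeric characters are left alone, everything else is escaped."
	for ( i = 32 ; i <= 126 ; i++ ) {

		charValue = chr( i );

		// A character counts as escaped if urlEncodedFormat() changed it.
		isEscaped = ( compare( charValue, urlEncodedFormat( charValue, "utf-8" ) ) != 0 );

		// A character counts as alphanumeric if it is A-Z, a-z, or 0-9.
		isAlphaNumeric = ( reFind( "^[A-Za-z0-9]$", charValue ) > 0 );

		// The rule is broken if an alphanumeric character was escaped or if a
		// non-alphanumeric character was left untouched.
		if ( ( isEscaped && isAlphaNumeric ) || ( ! isEscaped && ! isAlphaNumeric ) ) {

			writeOutput( "Exception to the rule: #i# ... #charValue#<br />" );

		}

	}

</cfscript>

Going by the output above, this should print nothing. A different ColdFusion version or CFML engine may behave differently, which is exactly what the check is meant to surface.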

Ok - this all makes sense now. My mental model has been updated.


Reader Comments

15,663 Comments

@Sean,

Oooh, most excellent suggestion. I actually haven't played around with any of the new encoding methods. I think those are all based on OWASP standards; but, not sure. I'll take a look, thanks!

2 Comments

Out of curiosity, I used the above code and swapped out urlEncodedFormat( charValue, "utf-8" ) for encodeForUrl( charValue ), and the results were the same except for chr(32). . .

14 Comments

There's an extensive character comparison of old vs. new HTML, XML, URL and JS encoders here:

http://damonmiller.github.io/esapi4cf/tutorials/Encoding.html
