Ben Nadel
On User Experience (UX) Design, JavaScript, ColdFusion, Node.js, Life, and Love.
I am the chief technical officer at InVision App, Inc - a prototyping and collaboration platform for designers, built by designers. I also rock out in JavaScript and ColdFusion 24x7.

Escaping Form Values - Understanding The ColdFusion htmlEditFormat() Life Cycle

By Ben Nadel on
Tags: ColdFusion

When users can interact with your website by submitting content, it opens your site up to potential harm. In the worst case scenario, this might provide malicious hackers with a way to execute Cross-Site Scripting (XSS) attacks; in the best "bad" case scenario, it might simply mess up your site's layout and lead to a poor user experience. All of these outcomes can be avoided if you understand how data output gets rendered, where it needs to be escaped, and how to escape it.

When it comes to escaping output, there's no "one size fits all" solution. Rendering needs to be done on a per-use-case basis; if you escape everything, outputting meaningful content becomes impossible (imagine if all of your P tags were escaped); if you escape nothing, you leave your site open to attack. In addition to output strategies, you also have to think about how you want to store your data. Do you want to store escaped data? Does it matter that a single-character quote gets stored as the 6-character string, &quot;? Again, these are questions that you need to ask yourself as these situations present themselves.

ColdFusion provides a very easy way to escape output: htmlEditFormat(). The htmlEditFormat() function takes a string and converts all of the meaningful HTML characters within the string to their escaped counterparts:

  • < becomes &lt;
  • > becomes &gt;
  • & becomes &amp;
  • " becomes &quot;

NOTE: ColdFusion also provides xmlFormat() for creating XML-safe strings; this is like htmlEditFormat(), but it escapes a larger set of characters including high-ascii values.
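
As a minimal sketch of the four replacements listed above (the sample string and the variable name rawValue are made up for illustration):

```cfml
<!--- Hypothetical input containing all four characters that htmlEditFormat() escapes. --->
<cfset rawValue = 'Tom & "Jerry" <b>rock</b>' />

<cfoutput>
    <!---
        View the page source to see the escaped string:
        Tom &amp; &quot;Jerry&quot; &lt;b&gt;rock&lt;/b&gt;
    --->
    #htmlEditFormat( rawValue )#
</cfoutput>
```

In the browser, the escaped string displays exactly as it was typed; the B tag shows up as literal text rather than as bold formatting.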

To see how htmlEditFormat() can be used, let's start out with a bad situation and then take steps to correct it. For the following demo, I have an Application.cfc ColdFusion framework component that sets up a query object for our data set:

Application.cfc

  • <cfcomponent
  • output="false"
  • hint="I define the application settings and event handlers.">
  •  
  • <!--- Define the application settings. --->
  • <cfset this.name = hash( getCurrentTemplatePath() ) />
  • <cfset this.applicationTimeout = createTimeSpan( 0, 0, 10, 0 ) />
  •  
  • <!--- Define the request settings. --->
  • <cfsetting requesttimeout="10" />
  •  
  •  
  • <cffunction
  • name="onApplicationStart"
  • access="public"
  • returntype="boolean"
  • output="false"
  • hint="I initialize the application.">
  •  
  • <!---
  • Define the query cache. For this demo, this is going
  • to act as our datatable.
  • --->
  • <cfset application.girls = queryNew(
  • "id, name, age",
  • "cf_sql_integer, cf_sql_varchar, cf_sql_integer"
  • ) />
  •  
  • <!---
  • Keep track of GUID (globally unique ID) for our
  • "datatable" of girls.
  • --->
  • <cfset application.guid = 0 />
  •  
  • <!--- Return true so the page can process. --->
  • <cfreturn true />
  • </cffunction>
  •  
  •  
  • <cffunction
  • name="onRequestStart"
  • access="public"
  • returntype="boolean"
  • output="false"
  • hint="I initialize the request.">
  •  
  • <!--- Check to see if we need to reset the application. --->
  • <cfif structKeyExists( url, "reset" )>
  •  
  • <!--- Manually reset application. --->
  • <cfset this.onApplicationStart() />
  •  
  • </cfif>
  •  
  • <!--- Return true so the page can process. --->
  • <cfreturn true />
  • </cffunction>
  •  
  • </cfcomponent>

Here, I am setting up the query object, "application.girls," to act as our makeshift database table. I have also created a guid value as our database's "auto incrementing" key.

In the first part of the demo, I have created a page that lists out the girl records and provides us with a form for adding new girls. This version of the page uses no output escaping at all:

Demo 1: No Output Escaping

  • <!--- Param form values. --->
  • <cfparam name="form.id" type="numeric" default="0" />
  • <cfparam name="form.name" type="string" default="" />
  • <cfparam name="form.age" type="string" default="" />
  • <cfparam name="form.submitted" type="boolean" default="false" />
  •  
  • <!--- Create an error collection. --->
  • <cfset errors = [] />
  •  
  •  
  • <!--- ----------------------------------------------------- --->
  • <!--- ----------------------------------------------------- --->
  •  
  • <!---
  • Check to see if we have a URL-based ID for editing. If we have
  • an ID, we are going to query for the targeted girl and then
  • move her data into the form variables.
  • --->
  • <cfif structKeyExists( url, "id" )>
  •  
  • <!--- Query for the given girl. --->
  • <cfquery name="girl" dbtype="query">
  • SELECT
  • *
  • FROM
  • application.girls
  • WHERE
  • id = #val( url.id )#
  • </cfquery>
  •  
  • <!--- Store girl values into the form variables for editing. --->
  • <cfset form.id = val( girl.id ) />
  • <cfset form.name = girl.name />
  • <cfset form.age = girl.age />
  •  
  • </cfif>
  •  
  • <!--- ----------------------------------------------------- --->
  • <!--- ----------------------------------------------------- --->
  •  
  •  
  • <!--- Check to see if the form has been submitted. --->
  • <cfif form.submitted>
  •  
  • <!--- Validate form data. --->
  • <cfif !len( form.name )>
  •  
  • <cfset arrayAppend( errors, "Please enter a name." ) />
  •  
  • </cfif>
  •  
  • <cfif !reFind( "^\d+$", form.age )>
  •  
  • <cfset arrayAppend( errors, "Please enter a valid age." ) />
  •  
  • </cfif>
  •  
  •  
  • <!--- Check to see if we have any errors. --->
  • <cfif !arrayLen( errors )>
  •  
  • <!--- Check to see if we have an ID for an existing girl. --->
  • <cfif form.id>
  •  
  • <!---
  • Since we are editing an existing girl, let's just
  • remove the record from the data table so that we
  • can re-add it. This just saves us from having to
  • do "update" actions in this demo (which is not easy
  • in ColdFusion query of queries).
  • --->
  • <cfquery name="application.girls" dbtype="query">
  • SELECT
  • *
  • FROM
  • application.girls
  • WHERE
  • id != #form.id#
  • </cfquery>
  •  
  • <cfelse>
  •  
  • <!---
  • We have a new girl, so let's create a new ID for
  • the data table record we're about to create.
  • --->
  • <cfset form.id = ++application.guid />
  •  
  • </cfif>
  •  
  • <!--- Add a row to the query for our new/updated girl. --->
  • <cfset queryAddRow( application.girls ) />
  •  
  • <!--- Set the record values. --->
  • <cfset application.girls[ "id" ][ application.girls.recordCount ] = javaCast( "int", form.id ) />
  • <cfset application.girls[ "name" ][ application.girls.recordCount ] = javaCast( "string", form.name ) />
  • <cfset application.girls[ "age" ][ application.girls.recordCount ] = javaCast( "int", form.age ) />
  •  
  • <!--- Redirect to this page to display the new record. --->
  • <cflocation
  • url="#cgi.script_name#"
  • addtoken="false"
  • />
  •  
  • </cfif>
  •  
  • </cfif>
  •  
  •  
  • <!--- ----------------------------------------------------- --->
  • <!--- ----------------------------------------------------- --->
  •  
  •  
  • <cfoutput>
  •  
  • <!DOCTYPE html>
  • <html>
  • <head>
  • <title>ColdFusion's htmlEditFormat() Life Cycle</title>
  • </head>
  • <body>
  •  
  • <h1>
  • ColdFusion's htmlEditFormat() Life Cycle
  • </h1>
  •  
  •  
  • <!--- Check to see if there are any errors. --->
  • <cfif arrayLen( errors )>
  •  
  • <h3>
  • Please review the following:
  • </h3>
  •  
  • <ul>
  • <cfloop
  • index="error"
  • array="#errors#">
  •  
  • <li>#error#</li>
  •  
  • </cfloop>
  • </ul>
  •  
  • </cfif>
  •  
  •  
  • <form action="#cgi.script_name#" method="post">
  •  
  • <!--- Form submission flag. --->
  • <input type="hidden" name="submitted" value="true" />
  •  
  • <!--- Record ID for editing. --->
  • <input type="hidden" name="id" value="#form.id#" />
  •  
  •  
  • Name:
  • <input type="text" name="name" value="#form.name#" size="40" />
  •  
  • Age:
  • <input type="text" name="age" value="#form.age#" size="5" />
  •  
  • <input type="submit" value="Add Girl" />
  •  
  • </form>
  •  
  •  
  • <h2>
  • Girls
  • </h2>
  •  
  • <table border="1" cellpadding="5" cellspacing="3">
  • <tr>
  • <th>
  • ID
  • </th>
  • <th>
  • Name
  • </th>
  • <th>
  • Age
  • </th>
  • </tr>
  • <cfloop query="application.girls">
  •  
  • <tr>
  • <td>
  • <a href="#cgi.script_name#?id=#application.girls.id#"
  • >#application.girls.id#</a>
  • </td>
  • <td>
  • #application.girls.name#
  • </td>
  • <td>
  • #application.girls.age#
  • </td>
  • </tr>
  •  
  • </cfloop>
  • </table>
  •  
  • </body>
  • </html>
  •  
  • </cfoutput>

There's a good amount of code on this page, so here are the key take-aways:

  • The form submits back to itself (cgi.script_name).
  • The form values are output to the form inputs using #form.xyz# notation.
  • The query is output to the page without any escaping.

When we build our pages like this, we leave ourselves open to harm. One of the first things that you'll probably experience with this kind of output strategy is that using quotes in your form values will swiftly break your output. Imagine that I entered this as the girl's name:

[Image: ColdFusion's htmlEditFormat() Function Can Protect Your Site By Escaping Output.]

If I submit this form as-is, I will be taken back to the form with the message that the Age field is also required:

[Image: ColdFusion's htmlEditFormat() Function Can Protect Your Site By Escaping Output.]

While we can see that we have a form error, the bigger problem is that our original data value is completely corrupted. When the form re-rendered, the value, form.name, was output along with our HTML. The quote in the submitted form.name field ended up prematurely closing the input element's "value" attribute:

[Image: ColdFusion's htmlEditFormat() Function Can Protect Your Site By Escaping Output.]

As you can see, everything after the first quote in our field value becomes invalid (and ignored) HTML. This leaves us with only part of the value on our next form submission.
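
To make the breakage concrete, here is a sketch with a made-up name value (an assumption for illustration, not necessarily the value used in the demo):

```cfml
<!--- Hypothetical submitted value containing double quotes. --->
<cfset form.name = 'Kim "Kimmie" Smith' />

<cfoutput>
    <!---
        Without escaping, the generated markup is:
        <input type="text" name="name" value="Kim "Kimmie" Smith" size="40" />
        The browser ends the value attribute at the second double quote,
        so the field keeps only the text before that quote.
    --->
    <input type="text" name="name" value="#form.name#" size="40" />
</cfoutput>
```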

To get around this problem, the first step we can take is to use htmlEditFormat() in our input field rendering. In the following code, I am only going to show you the updated form markup as nothing else in the code has changed:

  • <form action="#cgi.script_name#" method="post">
  •  
  • <!--- Form submission flag. --->
  • <input type="hidden" name="submitted" value="true" />
  •  
  • <!--- Record ID for editing. --->
  • <input type="hidden" name="id" value="#form.id#" />
  •  
  •  
  • Name:
  • <input type="text" name="name" value="#htmlEditFormat( form.name )#" size="40" />
  •  
  • Age:
  • <input type="text" name="age" value="#htmlEditFormat( form.age )#" size="5" />
  •  
  • <input type="submit" value="Add Girl" />
  •  
  • </form>

Notice now that as I output the form values, I am passing each variable to the htmlEditFormat() function. This will take the quotes that we submitted with the form and replace them with their escaped counterpart: &quot;. With this new strategy in place, submitting the same form results in the following HTML markup:

[Image: ColdFusion's htmlEditFormat() Function Can Protect Your Site By Escaping Output.]

Notice now that the value attribute of our input field remains intact.

Using htmlEditFormat() in our form fields is a great way to allow HTML characters to be entered without breaking the input field markup; but, it only addresses half of the problem. When a form gets submitted with escaped HTML characters, the data that reaches the server arrives in a non-escaped format. As such, data stored in the database without any additional processing may contain valid HTML characters.

To demonstrate this, I'm going to go ahead and submit the previous form with a valid name and age. When I do this, I get the following page output:

[Image: ColdFusion's htmlEditFormat() Function Can Protect Your Site By Escaping Output.]

It looks as if everything worked properly. But, when we look at the page source behind this rendering, we see the following:

[Image: ColdFusion's htmlEditFormat() Function Can Protect Your Site By Escaping Output.]

As you can see, the quotes that we had escaped in the form (as &quot;) both went into the database and then came out of the record set as unescaped characters.

When the data only contains quotes, this isn't much of an issue. But, if the data contains other HTML markup, things can get very ugly very quickly. To demonstrate, I'm going to try and put some JavaScript directly into the Name field:

[Image: ColdFusion's htmlEditFormat() Function Can Protect Your Site By Escaping Output.]

Granted, we are escaping the form fields for form-based display; but, once submitted, the data goes into the database as unescaped values. As such, when the page is re-rendered (after form submission), I am presented with a malicious JavaScript behavior:

[Image: ColdFusion's htmlEditFormat() Function Can Protect Your Site By Escaping Output.]

If Ashton Kutcher were here, he'd be all like, "You just got PUNKED!" Of course, if it were a hacker, he'd probably be sending your users' cookie data (including login tokens) to his off-site database using an embedded IMG tag.

So what do we do about this? Well, there are two solutions to be considered: first, you can escape the data before it goes into the database. This could involve something simple like running htmlEditFormat() over your entire form collection before you execute your database INSERT statements. Of course, this might lead to double-escaping if the user goes to edit the record later on (since our edit form will be escaping already-escaped values). You can always remedy that secondary problem by unescaping the values before you render the edit form.
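
The double-escaping problem, and the unescape step that works around it, can be sketched as follows. Note that htmlUnEditFormat() is a hypothetical user-defined function written for this illustration, not a built-in, and the sample value is made up:

```cfml
<!---
    Hypothetical helper (not a built-in function): reverses the four
    replacements made by htmlEditFormat(). The &amp; replacement runs last
    so that a stored "&amp;lt;" comes back as "&lt;" and not as "<".
--->
<cffunction name="htmlUnEditFormat" access="public" returntype="string" output="false">
    <cfargument name="value" type="string" required="true" />

    <cfset var result = arguments.value />
    <cfset result = replace( result, "&lt;", "<", "all" ) />
    <cfset result = replace( result, "&gt;", ">", "all" ) />
    <cfset result = replace( result, "&quot;", '"', "all" ) />
    <cfset result = replace( result, "&amp;", "&", "all" ) />

    <cfreturn result />
</cffunction>

<!--- Escape once, on the way into the "database": Kim &quot;Kimmie&quot; Smith --->
<cfset stored = htmlEditFormat( 'Kim "Kimmie" Smith' ) />

<!--- Escaping the stored value again double-escapes it: Kim &amp;quot;Kimmie&amp;quot; Smith --->
<cfset doubleEscaped = htmlEditFormat( stored ) />

<!--- Unescaping first keeps the edit form value single-escaped: Kim &quot;Kimmie&quot; Smith --->
<cfset editValue = htmlEditFormat( htmlUnEditFormat( stored ) ) />
```

With the double-escaped value, the edit form would display the literal text &quot; to the user instead of a quote character.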

The second option would be to simply escape the content as you render the output. To do this, we need to wrap our variable evaluation in a call to the htmlEditFormat() function. To demonstrate this, I'm going to present the updated code for the entire demo, complete with htmlEditFormat() in the form as well as in the general output:

Demo 2: Input And Output Escaping

  • <!--- Param form values. --->
  • <cfparam name="form.id" type="numeric" default="0" />
  • <cfparam name="form.name" type="string" default="" />
  • <cfparam name="form.age" type="string" default="" />
  • <cfparam name="form.submitted" type="boolean" default="false" />
  •  
  • <!--- Create an error collection. --->
  • <cfset errors = [] />
  •  
  •  
  • <!--- ----------------------------------------------------- --->
  • <!--- ----------------------------------------------------- --->
  •  
  • <!---
  • Check to see if we have a URL-based ID for editing. If we have
  • an ID, we are going to query for the targeted girl and then
  • move her data into the form variables.
  • --->
  • <cfif structKeyExists( url, "id" )>
  •  
  • <!--- Query for the given girl. --->
  • <cfquery name="girl" dbtype="query">
  • SELECT
  • *
  • FROM
  • application.girls
  • WHERE
  • id = #val( url.id )#
  • </cfquery>
  •  
  • <!--- Store girl values into the form variables for editing. --->
  • <cfset form.id = val( girl.id ) />
  • <cfset form.name = girl.name />
  • <cfset form.age = girl.age />
  •  
  • </cfif>
  •  
  • <!--- ----------------------------------------------------- --->
  • <!--- ----------------------------------------------------- --->
  •  
  •  
  • <!--- Check to see if the form has been submitted. --->
  • <cfif form.submitted>
  •  
  • <!--- Validate form data. --->
  • <cfif !len( form.name )>
  •  
  • <cfset arrayAppend( errors, "Please enter a name." ) />
  •  
  • </cfif>
  •  
  • <cfif !reFind( "^\d+$", form.age )>
  •  
  • <cfset arrayAppend( errors, "Please enter a valid age." ) />
  •  
  • </cfif>
  •  
  •  
  • <!--- Check to see if we have any errors. --->
  • <cfif !arrayLen( errors )>
  •  
  • <!--- Check to see if we have an ID for an existing girl. --->
  • <cfif form.id>
  •  
  • <!---
  • Since we are editing an existing girl, let's just
  • remove the record from the data table so that we
  • can re-add it. This just saves us from having to
  • do "update" actions in this demo (which is not easy
  • in ColdFusion query of queries).
  • --->
  • <cfquery name="application.girls" dbtype="query">
  • SELECT
  • *
  • FROM
  • application.girls
  • WHERE
  • id != #form.id#
  • </cfquery>
  •  
  • <cfelse>
  •  
  • <!---
  • We have a new girl, so let's create a new ID for
  • the data table record we're about to create.
  • --->
  • <cfset form.id = ++application.guid />
  •  
  • </cfif>
  •  
  • <!--- Add a row to the query for our new/updated girl. --->
  • <cfset queryAddRow( application.girls ) />
  •  
  • <!--- Set the record values. --->
  • <cfset application.girls[ "id" ][ application.girls.recordCount ] = javaCast( "int", form.id ) />
  • <cfset application.girls[ "name" ][ application.girls.recordCount ] = javaCast( "string", form.name ) />
  • <cfset application.girls[ "age" ][ application.girls.recordCount ] = javaCast( "int", form.age ) />
  •  
  • <!--- Redirect to this page to display the new record. --->
  • <cflocation
  • url="#cgi.script_name#"
  • addtoken="false"
  • />
  •  
  • </cfif>
  •  
  • </cfif>
  •  
  •  
  • <!--- ----------------------------------------------------- --->
  • <!--- ----------------------------------------------------- --->
  •  
  •  
  • <cfoutput>
  •  
  • <!DOCTYPE html>
  • <html>
  • <head>
  • <title>ColdFusion's htmlEditFormat() Life Cycle</title>
  • </head>
  • <body>
  •  
  • <h1>
  • ColdFusion's htmlEditFormat() Life Cycle
  • </h1>
  •  
  •  
  • <!--- Check to see if there are any errors. --->
  • <cfif arrayLen( errors )>
  •  
  • <h3>
  • Please review the following:
  • </h3>
  •  
  • <ul>
  • <cfloop
  • index="error"
  • array="#errors#">
  •  
  • <li>#error#</li>
  •  
  • </cfloop>
  • </ul>
  •  
  • </cfif>
  •  
  •  
  • <form action="#cgi.script_name#" method="post">
  •  
  • <!--- Form submission flag. --->
  • <input type="hidden" name="submitted" value="true" />
  •  
  • <!--- Record ID for editing. --->
  • <input type="hidden" name="id" value="#form.id#" />
  •  
  •  
  • Name:
  • <input type="text" name="name" value="#htmlEditFormat( form.name )#" size="40" />
  •  
  • Age:
  • <input type="text" name="age" value="#htmlEditFormat( form.age )#" size="5" />
  •  
  • <input type="submit" value="Add Girl" />
  •  
  • </form>
  •  
  •  
  • <h2>
  • Girls
  • </h2>
  •  
  • <table border="1" cellpadding="5" cellspacing="3">
  • <tr>
  • <th>
  • ID
  • </th>
  • <th>
  • Name
  • </th>
  • <th>
  • Age
  • </th>
  • </tr>
  • <cfloop query="application.girls">
  •  
  • <tr>
  • <td>
  • <a href="#cgi.script_name#?id=#application.girls.id#"
  • >#application.girls.id#</a>
  • </td>
  • <td>
  • #htmlEditFormat( application.girls.name )#
  • </td>
  • <td>
  • #application.girls.age#
  • </td>
  • </tr>
  •  
  • </cfloop>
  • </table>
  •  
  • </body>
  • </html>
  •  
  • </cfoutput>

As you can see here, I'm using the htmlEditFormat() function both to render the form fields and to render the data list. With these updates in place, the previous attempt to embed malicious HTML in the data will be thwarted:

[Image: ColdFusion's htmlEditFormat() Function Can Protect Your Site By Escaping Output.]

Now, you might look at this and think to yourself that we've solved the problem. And, in many cases, we have; of course, as I said before, there's no solution that's right for every situation. If we simply escape all HTML in our content, we deprive users of the ability to use meaningful markup like BOLD and ITALIC tags. In many cases, the best solution ends up being a combination of the two approaches.
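
One way to keep a little markup is the escape-then-whitelist idea that comes up in the comments: escape everything, then restore a short list of harmless tags. This is only a sketch of that technique, not something the demo code does; the sample content and the tag list are assumptions:

```cfml
<!--- Hypothetical user content mixing harmless formatting with a script tag. --->
<cfset rawContent = 'I <b>really</b> mean it <script>alert( 1 )</script>' />

<!--- Step 1: escape everything. --->
<cfset safeContent = htmlEditFormat( rawContent ) />

<!---
    Step 2: restore only attribute-free B, I, EM, and STRONG tags. The
    pattern matches the escaped form of a tag, such as &lt;b&gt; or &lt;/b&gt;.
--->
<cfset safeContent = reReplaceNoCase(
    safeContent,
    "&lt;(/?)(b|i|em|strong)&gt;",
    "<\1\2>",
    "all"
    ) />

<!---
    safeContent is now:
    I <b>really</b> mean it &lt;script&gt;alert( 1 )&lt;/script&gt;
--->
<cfoutput>#safeContent#</cfoutput>
```

Because the pattern only matches tags with no attributes, something like an onclick handler on a B tag stays escaped.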

ColdFusion's htmlEditFormat() function is awesome. The key to getting the most out of it, however, is in understanding the form-data-display life cycle and where HTML escaping needs to be used. I hope that this has shed some light on the matter and has given you, at least, something to consider.

As far as security goes, this is just the tip of the iceberg. If you want to know more about application security, I would highly recommend reading Jason Dean's blog and Pete Freitag's blog. These guys have a head for security and a passion for enforcing it.




Reader Comments

I've started moving my students from htmlEditFormat over to xmlFormat. As you said, it catches more characters, but it's also useful for pure XML and is faster to type. I haven't been able to come up with a compelling reason to stick with htmlEditFormat.

Reply to this Comment

@Rick,

You raise a good point. I am not sure I can think of a good reason to use htmlEditFormat() over xmlFormat(). I thought that I remembered the browser having trouble with the escaped apostrophes; but, I just tested it and it worked fine. I guess you're right.

Reply to this Comment

Great post Ben.

I agree with Rick. There really is no reason not to use XMLFormat() over HTMLEditFormat() and there is a situation where you actually need the protection of XMLFormat(), because it escapes single quotes.

When using untrusted input inside of HTML attribute values that are enclosed in single quotes, it would be possible to break out of the HTML context with a simple ' > (quote & greater than). So those misfits who use single quotes for their HTML attribute values would not be protected if using HTMLEditFormat().

I will point out that in regular HTML blocks, or in HTML attribute values that are surrounded with double-quotes, HTMLEditFormat() should be just fine and there is no reason for anyone to rush out and try to replace all instances of HTMLEditFormat() with XMLFormat().

One last thing I will point out, because a lot of developers do not realize this: neither function will protect you in any context other than within an HTML element block or within a Non-JS event handler HTML attribute value. Do NOT try to use these values in dynamic JavaScript or CSS. They will not help you. I will be posting more details on this on my blog in the near future.
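
A minimal sketch of the single-quoted attribute case described above, using a made-up untrusted value and field name:

```cfml
<!--- Hypothetical untrusted value that tries to close a single-quoted attribute. --->
<cfset untrusted = "x' onmouseover='alert( 1 )" />

<cfoutput>
    <!---
        htmlEditFormat() leaves single quotes alone, so the generated markup is:
        <input type='text' name='nick' value='x' onmouseover='alert( 1 )' />
    --->
    <input type='text' name='nick' value='#htmlEditFormat( untrusted )#' />

    <!---
        xmlFormat() escapes each single quote as &apos;, which keeps the
        whole value inside the attribute.
    --->
    <input type='text' name='nick' value='#xmlFormat( untrusted )#' />
</cfoutput>
```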

Reply to this Comment

Nice post. I've been using HtmlEditFormat(), but will try out the XMLFormat() function. That might be helpful when working with those extra characters that usually get saved when users paste from MS Word.

In my current project, we have a great deal of focus on protecting against XSS, but we also have a need for RTF, using CK Editor. My solution is to be extremely strict with the request input, and escape everything. Then for those elements which require RTF output, I use regular expressions to allow specific tags through (For example: , , <br>, <p>,....). Then I also use some of the CK Editor properties to help limit the tags on the client side.

This solution works great, but when I add other tags, or attributes, to the "allow" list, the code gets pretty nasty. Has anyone ever tried this sort of thing? If so, any suggestions to a prettier solution?

Reply to this Comment

@Greg,

It definitely gets tricky when you have to allow some formatting, but also need to protect against malicious code. I've seen allll kind of stuff done in presentations - some hacks are super clever.

I think you have a good approach there - escape everything and then unescape white-listed elements. The pattern matching just gets a bit fun since you're matching longer strings.

Reply to this Comment

Hi Ben,

Another post filled with gold. Thanks!

I agree that it's certainly important to filter everything when it's output on the page, and HtmlEditFormat and XmlFormat are both great options.

But, it's tough to guarantee that you.. I mean, uh, your co-workers... will remember to do this for every output on the site. So, I like to add input filters into my database code, so that everything is cleaned BEFORE going into the database -- in addition to sweeping all outputs for XmlFormat() calls.

It's sort of like the Spartans defending Greece in the movie "300" -- you can get a lot of leverage by finding a choke point that you filter all of the inputs through. So, I like to pass all variables through this filter first, and then be more certain that they're safe throughout the application.

Reply to this Comment

@Ben,

There's also a performance benefit to escaping on database insert since it only needs to be done ONCE - when inserting. When you escape on output, this needs to be done every time you output (assuming no caching). That can be a lot of string manipulation! And, as you say, you can sort of cover your bases by not relying on other people to make sure the code is written well.

Reply to this Comment

Hi Ben,

I usually also turn on scriptprotect in CF-machines. It ensures that script-tags are filtered as <invalidtag> when put into a database or displayed thru a form. Railo does this out of the box BTW.

xmlFormat() is a lot better than htmlEditFormat(), as you also stated in the comments. But what I really liked about this step-by-step explanation is that you actually wrote it up. Maybe also something for the CF Cookbook?

Reply to this Comment

Just to state the obvious: NEVER rely solely on scriptprotect in either Railo or Adobe CF. ALWAYS sanitize your input ;-)

Reply to this Comment

As for escaping high ASCII, UTF-8 ought to handle that correctly, and the default output charset of ColdFusion is UTF-8.

But we ran into a ton of problems with people cutting and pasting data from Word documents, which aren't in UTF-8. They're in some Windows proprietary charset that refuses to die. We can't really tell people not to do it that way, because some of them are VIPs who expect us to be able to handle anything they throw at us, even if it's unreasonable.

So I came up with a BetterXMLFormat UDF that uses Numeric Entity Encoding (ampersand, poundsign, optional x for when number is hex, number in Unicode, semicolon) whenever the numeric value of the character in Unicode is greater than 126. Internally, CF strings are UTF-16, so the numeric value is Unicode, so Numeric Entity Encoding in decimal is really easy. Of course, BetterXMLFormat also escapes the big 5 low ASCII characters that XMLFormat escapes.

After doing that, I don't think it matters anymore whether you use the meta tag to tell the browser that the encoding is UTF-8. We're forcing all of our developers to say UTF-8 anyway, just because it's true, but if all of the characters on the page are low ASCII anyway, how could it possibly matter?

Yeah, it takes up more characters than UTF-8, but that generally happens only rarely (esp., Hispanic names), and we've never had a problem since.

Hope this helps.
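
A rough sketch of the numeric-entity approach described above. The function name numericEntityFormat() is hypothetical, and this is not the actual BetterXMLFormat UDF. As a simplification, it also writes the five special low ASCII characters as numeric entities rather than named ones, and it works one UTF-16 unit at a time, so characters outside the Basic Multilingual Plane would need extra handling:

```cfml
<!---
    Hypothetical UDF: encodes the five special characters (" & ' < >) and
    every character above 126 as a decimal numeric entity.
--->
<cffunction name="numericEntityFormat" access="public" returntype="string" output="false">
    <cfargument name="value" type="string" required="true" />

    <cfset var result = "" />
    <cfset var i = 0 />
    <cfset var ch = "" />
    <cfset var code = 0 />

    <cfloop index="i" from="1" to="#len( arguments.value )#">

        <cfset ch = mid( arguments.value, i, 1 ) />
        <cfset code = asc( ch ) />

        <!--- 34 = ", 38 = &, 39 = ', 60 = <, 62 = > --->
        <cfif (code gt 126) or listFind( "34,38,39,60,62", code )>
            <cfset result = result & "&##" & code & ";" />
        <cfelse>
            <cfset result = result & ch />
        </cfif>

    </cfloop>

    <cfreturn result />
</cffunction>

<!--- chr( 233 ) is a lowercase e with an acute accent; the result is Jos&#233; --->
<cfset encodedName = numericEntityFormat( "Jos" & chr( 233 ) ) />
```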

Reply to this Comment

@Sebastiaan,

Yeah, from what I've seen, ScriptProtect has a lot of vulnerabilities. I believe you can also change the regular expression that is used by ColdFusion. Speaking of Foundeo, let me give a shout out to Pete's Hack my CF service:

http://hackmycf.com

... which helps you locate vulnerabilities in your ColdFusion servers.

@Steve,

UTF-8 is just an interesting journey :) I've run into serious problems where I had some Windows encoding in my Database table and my pages had UTF-8 output. I kept getting [?] kind of outputs. Of course, if I switched the database to UTF-8, suddenly, it would show up fine in the page, but not in the forms.... it took a lot of going back and forth before I finally got UTF-8 encoded across the board from database to page.

Also, pasting from MS Word is just evil :)

Reply to this Comment

How will cfqueryparam help in this aspect? As you are not protecting the INSERT statement, maybe cfqueryparam will escape and handle special chars?

Reply to this Comment

@Ken,

You should always be using CFQueryParam, no question. However, the problem here is not necessarily with the INSERT statements, but with the way the output of the stored data renders in the HTML. If I have double-quotes in my input value, there's nothing "malicious" about it as far as SQL goes; CFQueryParam won't do anything special to the value. It's the output of said value that can mess things up.

Reply to this Comment

Hey Ben!
Do you know of an easy way to go from escaped html characters (the result of htmlEditFormat) back to html?

&lt;p&gt;Hello World&lt;/p&gt;

to

<p>Hello World</p>

Reply to this Comment

Just thinking.... instead of allowing some HTML input characters wouldn't it be nice, if you could add markdown (http://daringfireball.net/projects/markdown/) to input fields and have some server magic outputting it into HTML markup?

So my **bold text** input would be output as bold text, and I could disallow all <tags> altogether.

Reply to this Comment

Very useable blogpost.

I have one question though:
How would I use XMLFormat on input fields, whose default value I need to populate from a database. For example in an edit-user-account setting, I would pull user info from the database and display it in the input fields, for the user to edit.

But doing it like this does not work:

  • <input type="text" value="#XMLFormat( #somevalue# )#" name="username" />

because of the variable hashes and some value not being a form.field.

Is it not possible to use XMLFormat in this (not out of the ordinary) situation?

Cheers!

Sven

Reply to this Comment
