
Embedding Foreign Characters In Your Content-Disposition Filename Header


Since English is my primary language, I sometimes don't realize that parts of my web applications don't play nicely with non-US-ASCII characters. Such is the case with the "Content-Disposition" header. I've been using it for years, but only found out last week that the "filename" portion of the header doesn't naturally handle characters outside of US-ASCII. Luckily, modern browsers support an extension to the Content-Disposition header (the RFC 5987 encoding) that allows for UTF-8 encoded characters.

With some Googling, I came across this page, which has a ton of test cases for the Content-Disposition header. Among the tests, it suggests a special notation that sends the standard "filename" parameter plus a second "filename*" parameter, which carries a UTF-8 filename with URL-encoded characters.

<!---
	Query for the files in the directory.
	NOTE: I am doing this so I don't have to embed high-ascii characters
	in the code - I don't think my blog has the proper support for UTF-8
	encoding? Not sure.
--->
<cfset files = directoryList( expandPath( "./" ), false, "name", "Data*" ) />

<!--- Isolate the file with foreign characters. --->
<cfset fileName = files[ 1 ] />
<cfset filePath = expandPath( fileName ) />

<!---
	By default, the filename portion of the Content-Disposition header
	only allows for US-ASCII values. In order to account for foreign /
	extended ASCII values, we have to jump through some funky notation.

	In this case, we are attempting to provide fallbacks. The first
	instance of "filename" is for browsers that do not support the RFC
	5987 encoding (they ignore the filename*= after the filename).
	Then, for browsers that DO support the encoding, they will pick
	up the UTF-8 encoding.

	Notice that the UTF-8 encoded value doesn't need to be quoted since
	the embedded spaces are url-encoded.
--->
<cfheader
	name="content-disposition"
	value="attachment; filename=""#fileName#""; filename*=UTF-8''#urlEncodedFormat( fileName )#"
	/>

<cfcontent
	type="text/plain; charset=utf-8"
	file="#filePath#"
	/>

In this case, the "filename*=UTF-8''" notation will be honored by modern browsers and ignored by older browsers, which will use the "filename" value as the fallback.

When I run the above code, I am prompted for a file download:

High-ascii values in the Content-Disposition header.

Notice that the foreign characters (French) are present. This seems to work on all my modern browsers, including the latest releases of IE.
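
If this notation is needed in more than one place, it could be wrapped in a small function. What follows is only a minimal sketch under a few assumptions: buildContentDisposition() is a hypothetical helper name (it is not part of the code above), and it assumes that swapping every character outside a small ASCII whitelist for an underscore is an acceptable fallback name for browsers that ignore "filename*". The explicit "utf-8" argument to urlEncodedFormat() is also an addition, so the encoding doesn't depend on the page's default character set.

```cfm
<cfscript>

	// Hypothetical helper, not from the original post: builds a
	// Content-Disposition value with an ASCII-only "filename" fallback
	// plus the RFC 5987 "filename*" parameter.
	function buildContentDisposition( required string fileName ) {

		// ASSUMPTION: for the plain "filename" fallback, replace anything
		// that isn't a letter, digit, dot, underscore, space, or dash with
		// an underscore. This also removes double quotes, which would
		// otherwise break the quoted value.
		var asciiName = reReplace( fileName, "[^a-zA-Z0-9._ -]", "_", "all" );

		// URL-encode the UTF-8 bytes of the original name.
		var encodedName = urlEncodedFormat( fileName, "utf-8" );

		return( "attachment; filename=""#asciiName#""; filename*=UTF-8''#encodedName#" );

	}

</cfscript>

<!--- Use the helper in place of the inline header value. --->
<cfheader
	name="content-disposition"
	value="#buildContentDisposition( fileName )#"
	/>
```

The trade-off is that older browsers would see underscores where the accented characters were, instead of whatever they make of the raw, unencoded bytes in the plain "filename" parameter.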

Want to use code from this post? Check out the license.

Ben Nadel