Using ColdFusion To Stream Files To The Client Without Loading The Entire File Into Memory
Posted May 14, 2008 at 3:24 PM by Ben Nadel
Just a really quick post here. In my previous post on creating semi-secure file downloads, Todd Rafferty brought up the idea that the reason we want to avoid CFContent is because it loads the entire file into memory before it flushes it to the browser. He made the point that the problem is not so much the tying up of threads via CFContent, but rather the fact that so much RAM was tied up in the file load. Therefore, he raised the idea of using a FileInputStream to incrementally load the file into memory and then flush it to the browser. He pointed me to a post on RealityStorm.com, which is what he was basing his code on.
Anyway, I have done some stuff like that a long time ago based on what Christian Cantrell wrote (which is what RealityStorm.com was also referencing); but, I haven't played around with it in a while (and mine didn't use an input stream, but rather a binary variable). And so, I thought I would quickly take the code and port it over to a ColdFusion custom tag - smartcfcontent.cfm. This doesn't support all of the ColdFusion CFContent features, just the File and Type combination. For everything else, you would just want to use the CFContent tag directly.
The use of this ColdFusion tag (in CFModule format) would look like this:
- <!---
- Even when using the "smart" buffer, we can still use our
- standard ColdFusion header values.
- --->
- <cfheader
- name="content-disposition"
- value="attachment; filename='girls.png'"
- />
-
- <!---
- Stream file to browser without having to load the entire
- file into memory. This uses a 5 KB buffer to shuttle data
- from a file to the client.
- --->
- <cfmodule
- template="smartcfcontent.cfm"
- type="image/png"
- file="#ExpandPath( './girls.png' )#"
- />
Notice that you can still use the standard ColdFusion CFHeader tag to define your attachment type and suggested file name.
Here is the ColdFusion code behind this custom tag. Nothing revolutionary here; I am really just duplicating what others have done, but in my own style so that Todd Rafferty and I can compare notes:
- <!--- Param the tag attributes. --->
-
-
- <!---
- This is the mime type of the content that we are
- streaming to the browser.
- --->
- <cfparam
- name="ATTRIBUTES.Type"
- type="string"
- default="application/octet-stream"
- />
-
- <!---
- This is the expanded path of the file that will be
- streamed to the client.
- --->
- <cfparam
- name="ATTRIBUTES.File"
- type="string"
- />
-
-
- <!---
- Get a pointer to the response. We will need this to
- set the header values and finalize the data flush. To get
- this, we will have to go two levels deep - past the text
- output stream, to its underlying binary stream.
- --->
- <cfset THISTAG.Response = GetPageContext()
- .GetResponse()
- .GetResponse()
- />
-
-
- <!---
- Get a pointer to the underlying binary response stream
- of the current ColdFusion request.
- --->
- <cfset THISTAG.BinaryOutputStream = THISTAG.Response.GetOutputStream() />
-
-
- <!---
- We need to create a byte array that will be used to read
- in the input stream and then transfer the input stream to
- the output stream. Since ColdFusion doesn't have true
- arrays, we need to hack one by grabbing the byte array
- from a ColdFusion string.
-
- Here, we are using the underlying Java method to grab a
- byte array that is 5,120 bytes long (around 5 KB).
- --->
- <cfset THISTAG.ByteBuffer = RepeatString( "12345", 1024 )
- .GetBytes()
- />
-
-
- <!---
- Now, we need to create a file input stream so that we can
- read chunks of the file into memory as we stream it.
- --->
- <cfset THISTAG.FileInputStream = CreateObject(
- "java",
- "java.io.FileInputStream"
- ).Init(
- JavaCast( "string", ATTRIBUTES.File )
- )
- />
-
-
- <!---
- Before we start putting stuff in the buffer, let's
- turn off the auto-flushing mechanism so that we have
- full control.
- --->
- <cfset GetPageContext().SetFlushOutput(
- JavaCast( "boolean", false )
- ) />
-
-
- <!---
- Reset the buffer to make sure nothing else has built up
- prior to this tag.
- --->
- <cfset THISTAG.Response.ResetBuffer() />
-
-
- <!---
- Set the content type using the mime type that was passed
- in. This will give the browser information as to how to
- deal with the streamed content.
- --->
- <cfset THISTAG.Response.SetContentType(
- JavaCast( "string", ATTRIBUTES.Type )
- ) />
-
-
- <!---
- Now that we have all the elements in place, let's start
- reading in the file and moving it to the output buffer.
- We are going to keep doing this until we hit the
- end of the file.
- --->
- <cfloop condition="true">
-
- <!--- Read a chunk of the file into the byte buffer. --->
- <cfset THISTAG.BytesRead = THISTAG.FileInputStream.Read(
- THISTAG.ByteBuffer,
- JavaCast( "int", 0 ),
- JavaCast( "int", ArrayLen( THISTAG.ByteBuffer ) )
- ) />
-
-
- <!---
- Check to see if any bytes were read. If not, then we
- will have a -1 to denote that the end of the file has
- been reached.
- --->
- <cfif (THISTAG.BytesRead NEQ -1)>
-
- <!---
- Write the buffer to the output stream. We want to be
- careful only to write as many bytes as were read in.
- --->
- <cfset THISTAG.BinaryOutputStream.Write(
- THISTAG.ByteBuffer,
- JavaCast( "int", 0 ),
- JavaCast( "int", THISTAG.BytesRead )
- ) />
-
- <!--- Flush this new content to the client. --->
- <cfset THISTAG.BinaryOutputStream.Flush() />
-
- <cfelse>
-
- <!---
- We hit a (-1). We reached the end of the file. This
- is not the cleanest solution, but just break out
- of the loop.
- --->
- <cfbreak />
-
- </cfif>
-
- </cfloop>
-
-
- <!---
- ASSERT: At this point, we have fully read in the file,
- moved it to the binary output stream, and then flushed it
- to the client. Now, we just have to perform clean up work.
- --->
-
-
- <!---
- Close the file input stream first to make sure we are not
- locking the file from further use, even if a later clean up
- step fails.
- --->
- <cfset THISTAG.FileInputStream.Close() />
-
- <!---
- Reset the response only if nothing has been sent yet. Once
- content has been flushed, the response is committed and
- Reset() throws an IllegalStateException.
- --->
- <cfif NOT THISTAG.Response.IsCommitted()>
- <cfset THISTAG.Response.Reset() />
- </cfif>
-
- <!---
- Close the output stream to make sure no other content is
- getting flushed to the browser.
- --->
- <cfset THISTAG.BinaryOutputStream.Close() />
-
-
- <!---
- Exit out of this tag to make sure it doesn't try to execute
- for a second time if someone made it self-closing.
- --->
- <cfexit method="exittag" />
This works quite nicely.
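Since the tag is really just a thin wrapper around java.io classes, the same read/write loop can be sketched in plain Java. This is a minimal sketch, not ColdFusion's own implementation: the class and method names (ChunkedFileStreamer, copy, streamFile) are hypothetical, and the output stream stands in for the servlet response's binary stream.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration; not part of ColdFusion or the tag above.
final class ChunkedFileStreamer {

    // 5,120 bytes, the same size as the tag's byte buffer.
    static final int BUFFER_SIZE = 5 * 1024;

    // Copies everything from in to out through one reusable buffer and
    // returns the number of bytes copied.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[BUFFER_SIZE];
        long total = 0;
        int bytesRead;
        // read() returns -1 once the end of the stream has been reached.
        while ((bytesRead = in.read(buffer, 0, buffer.length)) != -1) {
            // Write only the bytes that were actually read on this pass.
            out.write(buffer, 0, bytesRead);
            out.flush();
            total += bytesRead;
        }
        return total;
    }

    // Streams a file to out. try-with-resources closes the file even if a
    // write fails part way through, so the file is never left open.
    static long streamFile(Path file, OutputStream out) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return copy(in, out);
        }
    }
}
```

Only one 5 KB buffer is ever held, no matter how large the file is, which is the whole point of the technique.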
Reader Comments
Good job Ben. I'll have to compare it against what I have at home. I'd love to see some memory stats on this as well and see if it really does what we think it is doing.
P.S.: CFDev Team @ Adobe, you're not off the hook on this - I want CFContent fixed. :P
btw, should we inquire as to where you found a 5 meg 'girls.png' file? ;)
@Todd: Are you saying that <cfcontent file="/path/to/file"> internally fails to use a file buffer, and instead reads the entire file before transmitting it to the browser (like as if you did <cffile action="read"><cfoutput>#filecontent#</cfoutput>)?
If that's the case, this seems like an egregious bug with ColdFusion, of what real value is <cfcontent file=...>? Although the docs don't say so explicitly I always assumed that was its purpose.
@Eric : Yes, to the best of my knowledge, that's what I'm saying. It's been this way for a long time too. Someone from Adobe can come correct me at any time.
Having built 2 iterations of a document library, I have seen some strange things happening with memory when it comes to cfcontent. I have brought this up elsewhere and seen a lot of "ditto" responses that have led me to believe this. Don't get me wrong, CFContent is doing what it is supposed to be doing, streaming a file down to the user. It's just not doing what I'd like it to be doing, buffering. Perhaps there is additional threading involved when it comes to buffering, which is why I'd love to see stats on this.
So, someone can also feel free to step in and tell me why buffering would be bad in this scenario and why reading the whole thing would be more desirable. The only thing I can think of is cfcontent's deleteFile attribute. If you say yes, then it makes sense to read the whole thing into memory and delete the file. However, if I have no need to remove that file, then can't it be a little more memory friendly?
can anyone confirm that the behavior being presented is still present in cf8? remember that the file libraries got a complete overhaul.
this might be something to also pass along to the open blue dragon committee.
I'm setting up a test scenario now, will use FusionReactor to test if memory usage is over the top
@Ben: Cool idea to wrap this up in a custom tag! :-) I'm excited to try this out.
I just did a test in CF 8 using FusionReactor to watch memory while using <cfcontent type="application/x-zip-compressed" file="#ExpandPath('bigfile.zip')#">
Memory was at 141mb when the file started, memory was at 151 when the file transfer completed. The file was 302 meg, and the transfer took about 2 minutes. The memory graph never spiked, and never went over 151 meg, though it did clearly do a garbage collection in the middle of the transfer and drop down to 141 meg before rising gradually to 151 again.
Long story short: it looks like <cfcontent file="..."> uses buffering to transfer the file, it doesn't consume an inappropriate level of resources.
Okeedoke, I guess I stand corrected.
@Eric,
Awesome detective work! This is most excellent to see.
@Todd,
As long as we are in a better place than we were a few hours ago, I have to think this foray has been pretty successful :)
Meh. If anything I learned I need to get off my ass and look into FusionReactor to back up my claims or debunk myths. :)
Oh wait, that's why I didn't look into it. *cough*$299*cough*.
Ha ha :)
FusionReactor is just downright invaluable. It kicks the butt of CF's built-in server monitoring. It's so great to be able to look at your past requests, see how many queries they executed, see the longest-running queries, see how much memory that request consumed, identify scripts which are consistently slow-running, see the request/response headers after the fact, kill threads which are stuck for whatever reason, set yourself up for alerts, see what your memory/cpu/jdbc utilization is like over time, and if you buy the enterprise version, get a dashboard of the health of all your servers.
One of my favorite features is the ability to add traces to a request. These are bits of extra data which get stored in the request history, so you can click on that request and get bits of debug info (whatever you feel like putting in there). Where I work, we do a lot of interaction with SAP from ColdFusion. I set up for myself the ability to have it trace the inputs and outputs of each SAP call as a flag on our SAP interaction CFC. Then when a business user places an order and claims they're seeing the wrong thing, I can go into the request history, click on the request, and see what I passed to SAP to make sure my inputs are right, and also see what came back from SAP to see if the problem is on that side.
If you're doing personal development or small-scale development, it's probably not worth it, but if you're doing enterprise stuff, this tool is simply critical. It's saved me so much debug time when I can just peek at the request history to narrow down where problems are.
It's also critical for monitoring production instances. We have a big pile of production boxes, some of which have many instances of CF running on them. If we get a complaint about slowness, we can fire up the enterprise dashboard and see at a glance exactly where the problem is - whether it's a specific server, whether it's the database, whether one of the boxes has run out of memory, etc.
Download the trial and look up how to wrap your data sources with the FusionReactor wrapper, as well as how to do FRAPI traces. My guess is you'll be impressed.
@Eric, I don't doubt it. If I were in business for myself, I would be purchasing it. Since I'm only poking around my own server and such, it's a little steep for me (for the moment). I will say that I might suggest a copy of it for work, but I'm not sure they'll go for it (small shop).
@Todd - Yeah, it would be nice if they provided a developer-only version for free (such as if it only worked in Developer-mode CF installs) - they'd probably make a lot of sales that way =)
@eric and todd,
you do know that you can save the receipt and write the purchase off at the end of the year.
@Tony: If you're running your own business, it'd be the smart thing to do. I don't run my own business.
@Eric,
Have you tested this on CF 7? I'm curious to know if cfcontent streams on CF 8 only.
@Kurt:
I hadn't tested it on 7 when I posted earlier. I'm running a test now. 25 meg into the file, memory is actually down slightly from when I started - a garbage collection clearly ran (started at 111 mb, down to 104 mb now). It looks like CF7 also buffers.
It should be noted, I'm looking at used memory, not allocated memory. The difference is allocated memory is how much RAM the JVM has claimed for itself while used memory is how much of the allocated memory is in use by resident objects (it's always smaller than allocated memory). This is the real measure of actual usage, and has to be reported by the JVM, not by the operating system (FusionReactor shows you both).
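For anyone without a monitoring tool, the used versus allocated distinction can be read straight from the JVM. The following is a minimal sketch in Java using the standard Runtime class; the HeapSnapshot class is a hypothetical name for illustration and has nothing to do with FusionReactor's own API.

```java
// Hypothetical helper for illustration; not FusionReactor's API.
final class HeapSnapshot {

    final long allocatedBytes; // heap the JVM has claimed for itself
    final long usedBytes;      // part of that heap occupied by objects

    HeapSnapshot(long allocatedBytes, long freeBytes) {
        this.allocatedBytes = allocatedBytes;
        // Used memory is whatever part of the allocated heap is not free.
        this.usedBytes = allocatedBytes - freeBytes;
    }

    // Reads the current figures from the running JVM.
    static HeapSnapshot take() {
        Runtime runtime = Runtime.getRuntime();
        return new HeapSnapshot(runtime.totalMemory(), runtime.freeMemory());
    }

    public static void main(String[] args) {
        HeapSnapshot snapshot = take();
        System.out.printf("allocated: %d MB, used: %d MB%n",
            snapshot.allocatedBytes / (1024 * 1024),
            snapshot.usedBytes / (1024 * 1024));
    }
}
```

Taking a snapshot before, during, and after a download gives a rough version of the test described above, though garbage collection will make individual readings jump around.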
@Eric,
Thanks for the testing. Awesome stuff.
Hi,
Great to hear some feedback on FusionReactor. Our team is very active in the community and love to hear feedback (negative & positive). I'll certainly pass on your comments.
You can see an online demo of FusionReactor from our website but the best way to see it is download and install it (time limited demo version, don't worry, it just switches off when your time limit expires or, you can buy a license) - I'm sure you'll be impressed and see what a valuable tool it is.
If you're heading to Scotch-on-the-Rocks (UK) or CFUnited (US) then come along and meet some of the team to see FusionReactor and the other tools in our Fusion product suite.
Thanks,
D
Great code! I've modified it to stream large BLOBs from a database and it's working beautifully. Thanks Ben!
Ran into an interesting problem though - it looks like the response reset
- <cfset THISTAG.Response.Reset() />
is throwing java.lang.IllegalStateException: Response has already been committed.
The exception is not visible to the user but it of course shows up in the exception log.
Has anyone else run into this?
@Mike, that's expected behavior. If you've used <cfcontent> to send content to the browser, that is not buffered for memory reasons (there is no practical limit to how much data you could stream down with this approach). The response being committed means that ColdFusion has already instructed the web server to start sending bytes to the browser. Once they've been sent down the line, you can't recall them.
@Mike,
This has been a while, but I don't remember getting any errors. As @Eric is saying, however, if you've flushed any content to the browser already, then you won't be able to reset. It'd be like trying to call CFContent twice in a row.
Do you have a CFFlush or something before the tag that's "committing" the request?
Hey Ben, not sure if you knew this or not but CF9 has a nasty bug of not being able to serve large files with CFContent (http://cfbugs.adobe.com/cfbugreport/flexbugui/cfbugtracker/main.html#bugId=83425).
I thought I had found a miracle work around with this posting, but it seems that using smartcfcontent, all files over 127 MB get truncated right at that point. Don't know if this has always happened, or if it's new and related to the CF9 CFContent bug, but that file size seems to be the same as where CFContent can't serve it anymore. Playing with the JVM memory settings doesn't seem to make any difference.
@Eric,
Hmmm, interesting behavior. I haven't served up large files before so have not run into this problem myself. I hope this gets solved in one of the dot-releases.
We never experienced cfcontent issues until we moved to CF9. I get the same errors using the above method as I do with cfcontent "Java heap space null". Some of the files on our file server can be as big as 750mb. Does anyone know if there is a way to make the max heap setting higher than 1024mb? If we make it higher, the CF service won't start back up. Thanks!
@Chad, yes, but only under a 64-bit OS. if you're running 32-bit, your limit is 1024. If you upgrade to 64-bit, you need to do a fresh install of CF - don't even just install overtop of your existing CF install or else CF will still be running in 32-bit mode.
Hey!
I'm using your code to read a file that is still being written, so I can begin the stream to the client before needing to wait for the file processing to finish.
Your code as is works great, but if the download speed exceeds the processing speed it will fail. Since I also write a log file when the processing is complete, I added a check for the log file's existence into the loop instead of looking for no new bytes, with a sleep to allow a little more processing to buffer if the log file's not present.
However, when all is said and done, there seems to be a lock on the file once the write is complete. I can still read the file, but can't delete it.
Do you think that since I open the file for reading while its still being written, the program writing it can't close its write handle? I don't really know how the handles are, er, handled as the file is being written by a pile of programs via pipes.. ffmpeg | sox | sox | lame | ffmpeg.
The fact that it works at all is actually outstanding. :D
Jay
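The approach Jay describes (keep reading a file that is still growing, and stop only once a completion marker exists) could look roughly like this in Java. It is a sketch under assumptions: the GrowingFileStreamer class, the marker path, and the poll interval are all hypothetical, and it relies on a file stream returning newly appended bytes on a later read after it has reported end of file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration; not the code Jay actually ran.
final class GrowingFileStreamer {

    // Streams file to out while another process may still be appending to
    // it. The copy ends only when doneMarker exists and no bytes are left.
    static long stream(Path file, Path doneMarker, OutputStream out, long pollMillis)
            throws IOException, InterruptedException {
        byte[] buffer = new byte[5 * 1024];
        long total = 0;
        try (InputStream in = Files.newInputStream(file)) {
            while (true) {
                // Check the marker before reading so bytes written just
                // before the marker appeared are still picked up.
                boolean finished = Files.exists(doneMarker);
                int bytesRead = in.read(buffer);
                if (bytesRead > 0) {
                    out.write(buffer, 0, bytesRead);
                    out.flush();
                    total += bytesRead;
                } else if (finished) {
                    break; // writer is done and the file is drained
                } else {
                    Thread.sleep(pollMillis); // give the writer time to catch up
                }
            }
        }
        return total;
    }
}
```

The try-with-resources block closes the read handle when the loop ends, so the reader at least does not hold the file open afterwards.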
@Jay, I don't know of any reason that reading a file will screw up locks on the same file by other processes. But it sounds like you're using a file as an intermediary between stream processing software (eg, ffmpeg) and ColdFusion. I haven't done this directly, but as far as I'm aware those programs can write their raw data to StdOut. If the data is discarded as soon as you stream it to the browser, you should be able to take advantage of that and send the data directly to the browser without it ever sitting on the disk.
You probably have to drop to Java processes, but you can do so entirely from the comfort of a ColdFusion environment. I've done real time command execution back to the browser a few years ago, what I have is probably not too far off from what you need. I talk about it on my long-extinct blog:
http://www.bandeblog.com/2008/03/real-time-command-execution-feedback/
When dealing with binary data, the hairy bits might be surrounding WriteOutput() calls, I'm not sure if you can do that with binary data or not, or how CF will even handle the data at this point. To do it right you need to start using buffered readers and such (the approach there would be a bit slow for really big chunks of data, such as ffmpeg tends to produce).
There's a way to do this, I've just never tried (though you've got me curious).
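As a rough idea of what "dropping to Java processes" might look like, here is a minimal sketch that pipes a child process's StdOut straight into an output stream, with no file on disk. The ProcessPipe class is a hypothetical name, the command is whatever the caller passes in, and the output stream stands in for the response's binary stream; none of this has been tried against ffmpeg.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.List;

// Hypothetical helper for illustration only.
final class ProcessPipe {

    // Runs command, copies its StdOut to out in 5 KB chunks, and returns
    // the process exit code.
    static int pipeStdout(List<String> command, OutputStream out)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        // Let StdErr go to the parent's StdErr so that a chatty child
        // cannot stall on a full, unread error pipe.
        builder.redirectError(ProcessBuilder.Redirect.INHERIT);
        Process process = builder.start();
        try (InputStream stdout = process.getInputStream()) {
            byte[] buffer = new byte[5 * 1024];
            int bytesRead;
            while ((bytesRead = stdout.read(buffer)) != -1) {
                // Raw bytes are passed through untouched, so binary data is safe.
                out.write(buffer, 0, bytesRead);
                out.flush();
            }
        }
        return process.waitFor();
    }
}
```

Because the bytes never pass through a string, this sidesteps the WriteOutput() concern for binary data, at the cost of not keeping a copy of the output for re-use.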
Interesting, I'll go through that code and see if there are any bits I can use, thank you!! I need to write the output of this process to a file for re-use since its very processor intensive. I haven't had a chance to really work at the problem yet, but I'll post back when I do.
On a side note, after many iterations of executing my processes with java based methods, I fell back to crappy ol' cfexecute to, of all things, a batch file. FFMpeg has an unusual problem in that all console output seems to be written to stdErr, which really sucks. I execute cmd.exe, call a batch file with all the programs I want to run afterwards, like:
route.bat c:\tools\ffmpeg.exe -i blah.flv -f wav - | lame.... etc
The batch file simply has:
echo %* <-- which is awesome
echo %errorlevel%
This allows me to look for the errorlevel in the cfexecute output (always in stdOut). I don't think errorLevel is otherwise available to cfexecute. That %* was new to me today.. it runs everything passed to it, getting over the 9 item variable limit in windoze command line.
K, too far off topic, just thought ppl here might be interested. :D
Jay
@Jay, @Eric,
I didn't even know that files could be both read from and written to at the same time. Very interesting.
So, I found the problem with locked files, not being able to delete them after using this code. This code:
<cfset THISTAG.Response.Reset() />
Generates this error in application.log:
"Error","jrpp-2","11/15/10","16:50:42",,"Response has already been committed The specific sequence of files included or processed is: <snip>\stream.cfm, line: 176 "
Thus, the file was simply not being closed. Hope this helps someone out there in internet land.
I tried the tag on Windows 2008 IIS 7, works GREAT for anything under 1GB download size. For larger files, it stops downloading right at 1GB. Any idea why? I've been pulling my hair out on this for a day now trying all of the usual searches for IIS 7 download limits. The problem happens for me in both Chrome and IE browsers so I'm fairly sure it's not a browser problem. Any ideas anybody?
@Jay, @Ben,
I have encountered the same error: "Response has already been committed". I'm not doing anything else with the file other than downloading to the client so the file being locked isn't a problem for me, but my log file is getting filled with this error message.
I commented out the offending line
<cfset THISTAG.Response.Reset() />
and everything still seems to work properly. Is it OK to remove this line?
@Ran
Did you ever find out the answer to this?
Yes, it works just fine for me.
@Will,
I haven't received a response, but the code seems to work just fine with that line omitted.
FWIW this appears to be fixed in CF901 hotfix 2. I have not verified yet as I don't run the affected server here at work (though I asked for them to install hotfix when they can).
Ack! I was referring to the ColdFusion bug 83425.



