Using ColdFusion To Stream Files To The Client Without Loading The Entire File Into Memory
Posted May 14, 2008 at 3:24 PM
Just a really quick post here. In my previous post on creating semi-secure file downloads, Todd Rafferty brought up the idea that the reason we want to avoid CFContent is that it loads the entire file into memory before it flushes it to the browser. His point was that the problem is not so much that CFContent ties up threads, but rather that so much RAM gets tied up in the file load. Therefore, he suggested using a FileInputStream to read the file into memory incrementally, flushing each chunk to the browser as it goes. He pointed me to a post on RealityStorm.com, which is what he was basing his code on.
Anyway, I did something like this a long time ago based on what Christian Cantrell wrote (which is what RealityStorm.com was also referencing), but I haven't played around with it in a while (and mine didn't use an input stream, but rather a binary variable). So, I thought I would quickly take the code and port it over to a ColdFusion custom tag, smartcfcontent.cfm. This doesn't support all of the ColdFusion CFContent features, just the File and Type combination. For everything else, you would just want to use the CFContent tag directly.
The use of this ColdFusion tag (in CFModule format) would look like this:
- <!---
- Even when using the "smart" buffer, we can still use our
- standard ColdFusion header values.
- --->
- <cfheader
- name="content-disposition"
- value='attachment; filename="girls.png"'
- />
-
- <!---
- Stream file to browser without having to load the entire
- file into memory. This uses a 5 KB buffer to shuttle data
- from a file to the client.
- --->
- <cfmodule
- template="smartcfcontent.cfm"
- type="image/png"
- file="#ExpandPath( './girls.png' )#"
- />
Notice that you can still use the standard ColdFusion CFHeader tag to define your attachment type and suggested file name.
Here is the ColdFusion code behind this custom tag. Nothing revolutionary here; I am really just duplicating what others have done, but in my own style so that Todd Rafferty and I can compare notes:
- <!--- Param the tag attributes. --->
-
-
- <!---
- This is the mime type of the content that we are
- streaming to the browser.
- --->
- <cfparam
- name="ATTRIBUTES.Type"
- type="string"
- default="application/octet-stream"
- />
-
- <!---
- This is the expanded path of the file that will be
- streamed to the client.
- --->
- <cfparam
- name="ATTRIBUTES.File"
- type="string"
- />
-
-
- <!---
- Get a pointer to the response. We will need this to
- set the header values and finalize the data flush. To get
- this, we will have to go two levels deep - past the text
- output stream, to its underlying binary stream.
- --->
- <cfset THISTAG.Response = GetPageContext()
- .GetResponse()
- .GetResponse()
- />
-
-
- <!---
- Get a pointer to the underlying binary response stream
- of the current ColdFusion request.
- --->
- <cfset THISTAG.BinaryOutputStream = THISTAG.Response.GetOutputStream() />
-
-
- <!---
- We need to create a byte array that will be used to read
- in the input stream and then transfer the input stream to
- the output stream. ColdFusion has no native way to create a
- Java byte array, so we hack one by grabbing the byte array
- from a ColdFusion string.
-
- Here, we are using the underlying Java method to grab a
- byte array that is 5,120 bytes long (5 KB).
- --->
- <cfset THISTAG.ByteBuffer = RepeatString( "12345", 1024 )
- .GetBytes()
- />
-
-
- <!---
- Now, we need to create a file input stream so that we can
- read chunks of the file into memory as we stream it.
- --->
- <cfset THISTAG.FileInputStream = CreateObject(
- "java",
- "java.io.FileInputStream"
- ).Init(
- JavaCast( "string", ATTRIBUTES.File )
- )
- />
-
-
- <!---
- Before we start putting stuff in the buffer, let's
- turn off the auto-flushing mechanism so that we have
- full control.
- --->
- <cfset GetPageContext().SetFlushOutput(
- JavaCast( "boolean", false )
- ) />
-
-
- <!---
- Reset the buffer to make sure nothing else has built up
- prior to this tag.
- --->
- <cfset THISTAG.Response.ResetBuffer() />
-
-
- <!---
- Set the content type using the mime type that was passed
- in. This will give the browser information as to how to
- deal with the streamed content.
- --->
- <cfset THISTAG.Response.SetContentType(
- JavaCast( "string", ATTRIBUTES.Type )
- ) />
-
-
- <!---
- Now that we have all the elements in place, let's start
- reading in the file and moving it to the output buffer.
- We are going to keep doing this until we hit the
- end of the file.
- --->
- <cfloop condition="true">
-
- <!--- Read a chunk of the file into the byte buffer. --->
- <cfset THISTAG.BytesRead = THISTAG.FileInputStream.Read(
- THISTAG.ByteBuffer,
- JavaCast( "int", 0 ),
- JavaCast( "int", ArrayLen( THISTAG.ByteBuffer ) )
- ) />
-
-
- <!---
- Check to see if any bytes were read. If not, then we
- will have a -1 to denote that the end of the file has
- been reached.
- --->
- <cfif (THISTAG.BytesRead NEQ -1)>
-
- <!---
- Write the buffer to the output stream. We want to be
- careful only to write as many bytes as were read in.
- --->
- <cfset THISTAG.BinaryOutputStream.Write(
- THISTAG.ByteBuffer,
- JavaCast( "int", 0 ),
- JavaCast( "int", THISTAG.BytesRead )
- ) />
-
- <!--- Flush this new content to the client. --->
- <cfset THISTAG.BinaryOutputStream.Flush() />
-
- <cfelse>
-
- <!---
- We hit a (-1). We reached the end of the file. This
- is not the cleanest solution, but just break out
- of the loop.
- --->
- <cfbreak />
-
- </cfif>
-
- </cfloop>
-
-
- <!---
- ASSERT: At this point, we have fully read in the file,
- moved it to the binary output stream, and then flushed it
- to the client. Now, we just have to perform clean up work.
- --->
-
-
- <!---
- Reset the response. This will clear any remaining information
- in the buffer as well as any header information.
- --->
- <cfset THISTAG.Response.Reset() />
-
- <!---
- Close the file input stream to make sure we are not locking
- the file from further use.
- --->
- <cfset THISTAG.FileInputStream.Close() />
-
- <!---
- Close the output stream to make sure no other content is
- getting flushed to the browser.
- --->
- <cfset THISTAG.BinaryOutputStream.Close() />
-
-
- <!---
- Exit out of this tag to make sure it doesn't try to execute
- for a second time if someone made it self-closing.
- --->
- <cfexit method="exittag" />
This works quite nicely.
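For comparison, here is the same read-then-write loop sketched in plain Java, since the tag is really just driving java.io classes. This is only a sketch under a few assumptions: the ChunkedCopy class and its copy() helper are names made up for this example, a temp file stands in for the download, and a ByteArrayOutputStream stands in for the response's binary output stream. It also allocates the buffer directly with new byte[...] rather than borrowing one from a string, and it closes the file with try-with-resources so the handle is released even if the copy fails part way through.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

final class ChunkedCopy {

    // Hypothetical helper (not part of the tag): moves bytes from in to out
    // through one reusable buffer and returns the number of bytes copied.
    static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int bytesRead;
        // read() returns -1 once the end of the stream is reached.
        while ((bytesRead = in.read(buffer, 0, buffer.length)) != -1) {
            // Write only the bytes that were actually read on this pass.
            out.write(buffer, 0, bytesRead);
            out.flush();
            total += bytesRead;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the file being downloaded: 12,000 zero bytes.
        Path file = Files.createTempFile("chunked-copy", ".bin");
        Files.write(file, new byte[12000]);

        // Stand-in for the servlet response's output stream.
        ByteArrayOutputStream out = new ByteArrayOutputStream();

        // try-with-resources closes the file even if the copy throws.
        try (InputStream in = new FileInputStream(file.toFile())) {
            long copied = copy(in, out, 5 * 1024); // 5,120 bytes, the tag's buffer size
            System.out.println("Copied " + copied + " bytes");
        } finally {
            Files.delete(file);
        }
    }
}
```

However large the file is, only one buffer's worth of it (5,120 bytes here) is held in memory at a time, which is the whole point of the exercise.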
Reader Comments
Good job Ben. I'll have to compare it against what I have at home. I'd love to see some memory stats on this as well and see if it really does what we think it is doing.
P.S.: CFDev Team @ Adobe, you're not off the hook on this - I want CFContent fixed. :P
btw, should we inquire as to where you found a 5 meg 'girls.png' file? ;)
@Todd: Are you saying that <cfcontent file="/path/to/file"> internally fails to use a file buffer, and instead reads the entire file before transmitting it to the browser (like as if you did <cffile action="read"><cfoutput>#filecontent#</cfoutput>)?
If that's the case, this seems like an egregious bug in ColdFusion. Of what real value is <cfcontent file=...>? Although the docs don't say so explicitly, I always assumed that was its purpose.
@Eric : Yes, to the best of my knowledge, that's what I'm saying. It's been this way for a long time too. Someone from Adobe can come correct me at any time.
Having built 2 iterations of a document library, I have seen some strange things happening with memory when it comes to cfcontent. I have brought this up elsewhere and seen a lot of "ditto" responses that have led me to believe this. Don't get me wrong, CFContent is doing what it is supposed to be doing, streaming a file down to the user. It's just not doing what I'd like it to be doing, buffering. Perhaps there is additional threading involved when it comes to buffering, which is why I'd love to see stats on this.
So, someone can also feel free to step in and tell me why buffering would be bad in this scenario and why reading the whole thing would be more desirable. The only thing I can think of is cfcontent's deleteFile attribute. If you say yes, then it makes sense to read the whole thing into memory and delete the file. However, if I have no need to remove that file, then can't it be a little more memory friendly?
can anyone confirm that the behavior being presented is still present in cf8? remember that the file libraries got a complete overhaul.
this might be something to also pass along to the open blue dragon committee.
I'm setting up a test scenario now, will use FusionReactor to test if memory usage is over the top
@Ben: Cool idea to wrap this up in a custom tag! :-) I'm excited to try this out.
I just did a test in CF 8 using FusionReactor to watch memory while using <cfcontent type="application/x-zip-compressed" file="#ExpandPath('bigfile.zip')#">
Memory was at 141mb when the file started, memory was at 151 when the file transfer completed. The file was 302 meg, and the transfer took about 2 minutes. The memory graph never spiked, and never went over 151 meg, though it did clearly do a garbage collection in the middle of the transfer and drop down to 141 meg before rising gradually to 151 again.
Long story short: it looks like <cfcontent file="..."> uses buffering to transfer the file, it doesn't consume an inappropriate level of resources.
Okeedoke, I guess I stand corrected.
@Eric,
Awesome detective work! This is most excellent to see.
@Todd,
As long as we are in a better place than we were a few hours ago, I have to think this foray has been pretty successful :)
Meh. If anything I learned I need to get off my ass and look into FusionReactor to back up my claims or debunk myths. :)
Oh wait, that's why I didn't look into it. *cough*$299*cough*.
Ha ha :)
FusionReactor is just downright invaluable. It kicks the butt of CF's built-in server monitoring. It's so great to be able to look at your past requests, see how many queries they executed, see the longest-running queries, see how much memory that request consumed, identify scripts which are consistently slow-running, see the request/response headers after the fact, kill threads which are stuck for whatever reason, set yourself up for alerts, see what your memory/cpu/jdbc utilization is like over time, and if you buy the enterprise version, get a dashboard of the health of all your servers.
One of my favorite features is the ability to add traces to a request. These are bits of extra data which get stored in the request history, so you can click on that request and get bits of debug info (whatever you feel like putting in there). Where I work, we do a lot of interaction with SAP from ColdFusion. I set up for myself the ability to have it trace the inputs and outputs of each SAP call as a flag on our SAP interaction CFC. Then when a business user places an order and claims they're seeing the wrong thing, I can go into the request history, click on the request, and see what I passed to SAP to make sure my inputs are right, and also see what came back from SAP to see if the problem is on that side.
If you're doing personal development or small-scale development, it's probably not worth it, but if you're doing enterprise stuff, this tool is simply critical. It's saved me so much debug time when I can just peek at the request history to narrow down where problems are.
It's also critical for monitoring production instances. We have a big pile of production boxes, some of which have many instances of CF running on them. If we get a complaint about slowness, we can fire up the enterprise dashboard and see at a glance exactly where the problem is - whether it's a specific server, whether it's the database, whether one of the boxes has run out of memory, etc.
Download the trial and look up how to wrap your data sources with the FusionReactor wrapper, as well as how to do FRAPI traces. My guess is you'll be impressed.
@Eric, I don't doubt it. If I were in business for myself, I would be purchasing it. Since I'm only poking around my own server and such, it's a little steep for me (for the moment). I will say that I might suggest a copy of it for work, but I'm not sure they'll go for it (small shop).
@Todd - Yeah, it would be nice if they provided a developer-only version for free (such as if it only worked in Developer-mode CF installs) - they'd probably make a lot of sales that way =)
@eric and todd,
you do know that you can save the receipt and write the purchase off at the end of the year.
@Tony: If you're running your own business, it'd be the smart thing to do. I don't run my own business.
@Eric,
Have you tested this on CF 7? I'm curious to know if cfcontent streams on CF 8 only.
@Kurt:
I hadn't tested it on 7 when I posted earlier. I'm running a test now. 25 meg into the file, memory is actually down slightly from when I started - a garbage collection clearly ran (started at 111 mb, down to 104 mb now). It looks like CF7 also buffers.
It should be noted, I'm looking at used memory, not allocated memory. The difference is allocated memory is how much RAM the JVM has claimed for itself while used memory is how much of the allocated memory is in use by resident objects (it's always smaller than allocated memory). This is the real measure of actual usage, and has to be reported by the JVM, not by the operating system (FusionReactor shows you both).
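The used-versus-allocated distinction above can be checked from inside the JVM itself. The following is a minimal sketch using the standard java.lang.Runtime methods; the HeapSnapshot class and sample() method are names invented for this example, and it is only an assumption that these figures correspond to what FusionReactor graphs.

```java
final class HeapSnapshot {

    // Hypothetical helper: returns { usedBytes, allocatedBytes } for the JVM heap.
    static long[] sample() {
        Runtime runtime = Runtime.getRuntime();
        // Allocated: how much heap the JVM has currently claimed for itself.
        long allocated = runtime.totalMemory();
        // Used: the part of the allocated heap currently occupied by objects
        // (including garbage that has not been collected yet).
        long used = allocated - runtime.freeMemory();
        return new long[] { used, allocated };
    }

    public static void main(String[] args) {
        long[] snapshot = sample();
        long megabyte = 1024L * 1024L;
        System.out.println("Used heap (MB):      " + (snapshot[0] / megabyte));
        System.out.println("Allocated heap (MB): " + (snapshot[1] / megabyte));
    }
}
```

Sampling these two numbers before, during, and after a large download is a cheap way to see whether the used figure climbs with the file size or stays roughly flat.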
@Eric,
Thanks for the testing. Awesome stuff.
Hi,
Great to hear some feedback on FusionReactor. Our team is very active in the community and loves to hear feedback (negative & positive). I'll certainly pass on your comments.
You can see an online demo of FusionReactor from our website but the best way to see it is download and install it (time limited demo version, don't worry, it just switches off when your time limit expires or, you can buy a license) - I'm sure you'll be impressed and see what a valuable tool it is.
If you're heading to Scotch-on-the-Rocks (UK) or CFUnited (US) then come along and meet some of the team to see FusionReactor and the other tools in our Fusion product suite.
Thanks,
D




