Recently, I've been using Pusher to send realtime data to the client over native HTML5 WebSockets. This has been a lot of fun; but, in order to get a real sense of just how good Pusher makes my web development life, I wanted to see what kind of pain I would experience if I tried to use older realtime techniques like long polling. In long polling, the client makes a request to the server as it normally would; however, rather than simply responding with data and then closing the connection, the server holds the connection open and flushes data to the client whenever it becomes available. The client then monitors this open connection for any updates to the responseText buffer. I wanted to see if I could implement this kind of communication channel using jQuery and ColdFusion.
I went looking for the pain of long polling and it is certainly pain that I found. Even after hours and hours of going over this code, I simply could not get long polling to work consistently. Some page requests seemed to work fine while others failed miserably. Some chunks of data were flushed to the browser; other chunks of data mysteriously disappeared into the ether of HTTP connections. I present this code not as an example of how things can be done, but rather, as a demonstration of why services like Pusher need to exist!
This experiment was meant to be as simple as I could possibly make it. My Application.cfc ColdFusion framework component set up an application-cached array of messages:
<cfcomponent
	output="false"
	hint="I define the application settings and event handlers.">
	<!--- Define the application settings. --->
	<cfset this.name = hash( getCurrentTemplatePath() ) />
	<cfset this.applicationTimeout = createTimeSpan( 0, 0, 10, 0 ) />
	<cffunction
		name="onApplicationStart"
		access="public"
		returntype="boolean"
		output="false"
		hint="I initialize the application.">
		<!--- Init the application message log as an empty array. --->
		<cfset application.messages = [] />
		<!--- Return true so the page can process. --->
		<cfreturn true />
	</cffunction>
</cfcomponent>
This array - application.messages - could then be added to using a separate ColdFusion page:
Add.cfm - Adds Messages To The Queue
<!---
	Lock access to the message queue since we are adding to it here
	and clearing it out on the long-poll page. We don't want to get
	some race condition between these two pages.
--->
<cflock name="messageQueue" type="exclusive" timeout="5">
	<!--- Create the message. --->
	<cfset message = "Hey there, it's now #timeFormat( now(), 'hh:mm:ss TT' )#." />
	<!--- Add a message to the collection. --->
	<cfset arrayAppend( application.messages, message ) />
	<!--- Output the message - for debugging purposes only. --->
	<cfoutput>
		#message#
	</cfoutput>
</cflock>
The client would then make a request to the long polling page, poll.cfm. This long polling page would then hold the connection with the client open. Periodically, the long polling page would check for updates to the cached message queue; and, if there were messages present, it would flush them to the client (as JSON) and clear the message queue.
Poll.cfm - The Long Polling Page
<!---
	Set a larger request timeout so we can keep the long polling
	going, which will allow us to send on-demand data down to the
	user over the wire.
--->
<cfsetting requesttimeout="#(60 * 1)#" enablecfoutputonly="true" />
<!---
	At some point this long-poll page is going to timeout. Wrap it
	in a try/catch so we can cleanly handle that timeout.
--->
<cftry>
	<!---
		Just start looping indefinitely. This will allow us to
		periodically check to see if information needs to be
		flushed to the client.
	--->
	<cfloop condition="true">
		<!---
			Lock access to our message queue since we are clearing
			it here and adding to it on another page. I want to make
			sure I don't get some odd race condition.
		--->
		<cflock name="messageQueue" type="exclusive" timeout="5">
			<!---
				Check to see if we have any messages that have yet
				to be flushed to the client.
			--->
			<cfif arrayLen( application.messages )>
				<!--- Output the serialized data to the client. --->
				<cfoutput>
					#serializeJSON( application.messages )#
				</cfoutput>
				<!---
					Clear the message queue so we don't flush
					duplicate entries to the client.
				--->
				<cfset application.messages = [] />
			</cfif>
			<!---
				Output our data chunk delimiter and a line break so
				the client knows how to parse valid JSON packets and
				so we always have something to flush to the client.
			--->
			<cfoutput>::DATA::#chr( 13 )##chr( 10 )#</cfoutput>
			<!--- Flush the content to the client. --->
			<cfflush interval="1" />
		</cflock>
		<!---
			Allow the thread to sleep for a moment so we don't
			constantly hammer the client with flushing. Also, this
			gives the server a small rest and provides an
			opportunity for data to actually change on the server.
		--->
		<cfthread action="sleep" duration="250" />
	</cfloop>
	<!--- Catch the page timeout. --->
	<cfcatch>
		<!---
			Simply abort - there's nothing that we can do at this
			point. The client will have to make a subsequent request
			for another long-poll connection.
		--->
		<cfabort />
	</cfcatch>
</cftry>
As you can see, the long polling page enters an indefinite loop - this is how it maintains the connection with the client. Inside each loop iteration, it checks for queued messages and then sleeps the request for 250 milliseconds. If it finds queued messages, it serializes them as JSON and flushes them over the connection to the client. It then becomes the client's responsibility (as you'll see below) to monitor the connection and parse individual data flushes as valid JSON packets.
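As a rough illustration of that client-side parsing, here is a minimal JavaScript sketch. It is not the jQuery code from the experiment; the helper name readNewChunks and the offset bookkeeping are hypothetical, and it assumes the response buffer has exactly the shape that poll.cfm produces (JSON arrays separated by the ::DATA:: delimiter).

```javascript
// Hypothetical helper: pull the newly-arrived, complete chunks out of a
// growing responseText buffer. "::DATA::" is the delimiter that poll.cfm
// flushes after every loop iteration.
var DELIMITER = "::DATA::";

function readNewChunks( buffer, offset ) {

	var messages = [];
	var nextOffset = offset;
	var delimiterIndex = buffer.indexOf( DELIMITER, nextOffset );

	// Only consume text that is followed by a delimiter. Anything after the
	// last delimiter may be a partial flush, so it is left for the next call.
	while ( delimiterIndex !== -1 ) {

		var chunk = buffer.slice( nextOffset, delimiterIndex ).trim();

		// Iterations with no queued messages produce empty chunks; skip them.
		if ( chunk.length ) {
			messages = messages.concat( JSON.parse( chunk ) );
		}

		nextOffset = ( delimiterIndex + DELIMITER.length );
		delimiterIndex = buffer.indexOf( DELIMITER, nextOffset );

	}

	return { messages: messages, offset: nextOffset };

}
```

A caller would keep the returned offset and pass it back in, along with the latest responseText, each time the request reports progress (for example, from an onreadystatechange handler while readyState is 3). Whether the browser actually exposes partial responseText at that point is a separate question that this sketch does not address.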
If you watch the video above, you can see that this approach is really hit or miss. Sometimes, it appears to work nicely; other times, it's a complete failure. But that's OK - that is exactly what I wanted this experiment to be; I wanted to see how old-school approaches like long polling result in pain such that I might further appreciate the ease and power provided by HTML5 WebSockets. I know that this was my first time trying long polling; and, I am sure that professional solutions like BlazeDS are able to create consistent functionality with long polling. But, with the realtime usability that services like Pusher provide, I am not sure at this point why I would ever want to use long polling.
I have a simple demo I wrote a while back with basic chat functionality using ColdFusion/jQuery. I found it was surprisingly stable, although I never really did anything with it.
Not so much as a plug, mentioning it more as another example.
Hey Robert, I think your example is more polling than long polling because it's hitting the server at a specified interval instead of the server holding the connection open while waiting for an update. This might explain why it's more stable. I think the biggest issue with long polling is that each long polling connection takes up a thread in JRun, and JRun just is not designed to handle that many threads.
Very true. I should have looked more closely at Ben's code.
The interval polling is probably better than the long polling anyway... at least in my experience. Requests that execute and end seem to be very stable and predictable. As you can see, I found the long polling to be completely random in its fidelity.
Of course, I don't really want to adopt a long polling approach - I just wanted to see what life would be like without better techniques available. I'll be sticking with the WebSocket "push" whenever possible.
Yes - Robert's example is just polling every 5 seconds. The problem with that is scaling and speed of updates. As you add more users, you add more requests. To make the updates timely, you have to poll faster, which makes the problem worse.
With sockets the server pushes the updates to the clients without them having to request anything. And they only update when needed.
The other feature the socket server APIs offer that is neat is the ability to trigger/bind to different events.
@Ben - As always enjoying your posts and experiments.
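To put the scaling point about interval polling in rough numbers, here is a small JavaScript sketch. The function name and the user counts are hypothetical; only the 5-second interval comes from the discussion above. It assumes every connected user issues exactly one request per interval.

```javascript
// Rough request rate for interval polling: each connected user issues one
// request per polling interval, whether or not any new data exists.
function pollingRequestsPerSecond( userCount, intervalSeconds ) {
	return ( userCount / intervalSeconds );
}

// Hypothetical load: 1,000 users polling every 5 seconds.
var relaxed = pollingRequestsPerSecond( 1000, 5 );

// The same users polling every second for more timely updates.
var timely = pollingRequestsPerSecond( 1000, 1 );
```

Under those assumptions, shortening the interval from 5 seconds to 1 second multiplies the request rate by five for the same number of users, which is the trade-off between timeliness and load described in the comment.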
Yeah, the "push" stuff is just all around cool! Now that I have played around with long-polling a bit, I'll be happy to slip back into some serious push action.
Do you think you could use something like this for the Twitter streaming API?
Basically, you have to hold the connection open indefinitely (on the server) and process each line as a JSON data packet.
I never was able to figure out how to get this to work in CF. I tried event gateways, cfhttp, scheduled tasks - everything. I wound up using PHP/curl to create flat files and then had CF do the parsing.
Using something like this with the Twitter API is definitely something I had in the back of my head when I started to look into this. I also have no clue on how to do something like this in ColdFusion (at least not yet!!!).
Ben, long polling has been possible for a while, but the problem is that as soon as you get some traffic it will kill your CF server ...
The good news is that CF already comes with a solution... Event Gateways and XML Sockets are great for maintaining connections and pushing data ...
I built a prototype 4 years ago using CF XML Sockets Event Gateways, Flash, and jQuery ... Flash because it was the only way to connect to a socket back in the day :: http://www.robgonda.com/blog/index.cfm/2006/3/20/Behold-ModelGlue-users-MGAJAX-Made-easy
Would be awesome to see html web sockets connect to a CF based XML Socket and push data through Event Gateways ...
Yeah, I bet the long polling would KILL your server in no time; to be totally blunt, I felt uneasy just testing this in the first place :) In the back of my mind, I just kept thinking I was going to accidentally take down my server (granted it was my local dev server, but still).
Gateways are very interesting things. I never really got into them because I think they used to be Enterprise only. Now that they are standard, it would be cool to start seeing what kind of socket connections and whatnot I could create. Good thinking!
It seems the inconsistency is coming from the CF server. When you were scrolling through the response headers, it does appear the page updates whenever the data is being pushed. It appears the data just isn't being pushed from the application queue.
I believe Facebook chat uses long polling, and considering the scale, it seems to work fine.
Yeah, that does appear to be what Firebug is telling me; however, what I was finding (not shown in the video) was that if I hit the poll.cfm page directly in the browser, the output written to the page seemed to be much *more* consistent. There seemed to be some difference between what the browser could access from the page directly vs. what could be accessed in the AJAX request.
I'll try to play around with this a bit more. I don't like that I wasn't able to get this working consistently.
I implemented long polling based on your jQuery code. It works fine in Firefox, but there is a problem in Chrome. Somehow, it reacts to incoming data only after some time (approximately 1 minute). Or maybe it has some buffer and only starts to react after the buffer is filled. Do you know what it could be? Thanks in advance.
It looks like some of the browsers implement some sort of buffering on the data request. I was definitely finding different behavior across browsers. I want to come back and figure this code out a bit more; but as far as this goes, it was definitely just an experiment - I am not sure I would recommend this approach over client-side polling.
Thank you for your answer. In case you are interested, I solved the problem. It was, as you said, a buffer issue. Now, when I get a new request, the first thing I do is send some fake data. It works perfectly.
Can you expand on that a bit? How are you overcoming the difference in behavior? Is it working consistently for you?
Actually, everything wasn't as good as I said before. I fixed the problem with Chrome as I described, and it works rather well (the only problem is a spinning throbber in Chrome), but it doesn't work at all in IE. I found a solution with a "forever iframe". It works in IE, Chrome and Firefox, but in this case we get a spinning progress indicator in every browser. I also found closure-library, which probably should solve this issue because Google uses it within its products, where it works fine. But I haven't tried it yet. Here is the link: http://closure-library.googlecode.com/svn/docs/class_goog_net_BrowserChannel.html
Have you had the same problems? Or maybe you have some ideas how to fix it?
Seems like cool stuff. I've been meaning to come back to this topic and play around some more. But, I've been dipping back into the "realtime push" approach to notifications. Mostly, I just want to figure out why this was buggy.
I don't know much about Google's closure language, but that BrowserChannel thing looks very cool!
According to the Peter Lubbers, et al., book Pro HTML5 Programming, start of Chapter 6, there are a couple of other charming terms for what you've called long polling (leaving the response part of the HTTP request alive and flushing periodically): "hanging GET" and "Comet".
I thought you'd get a kick out of "Comet" as a server-controlled analog of "AJAX".
It sorta reminds me of how there was a searchable archive of FTP sites called Archie (short for archive), and then, when someone came up with a searchable archive of gopher sites, they called it Veronica.
I still have Peter Lubbers' book on my desk. I haven't yet started it as it was the thickest of the HTML5 books I had purchased (yes, I started with the smallest ones first). I have also heard of Comet, but was not sure if that was a technique or an actual project/framework. Looks like it is a generic term. I also like "hanging GET" :)