EdgeCast CDN Won't Cache Your First Request And Can't Cache Subsequent 304 Not Modified Responses
EdgeCast is a pretty amazing CDN (Content Delivery Network). But, it took me a little while to understand the quirks in its caching strategy. When I first started using EdgeCast, it seemed like I could never get anything to cache (as indicated by the "X-Cache" header in the response). After some back and forth with their support, I came to realize that EdgeCast will never cache the first request for an object; and, it won't be able to cache subsequent 304 Not Modified responses from the Origin.
I understand not caching the first request for an object - why cache something until you know that it will be requested more than once? But, the 304 Not Modified behavior makes things a little tricky. Clearly, it can't cache a 304 Not Modified response as there is no "body" to cache - only response headers. But, if your Origin server will respond with a 304 Not Modified response based on ETag / "If-None-Match" / "If-Modified-Since" headers, you end up in a situation in which the first user never gets the cached object until a second user makes a request for the same object.
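To make that sequence concrete, here is a minimal Python sketch of the workflow as I understand it. This is a toy model, not EdgeCast's actual implementation: the ToyEdge class, the origin() function, the "/logo.png" path, the ETag value, and the simplified "HIT" / "MISS" labels are all hypothetical, and the two caching rules are assumptions taken from the behavior described above.

```python
from dataclasses import dataclass, field

ETAG = '"v1"'  # hypothetical ETag of the single object held by the origin


def origin(headers):
    """Toy origin: 304 with no body when the ETag matches, else 200 with a body."""
    if headers.get("If-None-Match") == ETAG:
        return 304, None
    return 200, b"object-bytes"


@dataclass
class ToyEdge:
    seen: set = field(default_factory=set)     # paths requested at least once
    cache: dict = field(default_factory=dict)  # path -> cached body

    def get(self, path, headers=None):
        headers = headers or {}
        if path in self.cache:
            # A cached object can be revalidated at the edge itself.
            if headers.get("If-None-Match") == ETAG:
                return 304, "HIT"
            return 200, "HIT"
        status, body = origin(headers)
        first_request = path not in self.seen
        self.seen.add(path)
        # Assumed rules: skip caching on the first request for a path, and
        # never cache a 304 because it carries no body.
        if status == 200 and not first_request:
            self.cache[path] = body
        return status, "MISS"


edge = ToyEdge()
results = [
    edge.get("/logo.png"),                           # user A, first visit
    edge.get("/logo.png", {"If-None-Match": ETAG}),  # user A, revalidating
    edge.get("/logo.png"),                           # user B, first visit
    edge.get("/logo.png"),                           # any later request
]
print(results)
```

In this model, user A's two requests both go back to the origin (a 200, then a 304), and the object only lands in the cache once user B arrives without any cache headers.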
Typically, this isn't a problem since the first user will likely be pulling the object out of its browser cache after the first request. And, even if the request does go back to the CDN with cache headers, the 304 Not Modified pull from the origin is very light-weight.
That said, I found the workflow a bit confusing when I first encountered it; so, I wanted to see if I could mess with the EdgeCast rules engine to get a different caching workflow. The EdgeCast CDN has a really robust "rules engine" in which you are given granular control over many aspects of the incoming client request (as EdgeCast forwards it to the Origin) and the outgoing response (from EdgeCast back to the user). One of the things you can do with the rules engine is strip out request and response headers.
According to the Amazon S3 documentation on "Conditional GETs", S3 evaluates the following conditional request headers, and can return a 304 Not Modified when If-None-Match or If-Modified-Since indicates that the client's copy is still current:
- If-Match (etag)
- If-None-Match (etag)
- If-Modified-Since (modification date)
As such, I wanted to see what would happen if I had the EdgeCast rules engine strip out those headers as it forwarded the request onto Amazon S3 (the Origin server).
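The effect of such a rule can be sketched in Python. To be clear, this is not EdgeCast rules engine syntax (the rules are configured through EdgeCast's own interface); the strip_conditional_headers() function and the sample header values are hypothetical and only illustrate what the Origin ends up seeing.

```python
# The three conditional headers listed above, lower-cased so the comparison
# is case-insensitive (HTTP header names are not case-sensitive).
CONDITIONAL_HEADERS = {"if-match", "if-none-match", "if-modified-since"}


def strip_conditional_headers(headers):
    """Return a copy of the request headers without the conditional ones."""
    return {
        name: value
        for name, value in headers.items()
        if name.lower() not in CONDITIONAL_HEADERS
    }


incoming = {
    "Host": "cdn.example.com",
    "If-None-Match": '"abc123"',
    "If-Modified-Since": "Tue, 01 Jan 2013 00:00:00 GMT",
    "Accept": "image/png",
}
forwarded = strip_conditional_headers(incoming)
print(forwarded)
```

With nothing left to compare against, the Origin has no basis for a 304 Not Modified and has to send the full object.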
The results were a mixed success. Stripping out the headers forces Amazon S3 to return a 200 OK response along with the target object. This does allow EdgeCast to cache the object more readily for the first user (such as if the user performed a hard-refresh of their browser). However, the downside to the given rules is that they appear to strip out the headers before EdgeCast evaluates the request.
This was unexpected! The EdgeCast documentation on "header deletion" is limited to the following:
Deletes the specified request header. The specified request header will not be forwarded to the origin server.
When I read this, it sounded as if the Origin server is the only server that will be affected by the HTTP header modification. However, when I started deleting these cache headers, it also prevented EdgeCast from returning a 304 Not Modified response for objects cached within the CDN; even when I executed the request with "If-None-Match" headers, EdgeCast consistently returned a 200 OK, not a 304 Not Modified.
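If you want to reproduce this kind of check yourself, a conditional request can be sent with Python's standard library. This is only a sketch of one way to do it, not the tool I used; the function names are hypothetical, and the URL and ETag you pass in would be your own.

```python
import urllib.error
import urllib.request


def build_conditional_request(url, etag):
    """Build a GET request that asks the server to revalidate against etag."""
    return urllib.request.Request(url, headers={"If-None-Match": etag})


def check(url, etag):
    """Send the conditional GET and return (status code, X-Cache header)."""
    request = build_conditional_request(url, etag)
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status, response.headers.get("X-Cache")
    except urllib.error.HTTPError as error:
        # urllib raises HTTPError for non-2xx statuses, which includes 304.
        return error.code, error.headers.get("X-Cache")
```

Calling check() with the URL of an object on your CDN and the ETag from an earlier response should give you a 304 if the edge honors the header, and a 200 if the header was stripped before the edge evaluated it.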
All in all, it seems as if deleting these cache-related headers is a Pyrrhic victory; while I can potentially get the object into the CDN cache sooner for the first user, I lose the ability to serve up a 304 Not Modified response for all users. This hardly seems worthwhile.
This exploration may not have been fruitful; but, I would like to say that the rules engine is a really great feature and it is the primary reason why I use EdgeCast with S3 (Amazon's Simple Storage Service) over something like CloudFront. CloudFront may have tighter integration with S3; but, it lacks any flexibility whatsoever. Good luck trying to pull a CloudFront image into an HTML5 Canvas tag without the canvas data getting tainted! With EdgeCast, you can easily add Access-Control headers to arbitrary images which solves that problem right nicely.
Anyway, there wasn't much documentation on the EdgeCast caching workflow; so, hopefully this helps others who may be working with the EdgeCast Content Delivery Network.
Huge thanks for this article. We were also trying to figure out how EdgeCast works and just did not understand where all these 304s were coming from. It just didn't make any sense. We buy EdgeCast through a reseller, so we don't deal directly with EdgeCast support. Our reseller's support also did not know about this, or at least was not able to explain it to us. Once our project team read this article, it all suddenly made sense.
I'm really happy that this actually was helpful to someone. After loads of Googling, I couldn't find anything on this, so I thought maybe I was the only one having this problem :D
Re: EdgeCast reseller, I was in the same boat. I actually tried to open a support ticket with EdgeCast and they rebuffed me with a very unclear statement. Then, I tried to ask for clarification, but the ticket had already been closed, so all I got was an auto-reply :(
Anyway, glad I could provide some decent information.
We use EdgeCast as our CDN. We have set up max-age for static content like CSS, JS, images, etc.
When we load the page in Chrome, the dev tools show 200 OK for all requests. But every successive reload of the page also shows 200 OK. While inspecting the response headers, I see X-Cache: HIT.
I'm a bit confused: how can an HTTP status code of 200 OK and X-Cache: HIT occur at the same time? By my understanding, 304 means Not Modified.
Can you please help shed some light on the above?