WebFaction

I have a normal WebFaction Static/CGI/PHP Apache app (not a Static-Only nginx app).

I was surprised to see that WebFaction's servers didn't seem to be generating ANY Expires or Cache-Control headers for any of my files (html or images or otherwise) by default.

Is this normal or is there something likely wrong with my config/account?

I'll assume this is normal for the rest of this post...

I was happy to see that I can control and add Expires and Cache-Control from .htaccess using this code:

<IfModule mod_expires.c>
    ExpiresActive on
    ExpiresDefault "access plus 3 days"
</IfModule>

with more fancy directives to make the choice file- and file-type specific.
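As a minimal sketch of what those type-specific directives could look like (the MIME types and lifetimes below are illustrative assumptions, not values recommended by WebFaction):

```apache
<IfModule mod_expires.c>
    ExpiresActive On
    # Fallback for any type not listed below
    ExpiresDefault "access plus 3 days"
    # HTML tends to change most often, so let it expire quickly (illustrative value)
    ExpiresByType text/html "access plus 1 hour"
    # Images and stylesheets usually change rarely (illustrative values)
    ExpiresByType image/png "access plus 1 month"
    ExpiresByType image/jpeg "access plus 1 month"
    ExpiresByType text/css "access plus 1 week"
</IfModule>
```

An ExpiresByType rule overrides ExpiresDefault for responses of the matching type; everything else falls back to the default.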

HOWEVER I think it's fair to say that the vast majority of site owners are not even aware of the issue, until one day they change their site and are frustrated and shocked to find that many customers' browsers will not show them the new site until an unpredictable, customer-specific amount of time has passed.

So the issues I'd like to discuss here are

  1. what should the default behavior be given no explicit .htaccess config?
  2. whatever the default behavior is, even if it is no headers, can we document it in the WebFaction documentation for Static/CGI applications?

It seems like a bad idea not to generate some headers by default, since that means that once browsers get the file, they will cache it for an unknown amount of time (assuming users do not explicitly hit their refresh button, which of course we cannot assume they will do or tell them to do), which is not likely what site owners here want.

WebFaction does generate an ETag header, but ETag is not relevant for this discussion: ETag is an optimization the browser applies AFTER the document has passed its expiration time. We are discussing here how the browser (and intermediate caches) figure out the expiration time in the first place.

For example see this Mozilla document which explains how Firefox, when presented with a document that lacks Expires and Cache-Control, will use a bizarre heuristic and cache each document for 10% of the amount of time between the request Date and the Last-Modified time (!!) which is far from obvious or expected, and of course other browsers will do it differently.

More generally, the main RFC 2616 which defines how we are supposed to do HTTP caching states:

13.2.2 Heuristic Expiration

Since origin servers do not always provide explicit expiration times, HTTP caches typically assign heuristic expiration times, employing algorithms that use other header values (such as the Last-Modified time) to estimate a plausible expiration time. The HTTP/1.1 specification does not provide specific algorithms, but does impose worst-case constraints on their results. Since heuristic expiration times might compromise semantic transparency, they ought to used cautiously, and we encourage origin servers to provide explicit expiration times as much as possible.

Other ISPs I have used chose some reasonable expiration value as default and always send Expires and Cache-Control headers, simply so that there would be predictable default behavior they could explain to their customers. The last ISP I migrated from chose a few days but pretty much any value should be ok as long as it is documented.

So, what do you think? Can we set some default and also document the choice (even if the choice is to change nothing, so as at least to make people aware of the issue) in the documentation?

asked 22 May '16, 18:50

cpirazzi

edited 22 May '16, 18:53


We don't do any caching by default.

As you noted, you can control caching yourself via mod_expires directives. You can do this in .htaccess for Static/CGI/PHP apps and for other CGI and PHP apps installed via our control panel. You can also use these directives in httpd.conf for Django and mod_wsgi apps, which come with their own private back-end Apache instance.

For static-only apps, if you use "expires max" in the "extra info" field for the app, you'll have long-term caching, i.e., per the Nginx docs:

The max parameter sets “Expires” to the value “Thu, 31 Dec 2037 23:55:55 GMT”, and “Cache-Control” to 10 years.

Without "expires max", there is no caching.

I'll pass your request for documentation along to our docs team for consideration.


answered 31 May '16, 20:49

seanf

Hi,

Thanks for passing on the doc request.

But I think you might not be seeing my main point...

The fact that WebFaction doesn't generate Expires or Cache-Control headers by default does NOT mean that no caching is happening. Caching happens at many layers including the user's browsers, and no headers just means "do what you want."

The fact that WebFaction doesn't generate these headers means that the amount of caching (specifically, the time till expiration when the browser will even check to see if a page has changed at the server) is UNPREDICTABLE.

So what I am saying is that WebFaction's default behavior is to be unpredictable, and that is certainly not a good idea.

That unpredictability could prevent a WebFaction customer's site update from being seen by their real-world customers for hours, days, or months, depending on the whim of the browser code and intermediate caches. For example, if the customer has a well-established WebFaction site that hasn't changed for 1 year and then they make a change, browsers using the 10% heuristic described in my question won't even check whether the site's pages need to be re-fetched for about 1.2 months (10% of 365 days is roughly 36.5 days)!

And no, you can't tell your thousands of customers to click their refresh button (not only because that's silly, but also because of this caching issue, you cannot deliver ANY new content to their eyes AT ALL until the unknown browser-specific expiration period expires: it's hard to overstate how nasty this caching issue is once you're caught in it).

This is not good and not necessary: with a judicious choice of default header in place from the beginning, WebFaction can completely avoid this problem by telling the browsers when to check back explicitly. Other ISPs I have seen provide a reasonable default header to prevent customer surprises just like this.

It is super wonderful that WebFaction gives us full control over headers; we all respect WebFaction for giving us control (which many ISPs do not) and no need to even discuss that in this thread.

For this thread I am proposing that WebFaction choose a default Expires and/or Cache-Control header which is least likely to cause problems for customers, most of whom are totally unaware of any of these caching issues until one day they get stung and it is too late.

For example, choose an expiration time in the range of 3 days to 1 week as a reasonable default.
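
To make the proposal concrete, here is a minimal sketch of a 3-day default, assuming mod_headers is available. The 259200-second value is only illustrative (3 days at 86400 seconds each), and nothing here reflects a configuration WebFaction has adopted or endorsed.

```apache
<IfModule mod_headers.c>
    # 3 days = 3 * 86400 s = 259200 s (illustrative default, not a WebFaction setting)
    Header set Cache-Control "max-age=259200"
</IfModule>
```

An ExpiresDefault directive from mod_expires has a similar effect, since that module sends a Cache-Control max-age value alongside the Expires header, so either module could supply such a default.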

I really like WebFaction's programmer- and technician-oriented focus, but sometimes having such a focus can make us lose sight of the common case and end up making policies that hurt most people---since most people are not aware of the very confusing and intricate details of how caching works on the web. We can say "RTFM" or "ignorance of the law is no excuse" but that really seems absurd and silly, especially because.....

It's very hard to argue for the current WebFaction default behavior of not generating Expires or Cache-Control headers at all: again, the current behavior does not mean NO caching (that is a common misconception)---it means UNPREDICTABLE caching. That cannot possibly be better than predictable caching under any circumstances that I can think of.

Even arguments of the form "WebFaction wants more off-site caching to save WebFaction server resources" fail because the current behavior prevents even WebFaction from knowing how much caching they are getting: it is up to the browsers' whim when to check back. Choosing a default expiration time would be good for WebFaction's customers (since we avoid surprise update issues) AND for WebFaction itself (since WebFaction could then reliably predict server load for the vast majority of its customers who do not set custom caching headers).

(28 Jun '16, 08:45) cpirazzi

Ok, I've passed your argument along to our sysadmin team.

(28 Jun '16, 20:10) seanf

question asked: 22 May '16, 18:50

question was seen: 2,457 times

last updated: 28 Jun '16, 20:10

© COPYRIGHT 2003-2019 SWARMA LIMITED - WEBFACTION IS A SERVICE OF SWARMA LIMITED
REGISTERED IN ENGLAND AND WALES 5729350 - VAT REGISTRATION NUMBER 877397162
5TH FLOOR, THE OLD VINYL FACTORY, HAYES, UB3 1HA, UNITED KINGDOM