Boost PHP Speed With “If-Modified-Since” [1/4]
October 1, 2006
Improve PHP Performance
Compressing script output is great, but what if you could frequently circumvent the need to run scripts altogether?
So, this week we’ll talk about another misunderstood and underused web technology – the If-Modified-Since (IMS) header.
Part I: Understanding IMS
Implementing IMS for WordPress-generated pages on VibeTalk proved to be more than a small task, so I’m going to break this tutorial into several parts:
- Part I: Understanding IMS
- Part II: Watching IMS In Action
- Part III: Using IMS For Optimized RSS Feeds
- Part IV: Implementing IMS On WordPress
As of this moment, I have a pretty good outline of these parts, but since I haven’t completed the other steps, this information may evolve as I learn more. Our frustrations about this topic will evolve together!
First, let me take a stab at a simple explanation of IMS…
How Does It Work?
IMS is a request header defined in the HTTP specification. It is part of the overall concept of caching and is especially important for scripts, such as PHP and ASP pages, that generate web pages “on-the-fly”.
Static HTML files have a built-in “last modified” timestamp, but because dynamic content is generated at the time of request, it is effectively “last modified” on every request. How can you determine if a page will be different from a previously requested version?
Requests that include an IMS date allow the web server to compare that date with the time the underlying content last changed, and so determine if the content has changed since last requested.
Request / Response Overview
By requesting a web page with an IMS date, a browser is saying to the server, “I would like a web page that I previously received from you on this date. Has it changed since then?”
If there has been no change, rather than perform expensive database queries, the server replies with a 304 (“Not Modified”) status code and no page body, and the browser reuses its cached copy.
If the content has changed since last requested, then the server generates the content (executes the script or PHP file) and returns the data with a standard 200 (“OK”) status code.
(illustration: Server Interaction on 304 “Not Modified”)
By not re-running the script code, servers should be able to respond in a fraction of the time it would otherwise take.
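To make the 304-versus-200 decision concrete, here is a minimal sketch in PHP. The function name `isNotModified` and the idea that your script can cheaply look up a “last changed” Unix timestamp are my assumptions for illustration, not something the HTTP specification or WordPress provides.

```php
<?php
// Hypothetical helper: decide whether a 304 "Not Modified" can be sent.
// $ifModifiedSince is the raw If-Modified-Since header value (null if absent).
// $lastModified is a Unix timestamp for when the content last changed
// (assumed to be cheap to look up, e.g. the time of the newest post).
function isNotModified(?string $ifModifiedSince, int $lastModified): bool
{
    if ($ifModifiedSince === null || $ifModifiedSince === '') {
        return false; // no IMS date sent, so the page must be generated
    }
    $since = strtotime($ifModifiedSince);
    if ($since === false) {
        return false; // unparseable date: fall back to a full response
    }
    // Unchanged if the content is no newer than the browser's copy.
    return $lastModified <= $since;
}
```

In a real script you would likely pass `$_SERVER['HTTP_IF_MODIFIED_SINCE'] ?? null` as the first argument and, on a `true` result, call `http_response_code(304)` and exit before any expensive queries run.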
IMS versus The Last-Modified Header
You may ask, as I did, “What’s the difference between IMS and the similar header value called Last-Modified?” As it turns out, they are essentially two sides of the same coin.
A browser sends an IMS date to the server; a server returns a Last-Modified (LM) date to the browser. The two work together along with another bit of information called an Entity Tag (ETag) to accurately identify a unique bit of content.
An ETag is a unique content identifier generated by the server. It can be “weak” or “strong” and is what’s called an “opaque tag”. This just means that the value is entirely up to the server (and you), but it must be unique to each version of the content, changing whenever the content changes, or your caching method may fail.
When a browser gets an ETag, it is required to save it and resend it with the next request as an If-None-Match header (more on If-None-Match later).
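As a rough sketch of the ETag round trip, the PHP below builds a tag and checks it against an incoming If-None-Match value. Both function names are hypothetical, and hashing a “version key” with `md5` is just one convenient assumption; any value that changes when the content changes would do. Hashing something cheap (say, a post ID plus its update time) lets you skip page generation, whereas hashing the rendered page only saves bandwidth.

```php
<?php
// Hypothetical helper: build a strong ETag from some version key.
// The key is an assumption: anything that changes when the content changes.
function makeEtag(string $versionKey): string
{
    return '"' . md5($versionKey) . '"'; // ETags are quoted strings
}

// Hypothetical helper: does the If-None-Match header match our ETag?
// The header may be "*" or a comma-separated list of tags, and a tag may
// carry the weak prefix W/. Splitting on commas is a simplification that
// is safe here because md5-based tags never contain commas.
function etagMatches(?string $ifNoneMatch, string $etag): bool
{
    if ($ifNoneMatch === null || trim($ifNoneMatch) === '') {
        return false; // browser sent no tag, so nothing to compare
    }
    if (trim($ifNoneMatch) === '*') {
        return true; // "*" matches any current version
    }
    $strip = function (string $tag): string {
        $tag = trim($tag);
        return strncmp($tag, 'W/', 2) === 0 ? substr($tag, 2) : $tag;
    };
    foreach (explode(',', $ifNoneMatch) as $candidate) {
        if ($strip($candidate) === $strip($etag)) {
            return true; // weak comparison: the W/ prefix is ignored
        }
    }
    return false;
}
```

In a script, the request value would typically come from `$_SERVER['HTTP_IF_NONE_MATCH'] ?? null`, and the tag would be sent back with `header('ETag: ' . $etag)`.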
Caching Affects More Than Your Browser
The big lesson so far is that caching is a large concept affecting more than one browser’s performance. Mechanisms for caching were built to allow proxy servers, content aggregators and indexing robots (a.k.a. “bots”) to share the burden of distributing information and intelligently collect data. This wide-ranging goal is part of why a simple IMS “switch” is more complicated than it seems.
If-Modified-Since is one of several caching headers designed to allow servers to circumvent the need to re-run scripts if content hasn’t changed since last requested.
Automated indexing tools and content aggregators can severely burden ALL servers by unnecessarily hogging Internet bandwidth. Because of its general-purpose design, not only can IMS improve your server performance, it can potentially reduce overall Internet usage for RSS/ATOM feeds and indexing robots by orders of magnitude.
The browser/server responsibilities for IMS are as follows:
- If this page is in the local cache, the browser sends an IMS date when requesting the same page again
- If the server previously provided an ETag (a unique identifier for this document/web page), the browser sends it back in an If-None-Match header
- If requested content has not been modified since the IMS date, the server skips page creation and sends a 304 Not Modified response
- If requested content has been updated, the server sends the page with a new Last-Modified date
- The server generates and sends an ETag
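The server-side half of the list above can be sketched as one small PHP function. This is a simplified illustration, not a definitive implementation: the function name `conditionalResponse`, its parameters, and the array it returns are all hypothetical, and it compares ETags by exact string match rather than handling weak tags.

```php
<?php
// Hypothetical helper: pick the status code and the validator headers.
// $ifModifiedSince and $ifNoneMatch are raw request header values
// (null if the browser did not send them). $lastModified is a Unix
// timestamp and $etag a quoted tag, both assumed to be cheap to obtain.
function conditionalResponse(?string $ifModifiedSince, ?string $ifNoneMatch,
                             int $lastModified, string $etag): array
{
    // The server always sends both validators back to the browser.
    $headers = [
        'Last-Modified' => gmdate('D, d M Y H:i:s', $lastModified) . ' GMT',
        'ETag'          => $etag,
    ];

    if ($ifNoneMatch !== null) {
        // When If-None-Match is present it takes precedence over the date.
        $tags  = array_map('trim', explode(',', $ifNoneMatch));
        $fresh = in_array($etag, $tags, true) || in_array('*', $tags, true);
    } elseif ($ifModifiedSince !== null) {
        $since = strtotime($ifModifiedSince);
        $fresh = $since !== false && $lastModified <= $since;
    } else {
        $fresh = false; // first visit: nothing to validate against
    }

    return ['status' => $fresh ? 304 : 200, 'headers' => $headers];
}
```

A calling script would send each entry with `header()`, set the status with `http_response_code()`, and only build the page when the status is 200.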
I, for one, want to do my part to make the Internet more efficient for all.
Stay tuned for more on how we can help make this happen…