This is the fifth part of a five-part article. Read part four here.
While our twenty tips for code optimization in Part I began with the developer’s source code, our discussion of cache control in Part II led us steadily toward the server side in our quest for maximum Web site performance. If you are willing to put on your admin hat for a while, we will now see what other server-side changes can be made in order to speed up site delivery, starting with HTTP compression.
What Exactly Is HTTP Compression?
HTTP compression is a long-established Web standard that is only now receiving the attention it deserves. The basic idea of HTTP compression is that a standard gzip or deflate encoding method is applied to the payload of an HTTP response, significantly compressing the resource before it is transported across the Web. Interestingly, the technology has been supported in all major browser implementations since early in the 4.X generation (for Internet Explorer and Netscape), yet few sites actually use it. A study by Port80 Software showed that less than five percent of Fortune 1000 Web sites employ HTTP compression on their servers. However, on leading Web sites like Google, Amazon, and Yahoo!, HTTP content encoding is nearly ubiquitous. Given that it provides significant bandwidth savings to some of the biggest sites on the Web, progressive administrators owe it to themselves to explore the idea of HTTP compression.
The key to HTTP content encoding can be found in the Accept request headers sent by a browser. Consider the request from Mozilla Firefox below, and note in particular the Accept, Accept-Language, Accept-Encoding, and Accept-Charset headers:
GET / HTTP/1.1
Host: www.port80software.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.6) Gecko/20040206 Firefox/0.8
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9, text/plain;q=0.8,video/x-mng,image/png,image/jpeg,image/gif;q=0.2,*/*;q=0.1
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
These “Accept” values can be used by the server to determine the appropriate content to send back using content negotiation, a powerful feature that allows Web servers to return different languages, character sets, and even technologies based on user characteristics. Content negotiation is a broad topic, so we will focus solely on the piece that relates to server-side compression: the Accept-Encoding header, which indicates the types of content encoding the browser can accept beyond the standard plain-text response, in this case gzip- and deflate-compressed content.
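To make that negotiation concrete, here is a minimal Python sketch (not the code of any particular server, and simplified to ignore quality values and edge cases) of the decision a server makes when it reads the Accept-Encoding header:

import gzip
import zlib

def encode_response(body, accept_encoding_header):
    # body: the uncompressed response payload as bytes
    # accept_encoding_header: the raw Accept-Encoding value, e.g. "gzip,deflate"
    accepted = [token.strip().split(";")[0] for token in accept_encoding_header.lower().split(",")]
    if "gzip" in accepted:
        return gzip.compress(body), "gzip"      # respond with Content-Encoding: gzip
    if "deflate" in accepted:
        return zlib.compress(body), "deflate"   # respond with Content-Encoding: deflate
    return body, None                           # fall back to an unencoded response

A production implementation would also honor q-values (gzip;q=0 means “do not use gzip”) and skip content types that gain nothing from compression, but the core decision really is just this one header check.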
Looking at Internet Explorer’s request headers, we see similar Accept-Encoding values:
GET / HTTP/1.1
Host: www.google.com
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)
Accept: image/gif,image/x-xbitmap,image/jpeg,image/pjpeg, application/vnd.ms-excel,application/vnd.ms-powerpoint,application/msword, application/x-shockwave-flash,*/*
Accept-Encoding: gzip,deflate
Accept-Language: en-us
Connection: keep-alive
Given that nearly every major browser in use today supports gzip and deflate encoding (and that the few that don’t should not send an Accept-Encoding header at all), we can easily configure Web servers to return compressed content to some browsers and standard content to others. In the following example, if our browser tells Google that it does not accept content encoding, we get back 3,358 bytes of data; if we do send the Accept-Encoding header, we get back just 1,213 bytes of compressed data along with a response header saying Content-Encoding: gzip. You won’t see any difference in the page if you “view source,” but a network trace shows that the responses differ:
Figure 1: Google Compressed / Uncompressed Comparison
While the files in this example are small, the reduction is still significant: the compressed response (1,213 bytes) is roughly 64 percent smaller than the uncompressed one (3,358 bytes). Through a combination of HTML, CSS, and JavaScript code optimization (as discussed in Part I) and HTTP content encoding, Google achieves an impressive feat: fitting its page into a single TCP response packet!
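You can reproduce this comparison yourself. The rough sketch below, using only Python’s standard library, requests the same page twice, once without and once with an Accept-Encoding header, and prints the size of each response body; the exact byte counts you see will vary by server and over time:

import http.client

def fetch(host, path, send_accept_encoding):
    conn = http.client.HTTPConnection(host)        # use HTTPSConnection for SSL sites
    headers = {"Accept-Encoding": "gzip,deflate"} if send_accept_encoding else {}
    conn.request("GET", path, headers=headers)
    response = conn.getresponse()
    body = response.read()
    conn.close()
    return len(body), response.getheader("Content-Encoding")

plain_bytes, _ = fetch("www.google.com", "/", send_accept_encoding=False)
gzip_bytes, encoding = fetch("www.google.com", "/", send_accept_encoding=True)
print(plain_bytes, gzip_bytes, encoding)           # e.g. 3358 vs. 1213, "gzip"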
While Google may have bandwidth concerns far beyond those of the average Web site, HTTP content encoding can decrease HTML, CSS, JavaScript, and plain text file size by 50 percent or more. Unfortunately, HTTP content encoding (the terms “compression” and “content encoding” are roughly synonymous) really only applies to text content, as compressing binary formats like image files generally provides no value. Even assuming that binary files make up the bulk of the payload of the average site, you should still see, on average, a 15-30 percent overall reduction in page size if HTTP content encoding is used.
Server Support for HTTP Content Encoding
If you are already convinced of the value of HTTP compression, the next big question is: how do you employ it? In the case of the Apache Web server, it is possible to add HTTP content encoding using either mod_gzip or mod_deflate. In the case of Microsoft IIS, things can get a little sticky. While IIS 5 includes native support for gzip encoding, it is a notoriously buggy implementation, especially considering the fine-grained configuration changes that must be made to overcome a wide variety of browser nuances. So in the case of IIS 5, third-party compression add-ons in the form of ISAPI filters like httpZip are most often the best way to go. IIS 6 built-in compression is much faster and more flexible, but it is still difficult to configure in more than a basic manner without getting into the IIS Metabase. ZipEnable was the first tool designed to allow truly fine-grained management of IIS 6 built-in compression, as well as browser compatibility checking.
The Real Deal with Server-Side Content Encoding
There is an important trade-off to consider when you implement HTTP compression: if you configure your server to compress content on the way out, you may reduce bandwidth usage, but at the same time you will increase CPU load. In most cases, this is not a problem, especially given how little work Web servers actually do. However, in the case of a very highly trafficked Web site running a large proportion of dynamic content on servers that are already at the limit of their available CPU cycles, the downsides of compression may actually outweigh the upsides. Adding more server hardware would, of course, alleviate the problem and allow you to enjoy the substantial bandwidth savings offered by compression, but we will leave it to you to do the number crunching to determine whether the reduction in bandwidth expenses and other infrastructure costs (fewer routers, switches, and dedicated lines) outweighs that upfront investment in new hardware.
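As a starting point for that number crunching, a back-of-the-envelope calculation along the following lines can show roughly what a given compression ratio is worth each month; every figure below is a hypothetical placeholder, not a measurement:

# Hypothetical inputs -- substitute your own traffic and pricing figures.
text_gb_per_month = 400.0        # HTML, CSS, JS and other compressible traffic
compression_ratio = 0.6          # fraction of those bytes eliminated by gzip
cost_per_gb = 0.15               # dollars per GB of transfer

monthly_savings = text_gb_per_month * compression_ratio * cost_per_gb
print("Estimated savings: $%.2f per month" % monthly_savings)   # $36.00 with these numbers

Weigh that estimate against the amortized cost of whatever extra CPU capacity the compression workload would require.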
Ultimately, though, the most interesting aspect of HTTP compression is what developers and administrators expect to see when rolling it out versus what they actually see. While you will definitely find that bandwidth utilization goes down, not all of your users will see dramatically faster page loads. Because of the increased CPU load created by the compression and decompression process, time to first byte (TTFB) generally increases, and thus the browser cannot start painting the page until slightly later. For a user with a slow (that is, low-bandwidth) connection, this is still a good trade-off: because the data is compressed into fewer, smaller packets, it is delivered much faster, so the slight initial delay is far outweighed by the faster overall page paint. In the case of broadband users, however, you will probably not see a perceptible performance improvement from HTTP compression. In both cases you will save money through bandwidth reduction, but if perceived response time is your primary goal and you serve a lot of dial-up traffic, you may want to focus first on caching (discussed in Part II) as a performance enhancement strategy.
Finally, another potential problem with HTTP content encoding relates to server load from script-generated pages, such as those in PHP or ASP. The challenge in this case is that the page content may have to be recompressed for every request (rather than being compressed once and then cached), which will add significant load to the server beyond that added by the compression of static content. If all your pages are generated at page load time, you should therefore be careful when adding HTTP content encoding. Fortunately, many commercial compression add-ons will know to cache generated content when possible, but be aware that some cheaper solutions lack this vital feature. However, this “problem” does point to a second, obvious server-side performance improvement – page pre-caching.
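What that “cache the compressed output” feature amounts to can be sketched in a few lines of Python; this is a simplified, hypothetical in-memory version that ignores expiration and memory limits:

import gzip

compressed_cache = {}   # maps (path, content_version) -> gzipped page bytes

def get_compressed(path, content_version, render_page):
    # render_page() stands in for whatever PHP/ASP logic builds the dynamic page
    key = (path, content_version)
    if key not in compressed_cache:
        compressed_cache[key] = gzip.compress(render_page().encode("utf-8"))
    return compressed_cache[key]

When the underlying content changes, the version key (a timestamp or ETag, say) changes with it, so the server pays the compression cost once per change rather than once per request.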
Dynamic Pages: Build Them Now or Build Them Later
Interestingly, many developers build some or all of their site’s pages dynamically at visit time. For example, http://www.domain.com/article.php?id=5 might be a common URL that suggests a page being built from some database query or template fill-in. The problem with this common practice is that, in many cases, building such a page at request time is pointless, since most of the time the page is primarily static. So-called static dynamic pages, scripted pages whose contents don’t change for long periods of time, gain nothing from being regenerated on every request; in fact, on a heavily trafficked site they can needlessly bog down your server.
One approach to unnecessary dynamic page generation is to pre-build the page as a static file with a .html extension every time its content changes, as sketched below. Preferably, any such generated pages would not only be static HTML files but also code-optimized versions (as discussed in Part I). Not only does this make it easier for the server to deliver the page quickly, since it does no work before returning it, but the technique also makes the site more search engine friendly.
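A minimal sketch of that pre-building step might look like the following; the function and template names are hypothetical stand-ins for your own database access and templating code, and the script would run whenever the content changes, not on each request:

def publish_article(article_id, fetch_article, render_template):
    # fetch_article() returns the stored article; render_template() produces the final HTML
    article = fetch_article(article_id)
    html = render_template("article.html", article)
    # Write a plain .html file the Web server can return with no per-request work
    with open("article-%d.html" % article_id, "w", encoding="utf-8") as output:
        output.write(html)

Run once per edit, this turns a URL like http://www.domain.com/article.php?id=5 into a static /article-5.html that the server can hand straight to visitors and search engines.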
Unfortunately, in many cases simply generating pages to static HTML is not possible, because the pages do have dynamic content that must be executed at page view time. In this case, your best bet is to “bake” the page into a faster-to-execute form. In the case of ASP.NET, this means the byte-code format, which is much faster for the server to execute; the catch is that you have to force the byte-code compilation by executing each page once before letting your users access it. The upcoming ASP.NET 2.0 release should help mitigate the tiresome task some developers currently undertake of “touching” all of their pages to ensure quick first downloads for users. In the case of PHP, you may find purchasing a product like the Zend optimization suite to be a wise investment.
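If you are stuck “touching” pages by hand in the meantime, a small script can do it for you. The sketch below, with a hypothetical URL list, simply requests each page once after a deployment so that the first real visitor does not pay the compilation cost:

import urllib.request

# Hypothetical list; in practice you would generate it from a sitemap or routing table.
pages_to_touch = [
    "http://www.domain.com/",
    "http://www.domain.com/article.aspx?id=5",
]

for url in pages_to_touch:
    try:
        urllib.request.urlopen(url, timeout=30).read()   # first hit triggers compilation
    except Exception as error:
        print("warm-up failed for %s: %s" % (url, error))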
Given the differing requirements of serving static and dynamic pages, or HTML and images, it seems wise to revisit the physical server and other hardware for additional acceleration ideas. One of the more intelligent impulses that can lead people to add or beef up hardware for the sake of Web acceleration is the idea of specialization – that different elements are required for doing different jobs with maximum efficiency. Even though, in the name of cost-efficiency, our main focus in this article has been on the source code and the Web server (and related) software, let us now take a quick look at these other elements.
Turbo Charging Your Web Server
A good place to turn in order to speed up your site is the server software and hardware itself. Starting with software, it is fairly unlikely that Web administrators are going to quickly dump IIS for Apache or Apache for IIS, even when security, ease of use, or performance is cited as a reason to leave one camp for another. Put simply, the Web server and its associated operating system are often so intertwined with the Web site(s) that migration becomes an onerous and even risky task. If you do start thinking about dumping one platform for another, you should also seriously consider alternate platforms like Zeus if pure speed is your primary concern.
When it comes to hardware, carefully consider the main tasks that the server(s) performs. In the case of a static Web site, it is primarily to marshal network connections and to copy files from disk to network. To accelerate such a site, you want to focus on building a Web server with very fast disk and network subsystems, as well as enough memory to handle simultaneous requests. You may, in fact, choose to get very aggressive in adding memory to your system and try to cache all the heavily used objects in memory in order to avoid disk access altogether. Interestingly, processor speed is not nearly as important as you might think in a site serving static files; it does help, but the disk will often prove to be a bigger bottleneck. When the site is serving dynamic as well as static pages, obviously processor speed becomes more important, but even in this case, having a fast drive or dual NICs can still be more important. On the other hand, the combination of a lot of dynamic page generation with other processor-intensive responsibilities like SSL and/or HTTP compression makes it more likely that the CPU is the resource in need of enhancing. In other words, what the server does will dictate how it can most effectively be beefed up.
Even if you don’t have it in your budget to customize your server hardware or software, there may be some changes you can make fairly cheaply. For example, you might consider tuning your operating system’s TCP/IP settings for the most efficient use of the underlying TCP/IP resources on which HTTP depends. This might mean adjusting the TCP receive window to a size best suited to your application and your server’s network connection; making sure that TCP features that can be toggled, such as delayed ACKs or TCP_NODELAY, are used or not used, depending again on the characteristics of the application and the network environment; or simply making sure your box is not suffering from port exhaustion due to excessive TIME_WAIT times or other causes. However, before twisting the knobs on your network characteristics, both in the Web server and in the operating system, make sure to set up realistic load tests to verify that your changes don’t actually slow down users or cause excessive retransmits on your network. In other words, don’t try these types of fixes unless you understand how to measure their value.
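For a sense of what “toggling” those features looks like at the application level, here is a small Python sketch of the relevant socket options; the right values, and whether to change them at all, depend entirely on your application and network, which is why the load testing comes first:

import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Disable Nagle's algorithm so small writes are sent immediately (TCP_NODELAY on)
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

# Ask for a larger receive buffer, which influences the advertised TCP receive window
sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 256 * 1024)

# Let a listening socket re-bind to a port whose old connection is still in TIME_WAIT
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)

On a production server most of these knobs are set system-wide, in the Windows registry or via sysctl on Unix-like systems, rather than per socket, but the trade-offs, and the warning about measuring before and after, are the same.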
Acceleration by Division of Labor
Another consideration when addressing site acceleration is that not every form of site content has the same delivery characteristics. Given that different content has different characteristics, we may find that dividing the specific serving duties of a site between several servers is better than simply duplicating site duties evenly across a server farm.
A simple example of acceleration via division of labor arises when your site uses SSL encryption for a shopping cart or extranet. With SSL, you might find that your server bogs down very rapidly when multiple users are in the HTTPS sections of your site, given the significant overhead introduced by the encryption. In this situation, it makes sense to offload this traffic to another server. For example, you might keep your main site on www.domain.com and, for the checkout portion of the site, link to shop.domain.com. The shop.domain.com host could be a special box equipped to handle the SSL traffic, potentially using an SSL acceleration card. Segmentation allows you to focus the necessary resources on those users in checkout without bogging down those who are just browsing. You might even find that it makes sense to serve images or other heavy binaries, like PDF files or .EXE downloads, from a separate server, since those connections take much longer to shut down than ordinary connections and, in the meantime, hold valuable TCP/IP resources hostage. Furthermore, you would not bother with HTTP content encoding on the media server, but would, of course, apply it to the primary text server for HTML, CSS, and JavaScript. In this way, you maximize the use of that server’s otherwise underutilized CPU for the sake of bandwidth optimization, while leaving the CPU of the media server free to do its job.
Segmentation can be further applied to generated pages. You might consider having generated pages served on a box built for that purpose, offloading your static content to a different machine. Already many sites employ just such a scenario by using a reverse proxy cache such as Squid. In this setup, the proxy serves static content quickly, while the back-end server builds the content that truly must be generated at page view time. The cache control policies and rules discussed in Part II of this article become quite important in this type of setup; you will want to make sure that the proxy server’s cache stores everything that it is safe to store in a shared cache, and that it doesn’t retain anything that is intended only for a particular user.
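In practice, honoring that rule comes down to what each response says in its Cache-Control header. A hypothetical sketch of the two cases, in whatever language or framework actually sets your headers, might look like this:

def cache_headers(response_is_user_specific):
    if response_is_user_specific:
        # A shared cache such as Squid must never store or reuse this response
        return {"Cache-Control": "private, no-store"}
    # Safe for the reverse proxy to keep and serve to every visitor for an hour
    return {"Cache-Control": "public, max-age=3600"}

The reverse proxy can then keep and reuse the “public” responses while passing the user-specific ones straight through to the back-end server every time.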
Speed for Sale
We have focused primarily on low cost acceleration techniques in this article, but as we get towards the end, larger expenses appear in the form of new software and new hardware. It is possible to spend quite a bit of money on Web acceleration by purchasing exotic acceleration appliances that perform connection offloading, compression, caching, and a variety of other functions for your site. If you have an extremely high bandwidth bill, these solutions can be very effective, but most sites will see excellent results with the cheaper approaches we have discussed, such as code optimization, caching, and HTTP encoding.
Yet even if you did have a lot of money and were able to build out a specialized server farm and add top-of-the-line acceleration devices, you would eventually reach a limit to the benefits gained from compressing and caching content. To improve your speed now you have only one last step: move the content closer to the user (and if possible, apply all the same techniques again). Already you have probably noticed that sites that offer downloads may mirror content for users to fetch from multiple locations. However, it is possible to provide content in a more geographically-sensitive manner, and to do so transparently. Content distribution networks (CDNs) like Akamai allow us to move heavy content such as images and other binary downloads closer to users by taking advantage of thousands of edge caches located around the world. While this provides significant performance improvements and is widely used by the world’s largest sites, it certainly does not fit in the category of cost-effective, and thus we mention it only to follow our acceleration options to their logical conclusion.
*Originally published at Port80Software.com
Thomas Powell is founder of PINT, Inc. and a lecturer in the Computer Science
department at University of California San Diego. His articles have appeared in
several magazines and sites, including Network World, Internet Week
and ZDNet. He has also published numerous books on Web technology and design,
including the best-selling Web Design: The Complete Reference. Visit pint.com.
Joe Lima is the Director of Product Development for Port80 Software. He has
worked for a variety of Internet, wireless and software development companies,
specializing in research and development for server-centric technologies. Visit
port80software.com.