Better compression of web pages
Appetizers
For nearly 20 years, web servers have relied on Gzip compression to compress HTML, CSS, and miscellaneous text files, which the browser then receives and unpacks again, speeding up data transfer at the network bottleneck.
Gzip compression of dynamic content takes place at the server on the fly. The web server reads the file from the filesystem, pipes it through Gzip, and then delivers the result to the browser. Before the Gzip step, the server often calls a PHP module.
For static (e.g., CSS) files, the server sometimes performs compression beforehand and then stores .gz files in the filesystem, saving CPU power on the server. Another option is Google's Zopfli [1] compression software. Although it is far slower than Gzip and uses more CPU capacity, it produces smaller files that remain compatible with Gzip.
Over the past 20 years, better compression methods than Gzip have emerged from time to time. However, they have never made it into the web browser, simply because they all took too much time for on-the-fly compression. After all, if it takes longer to compress than it does to transfer the original files, it hasn't helped anyone when all's said and done.
Kneaded
In 2015, Google provided a solution to this dilemma with the MIT-licensed Brotli [2] software, which compresses files at the same speed as Gzip, but with a higher compression rate – as demonstrated (Listing 1) with the use of both tools on this example HTML document:
<html>
  <head>
    <title>Brotli Test</title>
  </head>
  <body>
    <p>Hello World!</p>
  </body>
</html>
Listing 1
Gzip and Brotli Comparison
-rw-r--r-- 1 sw sw 124 8 Sep 16:52 hello-world.html
-rw-r--r-- 1 sw sw  77 8 Sep 16:53 hello-world.html.br
-rw-r--r-- 1 sw sw 113 8 Sep 16:53 hello-world.html.gz
The original file takes up 124 bytes of space on the hard drive. After compression with Gzip, this number drops to 113 bytes, which means a space savings of 9 percent. Compressed with Brotli, the file is only 77 bytes, or 38 percent smaller.
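The percentages quoted here follow directly from the file sizes in Listing 1. As a minimal sketch, the calculation can be reproduced in the shell; the helper name savings_percent is made up for this illustration and is not part of Gzip or Brotli:

```shell
# Percentage saved when a file shrinks from $1 bytes to $2 bytes,
# rounded to a whole number.
savings_percent() {
    awk -v orig="$1" -v comp="$2" \
        'BEGIN { printf "%.0f\n", (orig - comp) / orig * 100 }'
}

savings_percent 124 113   # Gzip result from Listing 1, prints 9
savings_percent 124 77    # Brotli result from Listing 1, prints 38
```

For a file this small, the fixed overhead of the Gzip container accounts for a noticeable share of the output, so the gap is wider here than the average figures for typical web pages.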
The main reason for Brotli's good performance with HTML files is a 120KB dictionary built into the program, which contains the character strings most frequently used on websites (e.g., HTML tags). Instead of storing such a string, Brotli refers directly to its dictionary entry and thus saves a good deal of space. On average, Brotli generates files that are 20 percent smaller than Gzip-generated files, which alone should please every server operator who has to pay for network traffic.
The key to success lies in browser support. Google has a home advantage thanks to its own Chrome browser, and after the developers set up Chrome with Brotli support, other browser manufacturers followed suit. Today, all major browsers support Brotli [3].
Brotli on the Server Side
Things look less rosy on servers. Of all the web servers provided in the stable branches of the major Linux distributions, not a single one supports Brotli. If you want to offer Brotli compression on Linux, you currently need to patch and recompile, which leaves you with a system that is difficult to update. By the next major distribution release at the latest, official Brotli support will probably be available; at that point, you will probably want to switch to the official package.
Here, I show how you can combine the available stable Nginx web server [4] with Brotli on Debian Stable 9.1 (Stretch) [5]. I assume you have a newly installed server on which you have root privileges. Debian 9.1 already provides Brotli as standalone compression software. To begin, install the package along with Git:
apt-get install brotli git
You now need the development libraries for the Nginx module. On the downside, they are currently only in the experimental branch; on the upside, they do not have too many dependencies. Add the following line to the /etc/apt/sources.list file:
deb http://deb.debian.org/debian experimental main
You can then install the required package in the terminal:
apt-get update
apt-get install libbrotli-dev/experimental
Debian has no prebuilt package for the Nginx module, so you need to grab it from Google's repository and save it under /opt/ngx_brotli:
cd /opt
git clone https://github.com/google/ngx_brotli
cd /opt/ngx_brotli
git submodule update --init
Next, fetch the source code of the official Debian Nginx package:
cd /usr/src
mkdir nginx
cd nginx
apt-get source nginx
cd nginx-1.10.3
The sources contain a debian/rules file, which defines Debian-specific settings. Now, search for the extras entry (this is the Nginx variant with the most modules) and add the following line:
--add-module=/opt/ngx_brotli
Once all of the necessary files are in place and configured, build the new Debian packages:
dpkg-buildpackage -b
The packages end up in the /usr/src/nginx directory and can be installed via dpkg:
dpkg -i nginx-common_*.deb libnginx-mod-*.deb nginx-extras_*.deb
To prevent the system from unintentionally updating the newly installed packages, you can set them to hold with Apt:
apt-mark hold $(dpkg --get-selections | grep nginx | sed "s/\t.*//" | xargs)
The stable Nginx with the additional Brotli module is now complete.
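Before moving on, it can be worth checking that the rebuilt binary really contains the module. The following is a minimal sketch: has_brotli_module is a helper name invented for this example, and it simply looks for the module path in the configure arguments that nginx -V prints.

```shell
# Succeeds if the configure arguments read from stdin mention ngx_brotli.
has_brotli_module() {
    grep -q -e 'ngx_brotli'
}

# Typical use on the server (nginx -V writes its build info to stderr):
#   nginx -V 2>&1 | has_brotli_module && echo "Brotli module compiled in"
#   apt-mark showhold    # should list the held nginx packages
```

If the check fails, the --add-module line most likely did not end up in the extras section of debian/rules.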
A Need for HTTPS
Before retrieving a page, the web browser tells the web server which compression methods it can process by sending an Accept-Encoding header with its HTTP GET request. To ensure that old HTTP proxies do not trip up over Brotli compression on the way from the server to the browser, browsers only ask for Brotli compression if TLS connections (SSL) are used.
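To make the negotiation concrete, the following sketch checks whether an Accept-Encoding value offers Brotli, which is identified by the token br. The function name offers_brotli is hypothetical, and the check deliberately ignores quality values such as q=0, so it is a simplification of what a real server does:

```shell
# Succeeds if the Accept-Encoding value in $1 lists the "br" token.
# Simplification: parameters after ";" (e.g., q-values) are dropped.
offers_brotli() {
    printf '%s\n' "$1" | tr ',' '\n' \
        | sed 's/;.*//; s/[[:space:]]//g' \
        | grep -qx 'br'
}

# A header as typically sent over HTTPS:
offers_brotli 'gzip, deflate, br' && echo "client accepts Brotli"
# A header as typically sent over plain HTTP:
offers_brotli 'gzip, deflate' || echo "client does not accept Brotli"
```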
Fortunately, the free Let's Encrypt [6] certification service makes light work of configuring SSL on the web server [7]. Install the Let's Encrypt client on Debian with apt-get:
apt-get install letsencrypt
The example uses the www.linux-magazin.de domain and creates a directory for it:
mkdir -p /var/www/www.linux-magazin.de/.well-known
To prove to Let's Encrypt that you have control over this domain, you first need to set up a simple HTTP server. Store the Nginx configuration (Listing 2) in the /etc/nginx/sites-available/www.linux-magazin.de file, enable the configuration using a link to the /etc/nginx/sites-enabled/ directory, and restart Nginx:
Listing 2
Nginx HTTP Server Config
server {
    listen 80;
    server_name www.linux-magazin.de;

    root /var/www/www.linux-magazin.de;
    index index.html index.htm;

    # Let's Encrypt Challenge
    #
    location ~ /.well-known {
        allow all;
    }

    location / {
        try_files $uri $uri/ =404;
    }
}
ln -s /etc/nginx/sites-available/www.linux-magazin.de /etc/nginx/sites-enabled/www.linux-magazin.de
service nginx restart
Next, call the Let's Encrypt certbot program with the desired URL:
certbot certonly --webroot -w /var/www/www.linux-magazin.de -d www.linux-magazin.de
The software stores the SSL certificates in the directory structure below /etc/letsencrypt/. You can now return to the Nginx configuration of the web server (Listing 3) to configure HTTPS access, enable HTTP/2 to improve web performance, and set up automatic redirection of all HTTP requests to HTTPS. Then, restart Nginx:
service nginx restart
Listing 3
Nginx Web Server Config
# Redirection of HTTP requests to HTTPS
#
server {
    listen 80;
    server_name www.linux-magazin.de;

    root /var/www/www.linux-magazin.de;
    index index.html index.htm;

    # Let's Encrypt Challenge
    #
    location ~ /.well-known {
        allow all;
    }

    location / {
        rewrite ^/(.*)$ https://www.linux-magazin.de/$1 permanent;
        rewrite ^/$ https://www.linux-magazin.de/ permanent;
    }
}

# HTTPS configuration
#
server {
    listen 443 ssl http2;
    server_name www.linux-magazin.de;

    # Letsencrypt-SSL certificate
    ssl_certificate /etc/letsencrypt/live/www.linux-magazin.de/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/www.linux-magazin.de/privkey.pem;

    # Cache Connection-Credentials
    ssl_session_cache shared:SSL:20m;
    ssl_session_timeout 180m;

    root /var/www/www.linux-magazin.de/current;
    index index.html index.htm;

    # Brotli-Settings
    #
    brotli on;
    brotli_comp_level 5;
    brotli_static on;
    brotli_types text/html text/plain text/css application/javascript application/x-javascript text/xml application/xml application/xml+rss text/javascript image/x-icon image/vnd.microsoft.icon image/bmp image/svg+xml;

    location / {
        try_files $uri $uri/ =404;
    }
}
Store an HTML file in the /var/www/www.linux-magazin.de/current directory (the document root set in Listing 3) and retrieve it with Brotli compression. To change the compression performance of Brotli in on-the-fly compression, brotli_comp_level can be set to values from 1 to 11.
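One way to confirm that the server really answers with Brotli is to request a page with curl and inspect the Content-Encoding response header. The sketch below assumes that curl is installed and that a file named index.html exists in the document root; both the filename and the helper name content_encoding are assumptions made for this example.

```shell
# Print the Content-Encoding value from HTTP response headers on stdin.
# Empty output means the response was not compressed.
content_encoding() {
    tr -d '\r' \
        | awk -F': *' 'tolower($1) == "content-encoding" { print $2 }'
}

# Typical use against the example host (needs network access):
#   curl -s -o /dev/null -D - -H 'Accept-Encoding: br' \
#       https://www.linux-magazin.de/index.html | content_encoding
```

If the module is active, the pipeline should print br; without the Accept-Encoding header, the same request should print nothing or another encoding.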
As with Gzip, higher compression values in Brotli achieve better compression but require more time and CPU power. Values between 4 and 6 are generally considered a good compromise. Anyone who pre-compresses static files for optimum performance and stores them with the .br file ending will do well with a value of 11.
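Pre-compression of this kind can be scripted. The following sketch assumes a brotli command-line tool that accepts the options -q (quality), -f (overwrite existing output), and -k (keep the input file), as current upstream releases do; the older package in Debian 9 may use different option names, so check brotli --help first. The function name precompress_static and the list of file extensions are choices made for this example.

```shell
# Create a .br copy next to every HTML, CSS, and JavaScript file
# below the directory given as $1, using the highest quality level.
precompress_static() {
    find "$1" -type f \
        \( -name '*.html' -o -name '*.css' -o -name '*.js' \) \
        -exec brotli -f -k -q 11 {} \;
}

# Typical use with the document root from Listing 3:
#   precompress_static /var/www/www.linux-magazin.de/current
```

With brotli_static on, Nginx can then deliver the stored .br file instead of compressing the original on each request. The copies need to be regenerated whenever the original files change.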