[Don't] use a CDN

Almost every article or tool for optimizing site speed includes a modest “use a CDN” item. CDN stands for content delivery network. At Method Lab we often get questions from clients on this topic, and some of them set up a CDN on their own. The purpose of this article is to understand what a CDN can offer in terms of site loading speed, what problems may arise, and when using a CDN is justified.

[Figure: the delays circled in the picture are caused by the use of a CDN.]

A bit of history

Like many technologies, CDNs appeared out of necessity. As Internet users got faster connections, online video services emerged. Naturally, video content requires orders of magnitude more bandwidth than regular website content (images, text, and CSS or JS code).

When you try to broadcast a video stream to many clients simultaneously from one server, the server's Internet channel will most likely become the bottleneck. As a rule, several thousand streams are enough to saturate a typical server channel. Of course, other resources may also hit their limits, but they are not important here. What matters is that expanding the server channel is too expensive (and sometimes impossible), and also impractical, because the load on the channel during broadcasts is cyclical.

A CDN solves the problem of a single server's limited channel perfectly. Clients connect not to the server directly but to the nodes of the CDN. Ideally, the server sends one stream to a CDN node, and the network then uses its own resources to deliver this stream to many users. Economically, we pay only for the resources actually consumed (bandwidth or traffic) and get excellent scalability for our service. Using a CDN to deliver heavy content is fully justified and logical. It is worth noting, though, that the biggest players in the field (like Netflix) build their own CDNs instead of using large commercial ones (Akamai, Cloudflare, Fastly, etc.).

As the web evolved, web applications themselves became more complex and heavier. The problem of loading speed came to the fore. Website speed enthusiasts quickly identified several major issues that made websites load slowly. One of them was network latency (RTT, round-trip time, or ping time). Latency affects many stages of site loading: establishing a TCP connection, starting a TLS session, and loading each individual resource (image, JS file, HTML document, etc.).

The problem was exacerbated by the fact that with HTTP/1.1 (the only option before SPDY, QUIC, and HTTP/2 appeared), browsers open no more than 6 TCP connections to a single host. All this led to idle connections and inefficient use of channel bandwidth. The problem was partially solved by domain sharding: creating additional hosts to get around the connection limit.

This is where the second ability of a CDN comes in: reducing latency (RTT) thanks to a large number of points of presence located close to users. Distance plays a decisive role here, because the speed of light is limited (about 200,000 km/s in optical fiber). This means that every 1000 km of path adds 5 ms of one-way delay, or 10 ms of RTT. This is the minimum transmission time, since intermediate equipment adds further delays.

Since CDNs can usually cache objects on their servers, we can benefit from loading such objects through the CDN. There are two prerequisites: the object must be in the cache, and the CDN point must be closer to the user than the web application server (origin server).

It is important to understand that the geographical proximity of a CDN node does not guarantee low latency. Routing between the client and the CDN may be built so that the client connects to a host in another country, or even on another continent. This is where the relationships between telecom operators and the CDN service come into play (peering, availability of interconnects, participation in IXs, etc.), along with the traffic routing policy of the CDN itself. For example, Cloudflare on its two entry-level plans (free and cheap) does not guarantee delivery of content from the nearest node; the host is chosen to minimize cost.
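The relationship between distance and minimum latency can be sketched in a few lines of Python. This is only an illustration under one assumption: a signal speed of roughly 200,000 km/s in optical fiber. The function names are made up for this example, and real paths are longer than straight lines and pass through equipment that adds its own delay.

```python
# Assumption: light travels at roughly 200,000 km/s in optical fiber.
FIBER_SPEED_KM_PER_S = 200_000


def min_one_way_delay_ms(distance_km: float) -> float:
    """Lower bound on one-way propagation delay over fiber, in milliseconds."""
    return distance_km / FIBER_SPEED_KM_PER_S * 1000


def min_rtt_ms(distance_km: float) -> float:
    """Lower bound on round-trip time: the signal covers the path twice."""
    return 2 * min_one_way_delay_ms(distance_km)


if __name__ == "__main__":
    for km in (100, 1000, 5000):
        print(f"{km} km: one-way >= {min_one_way_delay_ms(km):.1f} ms, "
              f"RTT >= {min_rtt_ms(km):.1f} ms")
```

For a 1000 km path this gives a floor of 5 ms one way and 10 ms round trip; measured values will be higher.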

Many leading Internet companies have drawn the attention of web developers and service owners to loading speed and site performance. Among them are Yahoo (the YSlow tool), AOL (WebPageTest), and Google (the PageSpeed Insights service), which develop their own recommendations for speeding up sites (primarily concerning client-side optimization). More recently, new site speed testing tools have appeared that also provide tips for improving site speed. Each of these services or plugins includes the invariable recommendation "Use a CDN". The reduction in network latency is usually cited as the explanation for the CDN effect. Unfortunately, not everyone is ready to dig into exactly how the CDN acceleration effect is achieved and how it can be measured, so the recommendation is taken on faith and treated as a postulate. In fact, not all CDNs are equally useful.

Using a CDN Today

To assess the usefulness of CDNs, we need to classify them. Here is what can be found in practice today (the examples in brackets are, of course, not exhaustive):

  1. Free CDN for distributing JS libraries (MaxCDN, Google, Yandex).
  2. CDN of client-side optimization services (for example, Google Fonts for fonts, Cloudinary, Cloudimage for images).
  3. CDN for static files and resource optimization in CMSs (available in Bitrix, WordPress and others).
  4. General purpose CDN (StackPath, CDNVideo, NGENIX, Megafon).
  5. CDN for website acceleration (Cloudflare, Imperva, Airee).

The key difference between these types is how much of the traffic goes through the CDN. Types 1-3 deliver only part of the content: from one request to several dozen (usually images). Types 4 and 5 proxy all traffic through the CDN.

In practice, this determines the number of connections used to load the site. With HTTP/2, a single TCP connection to a host handles any number of requests. If we split resources between the main host (origin) and a CDN, requests have to be spread across several domains and several TCP connections have to be created. In the worst case, each new connection costs DNS (1 RTT) + TCP (1 RTT) + TLS (2-3 RTT) = 4-5 RTT. This formula does not account for the delays in mobile networks for activating the device's radio channel (if it was not active) or the delays at the cell tower.

Here's what it looks like on the site loading waterfall (the delays for connecting to the CDN are highlighted; RTT is 150 ms):

[Figure: site loading waterfall with the CDN connection delays highlighted, RTT 150 ms]
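To put numbers on the cost of one extra host, the sketch below sums only the three steps named in the formula above (DNS 1 RTT, TCP 1 RTT, TLS 2-3 RTT), which comes to 4-5 round trips, and converts that to milliseconds. The dictionary and function names are illustrative, and the figures are the worst-case ones from the text, not measurements.

```python
# Worst-case round trips before the first request can go to a new host.
# (low, high) per step, taken from the formula in the text.
SETUP_STEPS_RTT = {
    "dns": (1, 1),
    "tcp": (1, 1),
    "tls": (2, 3),  # full handshake without resumption
}


def setup_rtt_range(steps=SETUP_STEPS_RTT):
    """Return (min, max) round trips needed to set up one new connection."""
    low = sum(lo for lo, _ in steps.values())
    high = sum(hi for _, hi in steps.values())
    return low, high


def setup_delay_ms(rtt_ms, steps=SETUP_STEPS_RTT):
    """Return (min, max) setup delay in milliseconds for a given RTT."""
    low, high = setup_rtt_range(steps)
    return low * rtt_ms, high * rtt_ms


if __name__ == "__main__":
    print(setup_rtt_range())
    print(setup_delay_ms(150))
```

At the 150 ms RTT used in the waterfall, this works out to 600-750 ms spent before the first byte of the first resource is even requested from the additional host.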

If the CDN covers all site traffic (except for third-party services), then we can use a single TCP connection, saving connection delays to additional hosts. Of course, this applies to HTTP/2 connections.

Further differences come down to the functionality of a particular CDN: for the first type it is just hosting a static file, while for the fifth it is modifying several types of site content in order to optimize them.

CDN features for website acceleration

Let's describe the full range of CDN capabilities for speeding up sites, without looking at the functionality of individual CDN types, and then see what is implemented in each of them.

1. Compressing text resources

This is the most basic and obvious feature, yet it is often poorly implemented. Every CDN advertises compression as an acceleration feature. But a closer look reveals shortcomings:

  • low levels may be used for dynamic compression: 5-6 (for gzip, for example, the maximum is 9);
  • static compression (files in the cache) does not use additional options (for example, zopfli or brotli at level 11);
  • there is no support for the more efficient brotli compression (about 20% savings compared to gzip).

If you use a CDN, check these points: take a file served by the CDN, record its compressed size, and compress it manually for comparison (you can use an online service with brotli support, for example vsoszhat.rf).
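The manual comparison can also be done locally. The following is a minimal sketch using only Python's standard gzip module; SAMPLE is stand-in data invented for this example, and in practice you would read the bytes of the file you downloaded from the CDN instead. Brotli is not part of the standard library, so it is left out here; a third-party package would be needed for that part of the check.

```python
import gzip

# Stand-in for a real CSS/JS file; replace with the bytes of your own file.
SAMPLE = b"body { margin: 0; padding: 0; font-family: sans-serif; }\n" * 2000


def gzip_sizes(data: bytes, levels=(1, 6, 9)) -> dict:
    """Return {level: compressed size in bytes} for each gzip level."""
    return {
        level: len(gzip.compress(data, compresslevel=level))
        for level in levels
    }


if __name__ == "__main__":
    print(f"original: {len(SAMPLE)} bytes")
    for level, size in gzip_sizes(SAMPLE).items():
        print(f"gzip level {level}: {size} bytes")
```

If the size reported by the CDN for the same file is noticeably larger than what level 9 produces locally, the CDN is probably compressing at a lower level.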

2. Setting client-side caching headers

This is also a simple acceleration feature: setting headers so that content is cached by the client (browser). The most relevant header is Cache-Control; Expires is obsolete. ETag can be used in addition. The main thing is that the Cache-Control max-age should be large enough (a month or more). If you are ready to cache the resource as aggressively as possible, you can add the immutable directive.

CDNs can lower the max-age value, forcing the user to re-download static files more often. It is not clear why: either out of a desire to increase traffic on the network, or to improve compatibility with sites that do not know how to invalidate the cache. For example, the default cache time in Cloudflare headers is 1 hour, which is very low for immutable static files.
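A quick way to audit what a CDN actually sends is to parse the Cache-Control value and compare max-age with the one-month threshold mentioned above. The sketch below is a simplified parser written for this illustration (the function names are not from any library), and it treats "a month" as 30 days, which is an assumption.

```python
MONTH_SECONDS = 30 * 24 * 3600  # assumption: "a month" means 30 days


def parse_cache_control(header: str) -> dict:
    """Split a Cache-Control value into {directive: value or None}."""
    directives = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') or None
    return directives


def is_long_lived(header: str, min_age: int = MONTH_SECONDS) -> bool:
    """True if the header allows caching for at least min_age seconds."""
    directives = parse_cache_control(header)
    if "no-store" in directives or "no-cache" in directives:
        return False
    max_age = directives.get("max-age")
    return max_age is not None and max_age.isdigit() and int(max_age) >= min_age


if __name__ == "__main__":
    for value in ("public, max-age=31536000, immutable", "public, max-age=3600"):
        print(value, "->", is_long_lived(value))
```

A one-hour max-age fails this check, while a one-year value with immutable passes it.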

3. Image optimization

Since the CDN takes on the functions of caching and serving images, it would be logical to optimize them on the CDN side and serve them to users in that form. One caveat up front: this feature is available only for CDN types 2, 3 and 5.

There are many ways to optimize images: using advanced compression formats (such as WebP), more efficient encoders (MozJPEG), or simply cleaning up redundant metadata.

In general, there are two types of such optimizations: lossy and lossless. CDNs usually prefer lossless optimization in order to avoid customer complaints about changed image quality. Under these conditions, the gain is minimal. In reality, the JPEG quality level is often much higher than necessary, and images can safely be recompressed at a lower quality without harming the user experience. On the other hand, it is difficult to choose quality levels and settings that suit every possible web application, so CDNs use more conservative settings than those that could be applied with knowledge of the context (the purpose of the image, the type of web application, etc.).

4. TLS connection optimization

Most traffic today is transmitted over TLS connections, which means that we spend extra time on TLS negotiation. Recently, new technologies have been developed to speed up this process: elliptic-curve (EC) cryptography, TLS 1.3, session caching and session tickets, hardware-accelerated encryption (AES-NI), etc. Proper TLS settings can reduce connection time to 0-1 RTT (not counting DNS and TCP).

With modern software, it is not difficult to implement these practices on your own servers.

Not all CDNs implement TLS best practices. You can check this by measuring the TLS connection time (for example, in WebPageTest). For a new connection, 1 RTT is ideal, 2 RTT is average, and 3 RTT or more is bad.
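The rating scale above can be applied to measured timings with a small helper. This sketch assumes you already have two numbers from a test run: the TLS negotiation time and an RTT estimate (the TCP connect time of the same connection is a reasonable approximation of one RTT). The function names are hypothetical.

```python
def tls_round_trips(tls_time_ms: float, rtt_ms: float) -> int:
    """Estimate how many round trips the TLS negotiation took."""
    if rtt_ms <= 0:
        raise ValueError("rtt_ms must be positive")
    # Rounding absorbs server processing time and measurement jitter.
    return max(1, round(tls_time_ms / rtt_ms))


def rate_tls(tls_time_ms: float, rtt_ms: float) -> str:
    """Rate a new TLS connection: 1 RTT ideal, 2 RTT average, 3+ RTT bad."""
    trips = tls_round_trips(tls_time_ms, rtt_ms)
    if trips <= 1:
        return "ideal"
    if trips == 2:
        return "average"
    return "bad"


if __name__ == "__main__":
    for tls_ms in (160, 310, 470):
        print(f"TLS {tls_ms} ms at RTT 150 ms: {rate_tls(tls_ms, 150)}")
```

Because the estimate is rounded, treat borderline results as a prompt to re-measure rather than as a verdict.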

It should also be noted that even when TLS is terminated at the CDN, the server with our web application must also handle TLS, this time for connections from the CDN, because traffic between the server and the CDN travels over the public network. In the worst case, we get double TLS connection delays (first to the CDN host, then between it and our server).

For some applications, security deserves attention: traffic is usually decrypted at CDN nodes, which creates a potential opportunity for interception. The option of working without exposing traffic is usually offered only in top-tier plans for an additional fee.

5. Reduce connection delays

The main benefit of a CDN that everyone talks about is low latency (a shorter distance) between the CDN host and the user. It is achieved by building a geographically distributed network in which hosts are located where users are concentrated (cities, traffic exchange points, etc.).

In practice, different networks prioritize different regions. For example, Russian CDNs will have more points of presence in Russia, while American ones will primarily develop their networks in the USA. For example, Cloudflare, one of the largest CDNs, has only 2 points in Russia: Moscow and St. Petersburg. That is, we can save at most about 10 ms of latency compared to hosting directly in Moscow.

Most Western CDNs don't have points of presence in Russia at all. By connecting to them, you can only increase latency for your Russian audience.

6. Content optimization (minification, structural changes)

This is the most complex and technically demanding item. Changing content on delivery can be very risky. Even minification, which shrinks the source code (by removing extra spaces, unimportant constructs, etc.), can break it. More serious changes, such as moving JS code to the end of the HTML or merging files, carry an even higher risk of breaking the site's functionality.

Therefore, only a few type 5 CDNs do this. Of course, not all the changes needed for acceleration can be automated; manual analysis and optimization are required. For example, removing unused or duplicate code is purely manual work.

As a rule, all such optimizations are controlled by settings and the most dangerous ones are disabled by default.

Support for accelerating capabilities by CDN type

So, let's see which potential acceleration features the different types of CDN provide.

For convenience, we repeat the classification.

  1. Free CDN for distributing JS libraries (MaxCDN, Google, Yandex).
  2. CDN of client-side optimization services (for example, Google Fonts for fonts, Cloudinary, Cloudimage for images).
  3. CDN for static files and resource optimization in CMSs (available in Bitrix, WordPress and others).
  4. General purpose CDN (StackPath, CDNVideo, NGENIX, Megafon).
  5. CDN for website acceleration (Cloudflare, Imperva, Airee).

Now let's compare the features and types of CDN.

Feature               Type 1   Type 2   Type 3   Type 4   Type 5
Text compression      +-       -        +-       +-       +
Cache headers         +        +        +        +        +
Images                -        +-       +-       -        +
TLS                   -        -        -        +-       +
Latency               -        -        -        +        +
Content optimization  -        -        -        -        +

In this table, "+" indicates full support, "-" no support, and "+-" partial support. Of course, deviations from this table are possible in reality (for example, some general-purpose CDNs implement image optimization features), but it is useful as a general picture.

Results

Hopefully, after reading this article, you will have a clearer picture of the “use a CDN” recommendation to speed up your sites.

As in any business, you should not trust the marketing promises of any service. The effect must be measured and tested in real conditions. If you are already using a CDN, check its effectiveness against the criteria described in this article.

It's possible that using a CDN right now is slowing your site down.

As a general recommendation: study your audience and determine its geographical scope. If your main audience is concentrated within a radius of 1-2 thousand kilometers, you do not need a CDN for its main purpose, reducing latency. Instead, you can place your server closer to your users and configure it properly, getting most of the optimizations described in this article (for free and permanently).

If your audience is truly geographically distributed (over a radius of more than 3000 kilometers), a quality CDN will really be useful. However, you need to understand in advance what exactly your CDN can accelerate (see the feature table and descriptions above). At the same time, website acceleration remains a complex task that cannot be solved just by connecting a CDN. Beyond the optimizations listed here, a CDN leaves out the most effective means of acceleration: server-side optimization and deeper client-side changes (removing unused code, optimizing the rendering process, working with content, fonts, responsiveness, etc.).
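As a rough aid for the first step, estimating the audience radius, the sketch below computes great-circle distances from a server to a set of user locations and applies the thresholds from the recommendation. Several things here are assumptions made for the example: the radius is measured from the server location, "1-2 thousand kilometers" is read as an upper bound of 2000 km, the range between 2000 and 3000 km (which the text does not cover) is labeled borderline, and the coordinates are approximate.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    p1, p2 = radians(lat1), radians(lat2)
    dphi = p2 - p1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(p1) * cos(p2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def audience_radius_km(server, users):
    """Distance from the server to its farthest user location."""
    return max(haversine_km(*server, *user) for user in users)


def cdn_advice(radius_km):
    """Apply the article's thresholds; the middle band is an assumption."""
    if radius_km <= 2000:
        return "host close to users; a CDN is not needed for latency"
    if radius_km > 3000:
        return "a quality CDN is likely to help"
    return "borderline; measure before deciding"


if __name__ == "__main__":
    moscow = (55.7558, 37.6173)           # approximate coordinates
    users = [(59.9343, 30.3351),          # St. Petersburg
             (55.0084, 82.9357)]          # Novosibirsk
    radius = audience_radius_km(moscow, users)
    print(f"radius: {radius:.0f} km -> {cdn_advice(radius)}")
```

Geographic distance is only a proxy: as noted earlier, routing can make a nearby node slow, so the result should be confirmed with real latency measurements.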

Source: habr.com
