How to secure your public site with ESNI

Hello Habr! My name is Ilya, and I work on the platform team at Exness. We develop and implement the basic infrastructure components that our product development teams use.

In this article, I would like to share my experience of implementing encrypted SNI (ESNI) technology in the infrastructure of public websites.


Using this technology raises the level of security when working with a public website and helps us comply with the company's internal security standards.

First of all, I want to note that the technology is not standardized and is still a draft. However, CloudFlare and Mozilla already support it (in draft01), which motivated us to run this experiment.

Some theory

ESNI is an extension to the TLS 1.3 protocol that allows SNI to be encrypted in the TLS handshake "Client Hello" message. Here is what Client Hello looks like with ESNI support (instead of the usual SNI, we see ESNI):

[Image: Client Hello message with the ESNI extension]

To use ESNI, three things are needed:

  • DNS;
  • Client support;
  • Server-side support.

DNS

You need to add two DNS records, A and TXT (the TXT record contains the public key with which the client can encrypt the SNI) - see below. In addition, DoH (DNS over HTTPS) must be supported, because the available clients (see below) do not enable ESNI without DoH. This is logical: ESNI encrypts the name of the resource we are accessing, so it makes no sense to then look that name up in plaintext DNS over UDP. Moreover, using DNSSEC protects against "cache poisoning" attacks in this scenario.
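The two lookups can be sketched in Python as follows. This is only an illustration under stated assumptions: it targets the Google JSON endpoint that the curl examples in this article use, and the helper names `esni_query_name`, `doh_url` and `doh_resolve` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the Google DoH JSON endpoint used in the curl examples of this article.
DOH_ENDPOINT = "https://dns.google.com/resolve"


def esni_query_name(fqdn: str) -> str:
    """Return the name that holds the ESNI TXT record: _esni.<FQDN>."""
    return "_esni." + fqdn.rstrip(".")


def doh_url(name: str, rtype: str) -> str:
    """Build the DoH JSON query URL for a record name and type."""
    query = urllib.parse.urlencode({"name": name, "type": rtype})
    return f"{DOH_ENDPOINT}?{query}"


def doh_resolve(name: str, rtype: str) -> list:
    """Send the query over HTTPS and return the 'data' field of each answer."""
    request = urllib.request.Request(
        doh_url(name, rtype), headers={"accept": "application/dns+json"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        reply = json.load(response)
    return [answer["data"] for answer in reply.get("Answer", [])]


# The two lookups a client needs for one host (only the URLs are built here,
# no network request is sent).
host = "www.cloudflare.com"
print(doh_url(host, "A"))
print(doh_url(esni_query_name(host), "TXT"))
```

Calling `doh_resolve(esni_query_name(host), "TXT")` would perform the actual request; it is left out of the demo so that the sketch runs without network access.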

Several DoH providers are currently available.

CloudFlare states (Check My Browser → Encrypted SNI → Learn More) that their servers already support ESNI, that is, there are at least two records in DNS for CloudFlare servers - A and TXT. In the example below, we query Google DNS (over HTTPS):

A record:

curl 'https://dns.google.com/resolve?name=www.cloudflare.com&type=A' \
  -s -H 'accept: application/dns+json'
{
  "Status": 0,
  "TC": false,
  "RD": true,
  "RA": true,
  "AD": true,
  "CD": false,
  "Question": [
    {
      "name": "www.cloudflare.com.",
      "type": 1
    }
  ],
  "Answer": [
    {
      "name": "www.cloudflare.com.",
      "type": 1,
      "TTL": 257,
      "data": "104.17.210.9"
    },
    {
      "name": "www.cloudflare.com.",
      "type": 1,
      "TTL": 257,
      "data": "104.17.209.9"
    }
  ]
}

TXT record (the query name is built from the template _esni.FQDN):

curl 'https://dns.google.com/resolve?name=_esni.www.cloudflare.com&type=TXT' \
  -s -H 'accept: application/dns+json'
{
  "Status": 0,
  "TC": false,
  "RD": true,
  "RA": true,
  "AD": true,
  "CD": false,
  "Question": [
    {
      "name": "_esni.www.cloudflare.com.",
      "type": 16
    }
  ],
  "Answer": [
    {
      "name": "_esni.www.cloudflare.com.",
      "type": 16,
      "TTL": 1799,
      "data": "\"/wEUgUKlACQAHQAg9SiAYQ9aUseUZr47HYHvF5jkt3aZ5802eAMJPhRz1QgAAhMBAQQAAAAAXtUmAAAAAABe3Q8AAAA=\""
    }
  ],
  "Comment": "Response from 2400:cb00:2049:1::a29f:209."
}
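To see what the client actually gets from this TXT record, the base64 value can be unpacked into its fields. The sketch below assumes the ESNIKeys layout as I read it in draft-01 (version, checksum, key shares, cipher suites, padded length, validity window, extensions). The `ESNIKeys` dataclass and `parse_esni_keys` are hypothetical names, and the checksum is read but not verified.

```python
import base64
import struct
from dataclasses import dataclass

# TXT value copied from the _esni.www.cloudflare.com response above.
TXT_VALUE = (
    "/wEUgUKlACQAHQAg9SiAYQ9aUseUZr47HYHvF5jkt3aZ5802eAMJPhRz1QgA"
    "AhMBAQQAAAAAXtUmAAAAAABe3Q8AAAA="
)


@dataclass
class ESNIKeys:
    version: int
    checksum: bytes
    keys: list           # (named group, public key bytes) pairs
    cipher_suites: list  # TLS cipher suite code points
    padded_length: int
    not_before: int      # Unix time, seconds
    not_after: int       # Unix time, seconds
    extensions: bytes


def parse_esni_keys(txt_value: str) -> ESNIKeys:
    """Parse a base64 ESNIKeys record, assuming the draft-01 field layout."""
    raw = base64.b64decode(txt_value.strip('"'))
    pos = 0

    def take(n: int) -> bytes:
        nonlocal pos
        if pos + n > len(raw):
            raise ValueError("truncated ESNIKeys record")
        chunk = raw[pos:pos + n]
        pos += n
        return chunk

    def take_uint(n: int) -> int:
        return int.from_bytes(take(n), "big")

    version = take_uint(2)
    checksum = take(4)

    # keys: length-prefixed list of (group, length-prefixed public key) entries
    keys_blob = take(take_uint(2))
    keys, i = [], 0
    while i < len(keys_blob):
        group, key_len = struct.unpack_from("!HH", keys_blob, i)
        keys.append((group, keys_blob[i + 4:i + 4 + key_len]))
        i += 4 + key_len

    # cipher_suites: length-prefixed list of 2-byte code points
    suites_blob = take(take_uint(2))
    cipher_suites = [
        int.from_bytes(suites_blob[j:j + 2], "big")
        for j in range(0, len(suites_blob), 2)
    ]

    padded_length = take_uint(2)
    not_before = take_uint(8)
    not_after = take_uint(8)
    extensions = take(take_uint(2))
    return ESNIKeys(version, checksum, keys, cipher_suites,
                    padded_length, not_before, not_after, extensions)


parsed = parse_esni_keys(TXT_VALUE)
print(hex(parsed.version), [hex(group) for group, _ in parsed.keys],
      [hex(suite) for suite in parsed.cipher_suites], parsed.padded_length)
```

For the record above this yields version 0xff01, one X25519 key share (group 0x001d, 32 bytes), a single cipher suite 0x1301 (TLS_AES_128_GCM_SHA256) and a padded length of 260. The validity window spans six days, which hints at how often the keys are rotated.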

So, in terms of DNS, we should use DoH (preferably with DNSSEC) and add two entries. 

Client support

If we are talking about browsers, support is currently implemented only in Firefox. Here are instructions on how to enable ESNI and DoH support in Firefox. After the browser is configured, we should see something like this:

[Image: browser check page reporting that Encrypted SNI is enabled]

Link to check the browser.

Of course, TLS 1.3 must be used to support ESNI, as ESNI is an extension to TLS 1.3.

To test the ESNI-enabled backend, we implemented a client in Go, but more on that later.

Server side support

Currently, ESNI is not supported by web servers such as nginx or Apache, because they handle TLS via OpenSSL/BoringSSL, where ESNI is not officially supported.

Therefore, we decided to create our own front-end component (an ESNI reverse proxy) that terminates TLS 1.3 with ESNI and proxies HTTP(S) traffic to an upstream that does not support ESNI. This makes it possible to apply the technology in an already established infrastructure without changing the main components, that is, while keeping the current web servers that do not support ESNI.

For clarity, here is a diagram:

[Diagram: clients connect to the ESNI reverse proxy, which terminates TLS and proxies HTTP(S) traffic to the upstream]

Note that the proxy was designed so that it can also terminate TLS connections without ESNI, in order to support clients that lack ESNI. The protocol used to talk to the upstream can be either HTTP or HTTPS with a TLS version below 1.3 (if the upstream does not support 1.3). This scheme gives maximum flexibility.
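The upstream leg of this scheme can be illustrated with the standard Python `ssl` module. This is a minimal sketch, not our proxy's code: `upstream_context` is a hypothetical helper, and Python's `ssl` module has no ESNI support, so only the version cap towards the upstream is shown.

```python
import ssl


def upstream_context(
    max_version: ssl.TLSVersion = ssl.TLSVersion.TLSv1_2,
) -> ssl.SSLContext:
    """Client-side TLS context for the proxy -> upstream connection.

    The cap defaults to TLS 1.2 for an upstream that cannot speak TLS 1.3.
    """
    context = ssl.create_default_context()
    context.maximum_version = max_version
    return context


legacy = upstream_context()
modern = upstream_context(ssl.TLSVersion.TLSv1_3)
print(legacy.maximum_version.name, modern.maximum_version.name)
```

For a plain HTTP upstream no context is needed at all; the proxy simply forwards the decrypted request.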

We borrowed the Go implementation of ESNI support from CloudFlare. I note right away that the implementation itself is quite non-trivial, since it involves changes to the standard library package crypto/tls and therefore requires "patching" GOROOT before building.

To generate ESNI keys, we used esnitool (also the brainchild of CloudFlare). These keys are used to encrypt/decrypt the SNI.
We have tested the build with Go 1.13 on Linux (Debian, Alpine) and macOS.

A few words about operational features

The ESNI reverse proxy exposes metrics in Prometheus format, such as RPS, upstream latency and response codes, failed/successful TLS handshakes, and TLS handshake duration. At first glance, this seemed sufficient to evaluate how the proxy handles traffic.
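As an illustration of what such a metric looks like on the wire, here is a small sketch that renders a handshake counter in the Prometheus text exposition format. The metric name `esni_proxy_tls_handshakes_total` and its labels are hypothetical (the real names used by our proxy are not listed in this article), and a real service would normally use a Prometheus client library instead of formatting the text by hand.

```python
from collections import Counter

# Hypothetical metric name and labels, chosen only for this illustration.
METRIC = "esni_proxy_tls_handshakes_total"

handshakes = Counter()


def record_handshake(esni: bool, ok: bool) -> None:
    """Count one TLS handshake, labelled by ESNI use and outcome."""
    labels = ("true" if esni else "false", "success" if ok else "failed")
    handshakes[labels] += 1


def render_metrics() -> str:
    """Render the counter in the Prometheus text exposition format."""
    lines = [
        f"# HELP {METRIC} TLS handshakes by ESNI use and result.",
        f"# TYPE {METRIC} counter",
    ]
    for (esni, result), value in sorted(handshakes.items()):
        lines.append(f'{METRIC}{{esni="{esni}",result="{result}"}} {value}')
    return "\n".join(lines) + "\n"


record_handshake(esni=True, ok=True)
record_handshake(esni=True, ok=True)
record_handshake(esni=False, ok=False)
print(render_metrics(), end="")
```

Splitting the counter by an `esni` label makes it possible to see what share of clients actually uses ESNI, and whether handshake failures are concentrated on one side.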

We also ran a load test before putting the proxy into use. The results are below:

wrk -t50 -c1000 -d360s 'https://esni-rev-proxy.npw:443' --timeout 15s
Running 6m test @ https://esni-rev-proxy.npw:443
  50 threads and 1000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.77s     1.21s    7.20s    65.43%
    Req/Sec    13.78      8.84   140.00     83.70%
  206357 requests in 6.00m, 6.08GB read
Requests/sec:    573.07
Transfer/sec:     17.28MB 

The load testing was purely qualitative: its goal was to compare the scheme with the ESNI reverse proxy against the scheme without it. We generated the traffic locally in order to eliminate "interference" from intermediate components.

So, with ESNI support and proxying to an HTTP upstream, we got about 550 RPS from one instance, while the average CPU/RAM consumption of the ESNI reverse proxy was:

  • 80% CPU Usage (4 vCPUs, 4 GB RAM hosts, Linux)
  • 130 MB Mem RSS


For comparison, the RPS for the same nginx upstream without TLS termination (plain HTTP) is about 1100:

wrk -t50 -c1000 -d360s 'http://lb.npw:80' --timeout 15s
Running 6m test @ http://lb.npw:80
  50 threads and 1000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.11s     2.30s   15.00s    90.94%
    Req/Sec    23.25     13.55   282.00     79.25%
  393093 requests in 6.00m, 11.35GB read
  Socket errors: connect 0, read 0, write 0, timeout 9555
  Non-2xx or 3xx responses: 8111
Requests/sec:   1091.62
Transfer/sec:     32.27MB 

The presence of timeouts indicates a lack of resources (we used hosts with 4 vCPUs and 4 GB RAM, Linux), so the potential RPS is in fact higher (we got up to 2700 RPS on more powerful hosts).
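The comparison between the two runs comes down to simple arithmetic, sketched below. The figures are copied from the two wrk outputs above; dividing timeouts by completed requests is only a rough indicator, since wrk counts the two separately.

```python
# Figures copied from the two wrk runs in this article.
esni_proxy = {"requests": 206357, "rps": 573.07}
plain_http = {
    "requests": 393093,
    "rps": 1091.62,
    "timeouts": 9555,
    "non_2xx_3xx": 8111,
}

# Throughput through the ESNI reverse proxy relative to the plain HTTP baseline.
throughput_ratio = esni_proxy["rps"] / plain_http["rps"]

# How noisy the baseline run was, relative to its completed requests.
timeout_share = plain_http["timeouts"] / plain_http["requests"]
error_share = plain_http["non_2xx_3xx"] / plain_http["requests"]

print(f"ESNI proxy throughput vs plain HTTP: {throughput_ratio:.0%}")
print(f"Baseline timeouts per completed request: {timeout_share:.1%}")
print(f"Baseline non-2xx/3xx responses: {error_share:.1%}")
```

On the same hosts, the proxy delivers roughly half of the plain HTTP throughput, while the baseline itself shows about 2% timeouts and about 2% error responses, which is why these numbers should be read as a qualitative comparison only.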

In conclusion, I note that ESNI technology looks quite promising. There are still many open issues, such as storing the public ESNI key in DNS and rotating ESNI keys. These issues are being actively discussed, and at the time of writing the latest version of the ESNI draft is already 7.

Source: habr.com
