Hello Habr, my name is Ilya, I work in the Exness platform team. We develop and implement the basic infrastructure components that our product development teams use.
In this article, I would like to share my experience of implementing encrypted SNI (ESNI) technology in the infrastructure of public websites.
Using this technology improves security when working with a public website and helps us comply with the company's internal security standards.
First of all, I want to note that the technology is not standardized and is still a draft; however, CloudFlare and Mozilla already support it (in
Some theory
ESNI is an extension to the TLS 1.3 protocol that allows SNI to be encrypted in the TLS handshake "Client Hello" message. Here is what Client Hello looks like with ESNI support (instead of the usual SNI, we see ESNI):
To use ESNI, three things are needed:
- DNS;
- Client support;
- Server side support.
DNS
You need to add two DNS records, A and TXT (the TXT record contains the public key with which the client encrypts the SNI); see the examples below. In addition, DoH (DNS over HTTPS) support is required, because the available clients (see below) do not enable ESNI without DoH. This is logical: ESNI encrypts the name of the resource we are accessing, so it makes no sense to then resolve that name over unencrypted UDP DNS. Moreover, the use
Currently available
CloudFlare
A record:
curl 'https://dns.google.com/resolve?name=www.cloudflare.com&type=A' \
  -s -H 'accept: application/dns+json'
{
"Status": 0,
"TC": false,
"RD": true,
"RA": true,
"AD": true,
"CD": false,
"Question": [
{
"name": "www.cloudflare.com.",
"type": 1
}
],
"Answer": [
{
"name": "www.cloudflare.com.",
"type": 1,
"TTL": 257,
"data": "104.17.210.9"
},
{
"name": "www.cloudflare.com.",
"type": 1,
"TTL": 257,
"data": "104.17.209.9"
}
]
}
TXT record; the query name is built from the template _esni.FQDN:
curl 'https://dns.google.com/resolve?name=_esni.www.cloudflare.com&type=TXT' \
  -s -H 'accept: application/dns+json'
{
"Status": 0,
"TC": false,
"RD": true,
"RA": true,
"AD": true,
"CD": false,
"Question": [
{
"name": "_esni.www.cloudflare.com.",
"type": 16
}
],
"Answer": [
{
"name": "_esni.www.cloudflare.com.",
"type": 16,
"TTL": 1799,
"data": "\"/wEUgUKlACQAHQAg9SiAYQ9aUseUZr47HYHvF5jkt3aZ5802eAMJPhRz1QgAAhMBAQQAAAAAXtUmAAAAAABe3Q8AAAA=\""
}
],
"Comment": "Response from 2400:cb00:2049:1::a29f:209."
}
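The TXT value above is a base64-encoded ESNIKeys structure. Below is a minimal Go sketch that decodes it. It assumes the wire layout of draft-ietf-tls-esni-02 (version 0xff01: version, checksum, key shares, cipher suites, padded length, validity window, extensions); the article itself does not state the draft revision, so treat the layout as an assumption. The names esniKeys, reader and parseESNIKeys are made up for this illustration. Only the first key share is read, and the checksum is not verified.

```go
package main

import (
	"encoding/base64"
	"encoding/binary"
	"errors"
	"fmt"
	"time"
)

// esniKeys holds the fields of the assumed draft-02 ESNIKeys layout.
type esniKeys struct {
	Version      uint16
	Checksum     [4]byte
	Group        uint16 // named group of the first key share
	PublicKey    []byte
	CipherSuites []uint16
	PaddedLength uint16
	NotBefore    time.Time
	NotAfter     time.Time
}

// reader consumes big-endian fields and remembers a short read.
type reader struct {
	buf []byte
	err error
}

func (r *reader) take(n int) []byte {
	if r.err != nil || len(r.buf) < n {
		r.err = errors.New("esni: record truncated")
		return make([]byte, n) // zeros keep the callers simple
	}
	out := r.buf[:n]
	r.buf = r.buf[n:]
	return out
}

func (r *reader) u16() uint16 { return binary.BigEndian.Uint16(r.take(2)) }
func (r *reader) u64() uint64 { return binary.BigEndian.Uint64(r.take(8)) }

// parseESNIKeys decodes the unquoted TXT value of an _esni record.
func parseESNIKeys(txt string) (*esniKeys, error) {
	raw, err := base64.StdEncoding.DecodeString(txt)
	if err != nil {
		return nil, err
	}
	r := &reader{buf: raw}
	k := &esniKeys{Version: r.u16()}
	copy(k.Checksum[:], r.take(4))

	// Length-prefixed list of key shares; only the first entry is read.
	keys := &reader{buf: r.take(int(r.u16()))}
	k.Group = keys.u16()
	k.PublicKey = keys.take(int(keys.u16()))

	// Length-prefixed list of 2-byte cipher suite identifiers.
	suites := r.take(int(r.u16()))
	for i := 0; i+1 < len(suites); i += 2 {
		k.CipherSuites = append(k.CipherSuites, binary.BigEndian.Uint16(suites[i:]))
	}
	k.PaddedLength = r.u16()
	k.NotBefore = time.Unix(int64(r.u64()), 0).UTC()
	k.NotAfter = time.Unix(int64(r.u64()), 0).UTC()
	r.take(int(r.u16())) // extensions are skipped in this sketch

	if r.err != nil {
		return nil, r.err
	}
	if keys.err != nil {
		return nil, keys.err
	}
	if k.Version != 0xff01 {
		return nil, fmt.Errorf("esni: unexpected version %#04x", k.Version)
	}
	return k, nil
}

func main() {
	const txt = "/wEUgUKlACQAHQAg9SiAYQ9aUseUZr47HYHvF5jkt3aZ5802eAMJPhRz1QgAAhMBAQQAAAAAXtUmAAAAAABe3Q8AAAA="
	k, err := parseESNIKeys(txt)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("version=%#04x group=%#04x key=%d bytes suites=%#04x padded=%d\n",
		k.Version, k.Group, len(k.PublicKey), k.CipherSuites, k.PaddedLength)
	fmt.Println("valid from", k.NotBefore, "to", k.NotAfter)
}
```

For the record shown above, this reads an X25519 key share (group 0x001d, 32 bytes), the TLS_AES_128_GCM_SHA256 suite (0x1301) and a validity window of six days, which hints at how often the keys were rotated.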
So, in terms of DNS, we should use DoH (preferably with DNSSEC) and add two records.
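The same TXT lookup can be sketched in Go. This is an illustration only: it reuses the resolver URL and the accept header from the curl examples, models just the JSON fields visible in the responses above, and the names esniQueryURL, txtValues and lookupESNI are hypothetical. The main function parses an embedded sample so that the sketch runs without network access.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/url"
	"strings"
)

// dohEndpoint is the JSON resolver used in the curl examples.
const dohEndpoint = "https://dns.google.com/resolve"

// dohResponse models only the fields of the JSON answer used here.
type dohResponse struct {
	Status int `json:"Status"`
	Answer []struct {
		Name string `json:"name"`
		Type int    `json:"type"`
		TTL  int    `json:"TTL"`
		Data string `json:"data"`
	} `json:"Answer"`
}

// esniQueryURL builds the TXT query for the _esni.FQDN template.
func esniQueryURL(fqdn string) string {
	q := url.Values{}
	q.Set("name", "_esni."+fqdn)
	q.Set("type", "TXT")
	return dohEndpoint + "?" + q.Encode()
}

// txtValues extracts the unquoted TXT payloads (type 16) from a response body.
func txtValues(body []byte) ([]string, error) {
	var resp dohResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, err
	}
	if resp.Status != 0 {
		return nil, fmt.Errorf("doh: resolver returned status %d", resp.Status)
	}
	var out []string
	for _, a := range resp.Answer {
		if a.Type == 16 {
			out = append(out, strings.Trim(a.Data, `"`))
		}
	}
	return out, nil
}

// lookupESNI performs the query over HTTPS, like the curl call does.
func lookupESNI(client *http.Client, fqdn string) ([]string, error) {
	req, err := http.NewRequest(http.MethodGet, esniQueryURL(fqdn), nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Accept", "application/dns+json")
	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("doh: unexpected HTTP status %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	return txtValues(body)
}

func main() {
	// Offline sample shaped like the resolver answer; the payload is a placeholder.
	sample := []byte(`{"Status":0,"Answer":[{"name":"_esni.example.com.","type":16,"TTL":1799,"data":"\"QUJD\""}]}`)
	vals, err := txtValues(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(esniQueryURL("example.com"))
	fmt.Println(vals)
}
```

A live lookup would be lookupESNI(http.DefaultClient, "www.cloudflare.com"); whether a record is returned depends on what the zone publishes at the time of the query.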
Client support
If we are talking about browsers, then at the moment
Of course, TLS 1.3 must be used to support ESNI, as ESNI is an extension to TLS 1.3.
To test the ESNI-enabled backend, we implemented a client in Go, but more on that later.
Server side support
Currently, ESNI is not supported by web servers such as nginx or Apache, because they handle TLS through OpenSSL/BoringSSL, where ESNI is not officially supported.
Therefore, we decided to create our own front-end component (ESNI reverse proxy) that would support TLS 1.3 termination with ESNI and proxy HTTP(S) traffic to upstream that does not support ESNI. This allows you to apply the technology in an already established infrastructure, without changing the main components - that is, using current web servers that do not support ESNI.
For clarity, here is a diagram:
I note that the proxy was conceived with the ability to terminate a TLS connection without ESNI, to support clients without ESNI. Also, the communication protocol with upstream can be either HTTP or HTTPS with a TLS version below 1.3 (if upstream does not support 1.3). This scheme gives maximum flexibility.
We borrowed the ESNI implementation for Go from
To generate ESNI keys, we used
We have tested the build with Go 1.13 on Linux (Debian, Alpine) and macOS.
A few words about operational features
ESNI reverse proxy provides metrics in Prometheus format such as rps, upstream latency & response codes, failed/successful TLS handshakes & TLS handshake duration. At first glance, this seemed sufficient to evaluate how the proxy handles traffic.
We also performed load testing before use. Results below:
wrk -t50 -c1000 -d360s 'https://esni-rev-proxy.npw:443' --timeout 15s
Running 6m test @ https://esni-rev-proxy.npw:443
50 threads and 1000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 1.77s 1.21s 7.20s 65.43%
Req/Sec 13.78 8.84 140.00 83.70%
206357 requests in 6.00m, 6.08GB read
Requests/sec: 573.07
Transfer/sec: 17.28MB
The load testing was purely qualitative: its goal was to compare the scheme with the ESNI reverse proxy against the scheme without it. We generated the traffic locally to exclude interference from intermediate components.
So, with ESNI support and proxying to an HTTP upstream, we got about 550 RPS from one instance, with the following average CPU/RAM consumption of the ESNI reverse proxy:
- 80% CPU Usage (4 vCPUs, 4 GB RAM hosts, Linux)
- 130 MB Mem RSS
For comparison, the same nginx upstream without TLS termination (plain HTTP) delivers about 1100 RPS:
wrk -t50 -c1000 -d360s 'http://lb.npw:80' --timeout 15s
Running 6m test @ http://lb.npw:80
50 threads and 1000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 1.11s 2.30s 15.00s 90.94%
Req/Sec 23.25 13.55 282.00 79.25%
393093 requests in 6.00m, 11.35GB read
Socket errors: connect 0, read 0, write 0, timeout 9555
Non-2xx or 3xx responses: 8111
Requests/sec: 1091.62
Transfer/sec: 32.27MB
The presence of timeouts indicates that there is a lack of resources (we used 4 vCPUs, 4 GB RAM hosts, Linux), and in fact the potential RPS is higher (we got numbers up to 2700 RPS on more powerful resources).
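The comparison of the two wrk runs can be written out as a small calculation. The Go sketch below uses only the figures printed in the outputs above; the type and function names are made up for the example, and the timeout count for the ESNI run is taken as zero because its output has no "Socket errors" line.

```go
package main

import "fmt"

// run holds the figures reported by one wrk invocation.
type run struct {
	name     string
	requests int
	rps      float64
	timeouts int
}

// relativeThroughput returns a's request rate as a fraction of b's.
func relativeThroughput(a, b run) float64 { return a.rps / b.rps }

// timeoutShare returns timeouts per completed request.
func timeoutShare(r run) float64 {
	return float64(r.timeouts) / float64(r.requests)
}

func main() {
	esni := run{name: "ESNI reverse proxy", requests: 206357, rps: 573.07}
	plain := run{name: "nginx over HTTP", requests: 393093, rps: 1091.62, timeouts: 9555}

	fmt.Printf("%s reaches %.1f%% of the RPS of %s\n",
		esni.name, 100*relativeThroughput(esni, plain), plain.name)
	fmt.Printf("%s: %.2f%% timeouts per completed request\n",
		plain.name, 100*timeoutShare(plain))
}
```

On these numbers the proxy delivers roughly half the request rate of the plain HTTP upstream on the same hardware, which is consistent with the "about 550 versus about 1100 RPS" summary above.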
In conclusion, I note that ESNI technology looks quite promising. There are still many open issues, such as storing the public ESNI key in DNS and rotating ESNI keys - these issues are actively discussed, and the latest version of the draft (at the time of writing) ESNI has already
Source: habr.com