Why is the Internet still online?

The Internet seems to be a strong, independent, indestructible structure. In theory, the network is resilient enough to survive a nuclear explosion. In reality, a single small router can take the Internet down. That is because the Internet is a pile of contradictions, vulnerabilities, mistakes, and cat videos. BGP, the protocol at the foundation of the Internet, is full of problems. It's amazing it's still breathing. Besides the Internet's own flaws, it is also broken by all and sundry: large ISPs, corporations, states, and DDoS attacks. What to do about it, and how to live with it?

Alexey Uchakin (night_snake), leader of the network engineering team at IQ Option, knows the answer. His main task is keeping the platform available for users. In this transcript of Alexey's talk at Saint HighLoad++ 2019 we'll cover BGP, DDoS attacks, the internet kill switch, provider errors, decentralization, and cases when a small router put the Internet to sleep. At the end - a couple of tips on how to survive it all.

The day the internet broke

I will cite just a few incidents when Internet connectivity broke down. This will be enough for the full picture.

"AS7007 Incident". The first time the Internet broke was in April 1997. There was a bug in the software of one router from the autonomous system 7007. At some point, the router announced its internal routing table to its neighbors and sent half of the network into a black hole.

"Pakistan vs YouTube". In 2008, the brave guys from Pakistan decided to block YouTube. They did it so well that half the world was left without seals.

"Capture of VISA, MasterCard and Symantec prefixes by Rostelecom". In 2017, Rostelecom mistakenly started announcing VISA, MasterCard and Symantec prefixes. As a result, financial traffic was directed through channels controlled by the provider. The leak did not last long, but the financial companies were unhappy.

"Google vs Japan". In August 2017, Google began to announce the prefixes of major Japanese providers NTT and KDDI in some of its uplinks. The traffic was sent to Google as transit, most likely by mistake. Since Google is not a provider and does not allow transit traffic, a significant part of Japan was left without the Internet.

"DV LINK has taken over Google, Apple, Facebook, Microsoft prefixes". In the same 2017, the Russian provider DV LINK, for some reason, began to announce the networks of Google, Apple, Facebook, Microsoft and some other major players.

"eNet from US has taken over AWS Route53 and MyEtherwallet prefixes". In 2018, an Ohio provider or one of its clients announced the Amazon Route53 network and the MyEtherwallet crypto wallet. The attack was successful: even despite the self-signed certificate, a warning about which appeared to the user when entering the MyEtherwallet website, many wallets hijacked and stole part of the cryptocurrency.

There were more than 14,000 similar incidents in 2017 alone! The network is still decentralized, so not everything and not everyone breaks. But incidents happen by the thousands, and all of them involve BGP, the protocol that powers the Internet.

BGP and its problems

The BGP protocol - Border Gateway Protocol - was first described in 1989 by two engineers from IBM and Cisco Systems on three "napkins" - sheets of A4 paper. These "napkins" still lie at Cisco Systems headquarters in San Francisco as a relic of the networked world.

The protocol is based on the interaction of autonomous systems (AS). An autonomous system is simply an ID with IP networks assigned to it in a public registry. A router with this ID can announce those networks to the world. Accordingly, any route on the Internet can be represented as a vector called the AS Path: the list of autonomous system numbers that must be traversed to reach the destination network.

For example, take a network of several autonomous systems. You need to get from AS65001 to AS65003. The path between them is represented by the AS Path in the diagram: it consists of two autonomous systems, 65002 and 65003. For each destination address there is an AS Path vector made up of the autonomous system numbers we need to traverse.
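To make the AS Path idea concrete, here is a toy sketch that finds the shortest AS-level path in a small graph. This is only an illustration: real BGP is a path-vector protocol where each AS advertises the full path it knows, not a graph search run by one node. The ASNs are the private-use numbers from the example.

```python
from collections import deque

def as_path(links, src, dst):
    """Breadth-first search for a shortest AS Path between two
    autonomous systems in an undirected AS-level graph."""
    paths = {src: [src]}
    queue = deque([src])
    while queue:
        asn = queue.popleft()
        if asn == dst:
            return paths[asn]
        for neighbor in links.get(asn, ()):
            if neighbor not in paths:
                paths[neighbor] = paths[asn] + [neighbor]
                queue.append(neighbor)
    return None  # no connectivity between src and dst

# The topology from the example: 65001 -- 65002 -- 65003
links = {65001: [65002], 65002: [65001, 65003], 65003: [65002]}
print(as_path(links, 65001, 65003))  # [65001, 65002, 65003]
```

From AS65001, the destination networks of AS65003 are reachable via the AS Path (65002, 65003), exactly as in the diagram.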

So what are the problems with BGP?

BGP is a trust protocol

The BGP protocol is trust-based: we trust our neighbor by default. This is a feature of many protocols developed at the dawn of the Internet. Let's figure out what "trust" means here.

No neighbor authentication. Formally there is MD5, but MD5 in 2019 - well, you know...

No filtering. BGP has filters and they are documented, but they are not used, or used incorrectly. I'll explain why later.

It's very easy to establish a neighborhood. Setting up a BGP neighborhood on almost any router takes a couple of config lines.

No license is required to manage BGP. Nobody makes you pass exams confirming your qualifications. No one will revoke your right to configure BGP while drunk.

Two main problems

Prefix hijacks - announcing a network that does not belong to you, as in the MyEtherwallet case. We take some prefixes, come to an agreement with a provider or hack it, and announce those networks through it.
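A minimal sketch of how a hijack can be spotted: compare the origin AS of an announcement with the origin registered for that prefix. The registry contents here are hypothetical (documentation prefixes, private-use ASNs), and a real checker would pull route objects from the IRR rather than a hard-coded dict.

```python
# Hypothetical registry mapping prefixes to their legitimate origin AS,
# as one might extract from IRR route objects.
IRR_ROUTES = {
    "198.51.100.0/24": 64500,
    "203.0.113.0/24": 64501,
}

def looks_like_hijack(prefix, origin_as):
    """Flag an announcement whose origin AS does not match the IRR
    record. Returns True on a mismatch, False on a match or when the
    prefix is unknown (we cannot judge unregistered prefixes)."""
    registered = IRR_ROUTES.get(prefix)
    return registered is not None and registered != origin_as

print(looks_like_hijack("198.51.100.0/24", 64666))  # True: wrong origin
print(looks_like_hijack("198.51.100.0/24", 64500))  # False: legitimate
```

The weak spot is visible right in the code: the check is only as good as the registry behind it, which is exactly the IRR problem discussed below.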

Route leaks - a change in the AS Path. These are a little trickier. In the best case, the change just adds latency, because traffic has to travel a longer route or a less capacious link. In the worst case, the Google-and-Japan story repeats itself.

Google itself is not an operator or a transit autonomous system. But when it announced the Japanese operators' networks to its provider, the traffic through Google looked like a higher priority on the AS Path. The traffic went there and was dropped, simply because routing inside Google is more complicated than just filters at the border.

Why don't filters work?

Nobody cares. This is the main reason - nobody cares. The admin of a small provider, or of a company that connected to a provider over BGP, takes a MikroTik, configures BGP on it, and doesn't even know that filters can be configured there.

Configuration errors. Someone debugged something, made a mistake in the mask, put in the wrong network - and there is your error again.

No technical capability. For example, telecom providers have many customers. Done properly, you should automatically update the filters for each client: track when he gets a new network or leases his network to someone else. That is hard to keep track of, and even harder to do by hand. So they simply set relaxed filters or no filters at all.

Exceptions. There are exceptions for beloved big clients, especially at inter-operator interconnects. For example, TransTeleCom and Rostelecom have a lot of networks and an interconnect between them. If that interconnect goes down, it will not be good for anyone, so the filters are relaxed or removed entirely.

Outdated or irrelevant information in the IRR. Filters are built from information recorded in the IRR - the Internet Routing Registry. These are the registries of the regional Internet registrars. The registries often contain outdated or irrelevant information, or both at once.

Who are these registrars?

All Internet addresses belong to the organization IANA - the Internet Assigned Numbers Authority. When you buy an IP network from someone, you are not buying the addresses, only the right to use them. Addresses are an intangible resource, and by common agreement they all belong to IANA.

The system works like this. IANA delegates the management of IP addresses and autonomous system numbers to five regional registrars (RIRs). The RIRs allocate address space and AS numbers to LIRs - local Internet registrars. The LIRs then allocate IP addresses to end users.

The disadvantage of the system is that each regional registrar maintains its registries in its own way. Each has its own views on what information the registries should contain and who should or should not verify it. The result is the mess we have now.

How else can you deal with these problems?

IRR - mediocre quality. With IRR, it’s clear that everything is bad there.

BGP communities. A community is an attribute described in the protocol. We can, for example, attach a special community to our announcement so that the neighbor does not pass our networks on to his own neighbors. On a P2P link we exchange only our own networks; to keep a route from accidentally leaking into other networks, we attach a community.

Communities are not transitive. It is always a contract between two parties, and that is their disadvantage. We cannot attach just any community - apart from a few accepted by default by everyone - and be sure it will be accepted and correctly interpreted everywhere. In the best case, if you have an agreement with your uplink, he will understand what you want from him. But his neighbor may not understand, or an operator along the way will simply strip your tag, and you will not achieve what you wanted.
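The export-control idea can be sketched in a few lines. The example below uses the well-known NO_EXPORT community (65535:65281 from RFC 1997), which routers honor by default; the route table itself is made up (documentation prefixes, private-use ASNs), and real routers apply this logic per neighbor, not as a Python list comprehension.

```python
NO_EXPORT = (65535, 65281)  # well-known community, RFC 1997

def routes_to_advertise(routes):
    """Drop routes tagged NO_EXPORT before advertising the rest to an
    eBGP neighbor. Each route is (prefix, set of (asn, value)
    communities). A sketch, not a full BGP implementation."""
    return [prefix for prefix, communities in routes
            if NO_EXPORT not in communities]

routes = [
    ("198.51.100.0/24", {NO_EXPORT}),    # must stay inside our AS
    ("203.0.113.0/24", {(64500, 100)}),  # ordinary tagged route
]
print(routes_to_advertise(routes))  # ['203.0.113.0/24']
```

NO_EXPORT works everywhere precisely because it is one of the few communities "accepted by default by everyone"; any custom tag like the hypothetical (64500, 100) above means only what two neighbors have agreed it means.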

RPKI + ROA solves only a small part of the problems. RPKI - Resource Public Key Infrastructure - is a special framework for signing routing information. It's a good way to get LIRs and their clients to maintain an up-to-date address space database. But it has one problem.

RPKI is also a hierarchical public key system. IANA has a key from which the RIR keys are derived, and from those, the LIR keys, with which LIRs sign their address space using ROAs - Route Origin Authorizations:

- I assure you that this prefix will be announced on behalf of this autonomous system.

Besides ROA there are other objects, but more about them some other time. It seems good and useful, but it does not protect us from leaks at all, and it does not solve every problem with prefix hijacking. So players are in no hurry to implement it. Although there are already assurances from large players like AT&T and major IXs that prefixes with an invalid ROA record will be dropped.

Perhaps they will, but for now we have a huge number of prefixes that are not signed at all. On the one hand, it is unclear whether they are announced validly. On the other hand, we cannot drop them by default, because we cannot be sure that would be correct.
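The three-way outcome described above - signed and correct, signed but wrong, or not signed at all - is exactly the RFC 6811 origin validation states. Here is a sketch of that check; the ROA list is hypothetical (a documentation prefix and private-use ASNs), and real validators fetch signed ROAs from the RPKI repositories rather than a Python list.

```python
import ipaddress

def rpki_validate(prefix, origin_as, roas):
    """RFC 6811-style route origin validation sketch.
    roas: list of (roa_prefix, max_length, asn).
    Returns 'valid', 'invalid', or 'not-found'."""
    net = ipaddress.ip_network(prefix)
    covered = False
    for roa_prefix, max_len, asn in roas:
        roa_net = ipaddress.ip_network(roa_prefix)
        if net.subnet_of(roa_net):
            covered = True  # some ROA covers this prefix
            if net.prefixlen <= max_len and asn == origin_as:
                return "valid"
    # covered by a ROA but no match: invalid; no covering ROA: unknown
    return "invalid" if covered else "not-found"

roas = [("198.51.100.0/22", 24, 64500)]
print(rpki_validate("198.51.100.0/24", 64500, roas))  # valid
print(rpki_validate("198.51.100.0/24", 64666, roas))  # invalid
print(rpki_validate("203.0.113.0/24", 64500, roas))   # not-found
```

The "not-found" branch is the whole dilemma from the paragraph above: unsigned prefixes can be neither trusted nor safely dropped.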

What else is there?

BGPsec. It's a cool thing the academics came up with for an ideal world of pink ponies. They said:

- We have RPKI + ROA - a mechanism for verifying the signature of the address space. Let's create a separate BGP attribute and call it BGPSec Path. Each router will sign the announcements that it announces to its neighbors with its signature. This way we will get a trusted path from the chain of signed announcements and will be able to check it.

Good in theory, but lots of problems in practice. BGPSec breaks many existing BGP mechanisms for next-hop selection and incoming/outgoing traffic control directly on the router. BGPSec does not work until 95% of the entire market implements it, which is itself a utopia.

BGPSec has huge performance issues. On current hardware, announcement verification runs at about 50 prefixes per second. At that rate, the current Internet table of 700,000 prefixes would take about 4 hours to load, during which it would change several more times.

BGP Open Policy (role-based BGP). A fresh proposal based on the Gao-Rexford model, named after two scientists who research BGP.

The Gao-Rexford model is as follows. Simplifying, in BGP there are a small number of interaction types:

  • Provider-Customer;
  • P2P;
  • internal communication, let's say iBGP.

Based on the router's role, it is already possible to apply some default import/export policies. The administrator does not need to configure prefix lists: based on the role, which the routers agree on and which can be set explicitly, we already get default filters. This is now a draft under discussion in the IETF. I hope we will soon see it as an RFC and in hardware implementations.
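The property these roles enforce is that every AS path is "valley-free" in the Gao-Rexford sense: it climbs through customer-to-provider links, crosses at most one peer link, then only descends through provider-to-customer links. A sketch of that check (the relationships and ASNs are invented for illustration):

```python
def valley_free(path, rel):
    """Check that an AS path obeys the Gao-Rexford model: uphill
    (customer->provider) links, at most one peer (p2p) link, then only
    downhill (provider->customer) links. rel maps an ordered AS pair
    to 'c2p', 'p2p' or 'p2c'."""
    phase = 0  # 0 = still climbing, 1 = crossed a peer link, 2 = descending
    for a, b in zip(path, path[1:]):
        r = rel[(a, b)]
        if r == "c2p":
            if phase > 0:
                return False  # climbing again after the summit: a leak
        elif r == "p2p":
            if phase > 0:
                return False  # second peer link or peer after descent
            phase = 1
        else:  # 'p2c'
            phase = 2
    return True

rel = {(1, 2): "c2p", (2, 3): "p2p", (3, 4): "p2c", (4, 3): "c2p"}
print(valley_free([1, 2, 3, 4], rel))     # True: up, across, down
print(valley_free([1, 2, 3, 4, 3], rel))  # False: a route leak
```

The Google-and-Japan incident is exactly a valley: transit traffic descended into Google (a non-transit AS) and tried to climb back out. Role-based defaults reject such paths without any hand-written prefix lists.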

Major ISPs

Consider the example of the provider CenturyLink, the third largest ISP in the US, serving 37 states and operating 15 data centers.

In December 2018, CenturyLink was down in the US for 50 hours. During the incident there were problems with ATMs in two states, and 911 was out of service for several hours in five states. The lottery in Idaho was ruined. The US Federal Communications Commission is currently investigating the incident.

The cause of the tragedy was a single network card in one data center. The card failed and sent out malformed packets, and all 15 of the provider's data centers went down.

The "too big to fail" idea didn't work for this provider. In fact, it doesn't work at all: you can take any major player and topple it with something trivial. In the US, connectivity is still in good shape. CenturyLink customers who had a backup switched to it en masse - and then the alternative operators complained that their links were overloaded.

If a hypothetical Kazakhtelecom goes down, the whole country is left without the Internet.

Corporations

Surely Google, Amazon, Facebook and the other corporations hold the Internet up? No, they break it too.

In 2017, at the ENOG13 conference in St. Petersburg, Geoff Huston of APNIC presented the report "The Death of Transit". It says we are used to interactions, money flows, and traffic on the Internet being vertical: small providers pay larger ones for connectivity, and those in turn pay for connectivity to global transit providers.

Now we have this vertically oriented structure. Everything would be fine, but the world is changing: the major players are laying their own transoceanic cables and building their own backbones.

News about CDN cable.

In 2018, TeleGeography released a study showing that more than half of Internet traffic is no longer the Internet but the CDN backbones of the large players. This traffic is related to the Internet, but it is no longer the network we were talking about.

The Internet is breaking up into a large set of loosely connected networks.

Microsoft has its own network, Google has its own, and they overlap very little. Traffic originating somewhere in the USA travels through Microsoft's channels across the ocean to Europe, to a CDN somewhere; then through the CDN or an IX it connects to your provider and reaches your router.

Decentralization is disappearing.

The resilience that would help the Internet survive a nuclear explosion is being lost. Users and traffic concentrate in a few places. If the hypothetical Google Cloud goes down, there will be many victims at once. We partly felt this when Roskomnadzor blocked AWS. And the CenturyLink example shows that little things are enough.

Previously, not everything and not everyone broke. In the future, we may find that by hitting one major player we can break many things, in many places, for many people.

States

States are next in line, and this is what usually happens to them.

Here our Roskomnadzor is not even a pioneer. A similar practice of Internet shutdowns exists in Iran, India, and Pakistan. In England there is a bill on the possibility of shutting down the Internet.

Any large state wants a switch to turn off the Internet, either entirely or in parts: Twitter, Telegram, Facebook. It's not that they don't understand they will never succeed - they just really want it. The switch is usually thrown for political purposes: to eliminate political competitors, because elections are around the corner, or because Russian hackers have broken something again.

DDoS attacks

I won't take the bread away from my comrades at Qrator Labs - they do this much better than I do. They publish an annual report on Internet stability. This is what they wrote in their 2018 report.

The average duration of DDoS attacks is dropping to 2.5 hours. Attackers are starting to count money too: if the resource does not go down immediately, they quickly leave it alone.

Attack intensity is growing. In 2018, we saw 1.7 Tb/s on the Akamai network, and that is not the limit.

New attack vectors are emerging and old ones are intensifying. New protocols subject to amplification are appearing, along with new attacks on existing protocols, especially TLS and the like.

Most of the traffic comes from mobile devices. Internet traffic is shifting to mobile clients, and both attackers and defenders need to know how to work with that.

There is no such thing as invincible. This is the main idea: there is no universal protection that will reliably withstand any DDoS.

The only system that can't be taken down is one that isn't connected to the Internet.

I hope I scared you enough. Let's now think about what to do with it.

What to do?!

If you have free time, the desire, and a knowledge of English, participate in the working groups: IETF, RIPE WG. These are open mailing lists - subscribe, take part in discussions, come to conferences. If you have LIR status, you can vote, for example, in RIPE on various initiatives.

For mere mortals, it's monitoring - to know what's broken.

Monitoring: what to check?

Regular ping - and not just a binary up/down check. Record the RTT in history so you can look for anomalies later.
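Keeping RTT history pays off because anomalies stand out against the recorded baseline. A minimal sketch of such a check, assuming RTT samples have already been collected into a list (the collection itself, via ping, is omitted); the window size and 3-sigma threshold are arbitrary choices:

```python
from statistics import mean, stdev

def rtt_anomalies(history, window=20, threshold=3.0):
    """Flag RTT samples that deviate from the recent baseline by more
    than `threshold` standard deviations. `history` is a list of RTTs
    in ms, oldest first; returns the indices of anomalous samples."""
    anomalies = []
    for i in range(window, len(history)):
        base = history[i - window:i]
        mu, sigma = mean(base), stdev(base)
        if sigma and abs(history[i] - mu) > threshold * sigma:
            anomalies.append(i)
    return anomalies

# A stable ~20 ms path with one 200 ms spike - say, a reroute after a leak
history = [20.0 + (i % 3) * 0.5 for i in range(30)]
history[25] = 200.0
print(rtt_anomalies(history))  # [25]
```

A binary check would report this path as "up" the whole time; the RTT history is what reveals that traffic suddenly took a ten-times-longer route.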

Traceroute. A utility for determining data paths in TCP/IP networks. It helps detect anomalies and blockages.

HTTP checks of custom URLs and TLS certificates will help detect blocking, or DNS spoofing for an attack, which is almost the same thing. Blocking is often done via DNS spoofing, redirecting traffic to a stub page.

If possible, check how your origin resolves from the different places your clients are in, if you have an application. That way you will detect DNS interception anomalies, which providers are sometimes guilty of.
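The comparison step of such a check can be sketched like this: gather the A-record answers seen from different vantage points (the gathering itself - in-app resolution or probes - is omitted) and flag any vantage point whose answers contain none of the addresses you actually serve. All names and addresses here are hypothetical.

```python
def dns_interception_suspects(observations, expected):
    """Compare DNS answers seen from different vantage points against
    the set of addresses we actually serve. Returns the vantage points
    whose answers contain none of our addresses - a hint of DNS
    spoofing or a blocking stub page."""
    return [vantage for vantage, answers in observations.items()
            if not set(answers) & expected]

# Addresses our origin really has (hypothetical)
expected = {"198.51.100.10", "198.51.100.11"}
observations = {
    "moscow-probe": ["198.51.100.10"],   # resolves correctly
    "spb-probe": ["203.0.113.66"],       # a stub page: likely interception
}
print(dns_interception_suspects(observations, expected))  # ['spb-probe']
```

Note the intersection rather than equality: resolvers legitimately return subsets of your address pool, so only a complete mismatch is suspicious.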

Monitoring: where to check?

There is no universal answer. Check from where your users are. If your users are in Russia, check from Russia - but don't limit yourself to it. If your users live in different regions, check from those regions. Best of all, check from all over the world.

Monitoring: how to check?

I came up with three ways. If you know more - write in the comments.

  • RIPE Atlas.
  • Commercial monitoring.
  • Own network of virtual machines.

Let's talk about each of them.

RIPE Atlas is a small box. For those who know the domestic "Inspector" - it's the same box, just with a different sticker.

RIPE Atlas is a free program. You register, receive a probe by mail, and plug it into your network. For letting others run tests from your probe you earn credits, and with those credits you can run measurements of your own. You can test in different ways: ping, traceroute, certificate checks. Coverage is quite large, with many nodes. But there are nuances.

The credit system does not allow building production solutions on it. There will not be enough credits for continuous research or commercial monitoring. The credits suffice for a short study or a one-off check; the daily allowance from one probe is eaten up by 1-2 checks.

Coverage is uneven. Since the program is free in both directions, coverage is good in Europe, the European part of Russia, and some other regions. But if you need Indonesia or New Zealand, everything is much worse - you may not find even 50 probes per country.

You can't run HTTP checks from a probe. This is due to technical nuances. They promise to fix it in a new version, but for now HTTP cannot be checked - only the certificate. HTTP checks of a sort can only be run against a special RIPE Atlas device called an Anchor.

The second way is commercial monitoring. Everything is supposed to be fine there - you're paying money, after all. You're promised several dozen or hundreds of monitoring points around the world and beautiful out-of-the-box dashboards. But, again, there are problems.

It's paid, sometimes heavily. Ping monitoring, checks from around the world, and a pile of HTTP checks can cost several thousand dollars a year. If finances allow and you like this solution, go ahead.

Coverage may be lacking in the region of interest. The same ping can often only be specified down to an abstract part of the world - Asia, Europe, North America. Few monitoring systems let you drill a sample down to a specific country or region.

Weak support for custom tests. If you need something custom, and not just a "curl" of a URL, that's a problem too.

The third way is your own monitoring. The classic: "Let's write our own!"

Your own monitoring turns into the development of a software product - a distributed one at that. You look for an infrastructure provider, figure out how to deploy and monitor it (monitoring needs to be monitored, right?), and it all needs ongoing support. Think ten times before going this way. It may be easier to pay someone to do it for you.

Monitoring BGP anomalies and DDoS attacks

Here, things are easier with the resources available. BGP anomalies are detected with specialized services such as Qrator.Radar and BGPmon. They receive a full-view table from multiple operators; based on what they see from different operators, they can detect anomalies, look for amplifiers, and so on. Registration is usually free: enter your autonomous system number, subscribe to email notifications, and the service will alert you about your problems.

Monitoring DDoS attacks is also simple. As a rule, it is based on NetFlow and logs. There are specialized systems like FastNetMon and modules for Splunk. As a last resort, there is your DDoS protection provider: it can also receive your NetFlow and, based on it, notify you of attacks in your direction.

Conclusions

Have no illusions - the Internet is bound to break. Not everything and not everyone will break, but 14,000 incidents in 2017 hint that incidents there will be.

Your job is to spot problems as early as possible - at the very least, no later than your users do. Beyond that, always keep a plan B in mind: a strategy for what you will do when everything goes down - backup operators, data centers, CDN. The plan is a separate checklist against which you verify that everything works. It should work without involving network engineers, because there are few of them and they want to sleep.

That's all. I wish you high availability and green monitoring.

Sun, high load, and a high concentration of developers are expected in Novosibirsk next week at HighLoad++ Siberia 2019. A front of talks on monitoring, availability and testing, security and management is forecast for Siberia, with precipitation in the form of notes, networking, photos, and social media posts. We recommend clearing your schedule on June 24 and 25 and booking tickets. See you in Siberia!

Source: habr.com
