Flow protocols as a tool for monitoring the security of an internal network

When it comes to monitoring the security of an internal corporate or departmental network, many people associate it with controlling information leaks and deploying DLP solutions. And if you refine the question and ask how attacks on the internal network are detected, the answer will usually be a mention of intrusion detection systems (IDS). Yet what was the only option 10-20 years ago is becoming an anachronism today. There is a more effective, and in some places the only possible, option for monitoring the internal network: flow protocols, originally designed for troubleshooting network problems but over time transformed into a very interesting security tool. In this article we will talk about what flow protocols are, which of them are better at helping detect network attacks, where flow monitoring is best implemented, what to watch out for when deploying such a scheme, and even how to get it all running on domestic equipment.

I will not dwell on the question "Why monitor the security of the internal infrastructure?" The answer is fairly obvious. But if you would nevertheless like to convince yourself once again that you cannot do without it today, watch the short video about 17 ways to get into a corporate network protected by a firewall. So let us assume we agree that internal monitoring is necessary, and all that remains is to understand how it can be organized.

I would single out three key data sources for infrastructure monitoring at the network level:

  • "raw" traffic that we capture and feed into analysis systems,
  • events from network devices through which traffic passes,
  • traffic information received via one of the flow protocols.

Capturing raw traffic is the most popular option among security people, because it appeared first historically. Conventional network intrusion detection systems (the very first commercial IDS was NetRanger from the Wheel Group, bought by Cisco in 1998) did exactly that: they captured packets (and later sessions) and searched them for certain signatures ("decision rules" in FSTEC terminology) that signaled attacks. Of course, you can analyze raw traffic not only with an IDS but also with other tools (for example, Wireshark, tcpdump, or the NBAR2 functionality in Cisco IOS), but those usually lack the knowledge base that distinguishes an information security tool from an ordinary IT tool.

So, intrusion detection systems: the oldest and most popular method of detecting network attacks, which does a good job at the perimeter (whatever kind - corporate, data center, segment, etc.) but fails in modern switched and software-defined networks. In a network built on ordinary switches, the infrastructure of intrusion detection sensors grows too large: you would have to put a sensor on every link to every host you want to monitor for attacks. Any vendor will, of course, be happy to sell you hundreds and thousands of sensors, but I suspect your budget cannot withstand such expenses. I can say that even at Cisco (and we are the developers of NGIPS) we could not do this, although price would seem to be no obstacle for us, since it is our own product. In addition, the question arises of how to connect the sensor in this scheme. Inline, in the path of the traffic? What if the sensor itself fails? Require a bypass module in the sensor? Use splitters (taps)? All this increases the cost of the solution and makes it unaffordable for a company of any size.

You can try to "hang" the sensor on a SPAN/RSPAN/ERSPAN port and direct traffic to it from the necessary switch ports. This option partially removes the problem described in the previous paragraph but creates another one: the SPAN port cannot receive absolutely all of the traffic directed to it - it will not have enough bandwidth. You will have to sacrifice something: either leave some of the nodes without monitoring (in which case you first need to prioritize them), or send not all traffic from a node but only traffic of a certain type. Either way, we may miss some attacks. In addition, the SPAN port may be needed for other purposes. As a result, we will have to review the existing network topology and possibly adjust it in order to cover the network as fully as possible with the number of sensors available (and coordinate this with IT).

What if your network uses asymmetric routing? What if you have implemented or plan to implement SDN? What if you need to monitor virtualized machines or containers whose traffic never reaches a physical switch at all? Traditional IDS vendors do not like these questions because they do not know how to answer them. Perhaps they will persuade you that all these fashionable technologies are hype and you do not need them. Perhaps they will talk about the need to start small. Or perhaps they will say that you should put a powerful "thresher" in the center of the network and direct all traffic to it with load balancers. Whatever option you are offered, you need to understand clearly how well it suits you, and only then make a decision on the approach to monitoring the information security of the network infrastructure. Returning to packet capture, I want to say that this method remains very popular and important, but its main purpose is border control: the boundary between your organization and the Internet, the boundary between the data center and the rest of the network, the boundary between the process control system and the corporate segment. In those places, classic IDS/IPS still have the right to exist and do their job well.

Let's move on to the second option. The analysis of events coming from network devices can also be used for intrusion detection, but not as the main mechanism, since it detects only a small class of intrusions. It is also inherently reactive: the attack must first occur, then be registered by a network device, which will in one way or another signal an information security problem. There are several such mechanisms: syslog, RMON, or SNMP. The last two are used for security monitoring only when we need to detect a DoS attack on the network equipment itself, since RMON and SNMP let you monitor, for example, the load on the device's CPU or its interfaces. This is one of the "cheapest" options (everyone has syslog or SNMP), but also the least effective way to monitor the information security of the internal infrastructure: many attacks are simply invisible to it. Of course, it should not be neglected; syslog analysis, for instance, helps you spot changes to the configuration of the device itself, a sign of its compromise, in a timely manner, but it is not well suited for detecting attacks against the network as a whole.
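
To make the syslog point a little more concrete, below is a minimal sketch (Python, standard library only) of a UDP syslog listener that flags configuration-change messages such as Cisco's %SYS-5-CONFIG_I. The port, the choice of mnemonics and the plain "print an alert" reaction are assumptions for illustration; in practice such messages would be parsed and correlated by a SIEM.

import re
import socket

# Minimal syslog watcher: flags device configuration changes reported via syslog.
# Assumes network devices send syslog to this host; port 5514 is used here
# because binding to the standard 514/udp usually requires elevated privileges.
LISTEN_ADDR = ("0.0.0.0", 5514)
CONFIG_CHANGE = re.compile(r"%SYS-5-CONFIG_I|%PARSER-5-CFGLOG_LOGGEDCMD")

def watch_syslog() -> None:
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(LISTEN_ADDR)
    while True:
        data, (src_ip, _) = sock.recvfrom(4096)
        message = data.decode("utf-8", errors="replace")
        if CONFIG_CHANGE.search(message):
            # In a real deployment this event would go to a SIEM for correlation.
            print(f"[ALERT] configuration change reported by {src_ip}: {message.strip()}")

if __name__ == "__main__":
    watch_syslog()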

The third option is to analyze information about traffic passing through a device that supports one of several flow protocols. In this case, regardless of the protocol, the flow infrastructure necessarily consists of three components:

  • Flow generation and export. This role is usually assigned to a router, switch or other network device which, by passing network traffic through itself, extracts key parameters from it that are then transmitted to the collection module. For example, Cisco's Netflow protocol is supported not only on routers and switches, including virtual and industrial ones, but also on wireless controllers, firewalls, and even servers.
  • Flow collection. Since a modern network usually has more than one network device, the problem of collecting and consolidating flows arises; it is solved with the help of so-called collectors, which process the received flows and then pass them on for analysis.
  • Flow analysis. The analyzer takes on the main intellectual task and, by applying various algorithms to the flows, draws certain conclusions. For example, as part of an IT function, such an analyzer can identify network bottlenecks or analyze the traffic load profile for further network optimization. For information security, such an analyzer can detect data leaks, the spread of malicious code, or DoS attacks.

Do not think that such a three-tier architecture is too complicated: all the other options (with the possible exception of network monitoring systems that work with SNMP and RMON) follow it as well. We have a generator of data for analysis, which is a network device or a standalone sensor; we have an alarm collection system; and we have a management system for the entire monitoring infrastructure. The last two components can be combined in a single node, but in more or less large networks they are usually spread across at least two devices to ensure scalability and reliability.
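
To make the collector role less abstract, here is a minimal sketch of a NetFlow v9 "collector" in Python: it listens on the commonly used UDP port 2055 and decodes only the 20-byte packet header defined in RFC 3954. A real collector would, of course, also parse the template and data FlowSets that follow the header and consolidate the records for the analyzer; that part is omitted here.

import socket
import struct

# NetFlow v9 packet header (RFC 3954): version, record count,
# sysUptime in ms, UNIX seconds, sequence number, source ID.
HEADER_FORMAT = "!HHIIII"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 20 bytes

def run_collector(port: int = 2055) -> None:
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", port))
    while True:
        packet, (exporter_ip, _) = sock.recvfrom(65535)
        if len(packet) < HEADER_SIZE:
            continue  # too short to be a NetFlow v9 datagram
        version, count, uptime, unix_secs, sequence, source_id = struct.unpack(
            HEADER_FORMAT, packet[:HEADER_SIZE]
        )
        if version != 9:
            continue
        print(f"exporter={exporter_ip} records={count} seq={sequence} source_id={source_id}")
        # Next step (omitted): parse template/data FlowSets and pass records on.

if __name__ == "__main__":
    run_collector()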

Unlike packet analysis, which studies the header and payload of each packet and the sessions they make up, flow analysis relies on collecting metadata about network traffic. When, how much, from where and to where, how: these are the questions answered by analyzing network telemetry obtained via various flow protocols. Initially they were used to analyze statistics and troubleshoot IT problems in the network, but as analytical engines developed, it became possible to apply the same telemetry for security purposes. It is worth reiterating that flow analysis does not supplant or replace packet capture; each of these methods has its own scope. But in the context of this article, it is flow analysis that is best suited to monitoring the internal infrastructure. You have network devices (whether they operate in a software-defined paradigm or according to static rules) that an attack cannot bypass. A classic IDS sensor can be bypassed; a network device that supports a flow protocol cannot. That is the advantage of this method.
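
To show what "metadata about network traffic" boils down to in practice, here is an illustrative (not vendor-specific) flow record: a conversation key plus counters, with no packet payloads at all. The field set and the sample values are made up for the example.

from dataclasses import dataclass

@dataclass
class FlowRecord:
    """Illustrative flow record: who talked to whom, when and how much."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int      # e.g. 6 = TCP, 17 = UDP
    packets: int
    bytes: int
    start_time: float  # UNIX timestamps of the first and last packet seen
    end_time: float
    tcp_flags: int = 0

# Example: an SSH session summarized as a single record (values are fictitious)
flow = FlowRecord("10.1.1.5", "10.2.3.4", 52144, 22, 6,
                  packets=184, bytes=21944,
                  start_time=1565000000.0, end_time=1565000042.0)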

On the other hand, if you need an evidence base for law enforcement or your own incident investigation team, you cannot do without packet capture: network telemetry is not a copy of the traffic that can be used to collect evidence; it is needed for rapid detection and decision-making in information security. On the other hand, with telemetry analysis you do not have to "record" all network traffic (if anything, Cisco is in the data center business too :-), but only the traffic involved in an attack. In this respect, telemetry analysis tools complement traditional packet capture mechanisms well, issuing commands for selective capture and storage. Otherwise you would need a colossal storage infrastructure.

Imagine a network running at 250 Mbps. If you want to record all of that traffic, you will need 31 MB of storage for one second of transmission, 1.8 GB for one minute, 108 GB for one hour, and 2.6 TB for one day. To store a day's worth of data from a network with a bandwidth of 10 Gbps, you will need 108 TB of storage. And some regulators require security data to be stored for years... On-demand recording, which flow analysis helps you implement, reduces these figures by orders of magnitude: the network telemetry for the same daily traffic occupies only a tiny fraction of the full capture volume and can even be written to an ordinary flash drive.
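
The arithmetic above is easy to reproduce yourself; the sketch below recomputes the full-capture storage estimates for an arbitrary link speed (the figures differ slightly from the ones above depending on whether you count in decimal GB or binary GiB).

def full_capture_storage(link_mbps: float) -> dict:
    """Storage needed to record all traffic on a fully loaded link."""
    bytes_per_second = link_mbps * 1_000_000 / 8
    return {
        "per_second_MB": bytes_per_second / 1e6,
        "per_minute_GB": bytes_per_second * 60 / 1e9,
        "per_hour_GB": bytes_per_second * 3600 / 1e9,
        "per_day_TB": bytes_per_second * 86400 / 1e12,
    }

print(full_capture_storage(250))     # ~31 MB/s, ~1.9 GB/min, ~112 GB/h, ~2.7 TB/day
print(full_capture_storage(10_000))  # 10 Gbps: ~108 TB/day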

While the way raw network data is captured for analysis tools is nearly the same from vendor to vendor, with flow analysis the situation is different. There are several variants of flow protocols, and you need to know their differences in the context of security. The most popular is the Netflow protocol developed by Cisco. It has several versions that differ in capability and in the amount of information recorded about traffic. The current version is the ninth (Netflow v9), on the basis of which the industry standard Netflow v10, also known as IPFIX, was developed. Today most network vendors support Netflow or IPFIX in their equipment. There are also various other flow protocols (sFlow, jFlow, cFlow, rFlow, NetStream, etc.), of which sFlow is the most popular: it is the one most often supported by domestic manufacturers of network equipment because it is easy to implement. What are the key differences between Netflow, as the de facto standard, and sFlow? I would highlight a few. First, Netflow has user-configurable fields, as opposed to the fixed fields of sFlow. Second, and this is the most important thing in our case, sFlow collects so-called sampled telemetry, unlike the unsampled telemetry of Netflow and IPFIX. What is the difference between them?

Imagine that you decide to read the book "Security Operations Center: Building, Operating, and Maintaining your SOC" by my colleagues Gary McIntyre, Joseph Muniz and Nadhem AlFardan (part of the book can be downloaded from the link). You have three options for achieving your goal: read the whole book; skim it, stopping at every 10th or 20th page; or try to find a retelling of the key concepts in a blog or a service like SmartReading. So, unsampled telemetry is reading every "page" of network traffic, that is, analyzing the metadata of every packet. Sampled telemetry is a selective study of traffic in the hope that the selected samples will contain what you need. Depending on the channel speed, sampled telemetry will send every 64th, 200th, 500th, 1000th, 2000th, or even 10000th packet for analysis.
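
The practical consequence of sampling is easy to quantify. Assuming the simplest model, where each packet is independently sampled with probability 1/N, the chance that a short flow leaves no trace at all in the telemetry is roughly (1 - 1/N) raised to the number of packets in the flow:

def miss_probability(sampling_rate_n: int, packets_in_flow: int) -> float:
    """Probability that none of a flow's packets are sampled under 1-in-N sampling."""
    return (1 - 1 / sampling_rate_n) ** packets_in_flow

# A three-packet attack versus different sampling rates
for n in (64, 500, 2000):
    print(f"1-in-{n}: P(flow invisible) = {miss_probability(n, 3):.3f}")
# 1-in-64: ~0.954, 1-in-500: ~0.994, 1-in-2000: ~0.999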

In the context of information security monitoring, this means that sampled telemetry is well suited to detecting DDoS attacks, scanning, and the spread of malicious code, but it can miss atomic or few-packet attacks that do not make it into the sample sent for analysis. Unsampled telemetry has no such shortcoming, and the range of detectable attacks is much wider: network telemetry analysis tools can reveal scanning, the spread of malicious code, data leaks, DoS attacks, cryptomining, interaction with blacklisted addresses, and more.

Of course, a random open-source Netflow analyzer will not let you do this, since its main task is to collect telemetry and run basic IT-oriented analysis on it. To identify information security threats from flow data, the analyzer needs to be equipped with various engines and algorithms that detect cybersecurity problems based on standard or custom Netflow fields, enrich the standard data with external data from various Threat Intelligence sources, and so on.
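
As a small illustration of such enrichment, here is a minimal sketch that joins exported flow records (read from a CSV file, for example one produced from rwcut output or by a collector) against a plain-text Threat Intelligence list of malicious IP addresses. The file names and the sIP/dIP column names are assumptions for the example.

import csv

def load_blacklist(path: str) -> set:
    """One IP address per line; lines starting with '#' are comments."""
    with open(path) as f:
        return {line.strip() for line in f if line.strip() and not line.startswith("#")}

def flag_blacklisted_flows(flows_csv: str, blacklist_path: str):
    """Yield flow rows whose source or destination IP appears on the blacklist."""
    bad_ips = load_blacklist(blacklist_path)
    with open(flows_csv) as f:
        for row in csv.DictReader(f):
            if row["sIP"] in bad_ips or row["dIP"] in bad_ips:
                yield row

for hit in flag_blacklisted_flows("flows.csv", "ti_blacklist.txt"):
    print("suspicious flow:", hit["sIP"], "->", hit["dIP"])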

Therefore, if you have a choice, settle on Netflow or IPFIX. But even if your equipment only works with sFlow, as is the case with domestic manufacturers, you can still benefit from it in a security context.

In the summer of 2019, I analyzed the capabilities of Russian network hardware manufacturers, and all of them except NSG, Polygon and Craftway announced support for sFlow (at a minimum Zelax, Natex, Eltex, QTech, Rusteletech).

The next question is where to implement flow support for security purposes. Actually, the question is not put quite correctly: modern equipment almost always supports flow protocols. So I would rephrase it: where is it most effective to collect telemetry from a security point of view? The answer is fairly obvious: at the access layer, where you see 100% of all traffic, where you have detailed information about hosts (MAC, VLAN, interface ID), and where you can even track P2P traffic between hosts, which is critical for detecting scanning and the spread of malicious code. At the core layer you may simply not see some of the traffic, and at the perimeter you will see at best a quarter of your network traffic. Moreover, if there are extraneous devices on your network that let attackers "enter and exit" bypassing the perimeter, analyzing telemetry from the perimeter will give you nothing. So for maximum coverage it is recommended to enable telemetry collection at the access layer. At the same time, even if we are talking about virtualization or containers, modern virtual switches also often support flow, which allows you to control traffic there as well.

But since I have raised the topic, I need to answer the question: what if the equipment, physical or virtual, does not support flow protocols at all? Or enabling them is forbidden (for example, in industrial segments, for reliability reasons)? Or enabling them causes high CPU load (this happens on older hardware)? To solve this problem there are specialized virtual sensors (flow sensors), which are essentially ordinary splitters that pass traffic through themselves and translate it into flow for the collection module. True, in this case we get the whole heap of problems discussed above in relation to packet capture tools. In other words, you need to understand not only the advantages of flow analysis technology but also its limitations.

One more point is important to remember when talking about flow analysis tools. If we apply the EPS (events per second) metric to conventional security event generators, that indicator is not applicable to telemetry analysis; it is replaced by FPS (flows per second). As with EPS, it cannot be calculated precisely in advance, but it is possible to estimate the approximate number of flows a particular device generates depending on its role. On the Internet you can find tables with approximate values for different types of enterprise devices and conditions, which will let you estimate what licenses you need for the analysis tools and what their architecture should be. The point is that an IDS sensor is limited by the bandwidth it can "pull", and a flow collector has its own limits that must be understood. Therefore, in large, geographically distributed networks there are usually several collectors. When I described how the network is monitored inside Cisco, I already gave the number of our collectors: there are 21 of them, and that is for a network spread across five continents with about half a million active devices.
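
For rough sizing, a back-of-the-envelope estimate is usually enough. In the sketch below, every per-device flow rate and the collector capacity are purely hypothetical placeholders; substitute the values from vendor tables or your own measurements.

# Back-of-the-envelope collector sizing; all rates below are HYPOTHETICAL.
ASSUMED_FLOWS_PER_SECOND = {
    "access_switch": 500,
    "distribution_switch": 2_000,
    "internet_router": 5_000,
}

inventory = {"access_switch": 120, "distribution_switch": 8, "internet_router": 2}

total_fps = sum(ASSUMED_FLOWS_PER_SECOND[kind] * count
                for kind, count in inventory.items())
collector_capacity_fps = 100_000  # hypothetical capacity of a single collector

collectors_needed = -(-int(total_fps * 1.3) // collector_capacity_fps)  # ceiling with ~30% headroom
print(f"estimated load: {total_fps} flows/second, collectors needed: {collectors_needed}")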

As a Netflow monitoring system we use our own solution, Cisco Stealthwatch, which is specifically focused on solving security problems. It has many built-in engines for detecting anomalous, suspicious, and outright malicious activity, which allows it to detect a wide range of threats: from cryptomining to information leaks, from the spread of malicious code to fraud. Like most flow analyzers, Stealthwatch is built on a three-tier scheme (generator - collector - analyzer), but it adds a number of interesting features that matter in the context of this material. First, it integrates with packet capture solutions (such as Cisco Security Packet Analyzer), which lets you record selected network sessions for later in-depth investigation and analysis. Second, specifically to extend the security use cases, we developed a special nvzFlow protocol that allows application activity on end nodes (servers, workstations, etc.) to be "translated" into telemetry and transferred to the collector for further analysis. While in its original form Stealthwatch works with any flow protocol (sFlow, rFlow, Netflow, IPFIX, cFlow, jFlow, NetStream) at the network level, nvzFlow support allows data to be correlated at the host level as well, thereby improving overall system efficiency and seeing more attacks than conventional network flow analyzers.

It is clear that when talking about Netflow analysis systems from a security point of view, the market is not limited to a single solution from Cisco. You can use commercial, free, or shareware solutions. It would be rather strange to cite competitors' solutions on the Cisco blog, so I will say a few words about how network telemetry can be analyzed with two popular, similarly named, yet different tools: SiLK and ELK.

SiLK (the System for Internet-Level Knowledge) is a set of traffic analysis tools developed by the American CERT/CC. In the context of today's article it supports Netflow (v5 and v9, the most popular versions), IPFIX, and sFlow, and uses various utilities (rwfilter, rwcount, rwflowpack, etc.) to perform various operations on network telemetry in order to detect signs of unauthorized activity in it. But there are a couple of important things to note. SiLK is a command-line tool, and doing online analysis by constantly typing commands like this one (which detects ICMP packets larger than 200 bytes):

rwfilter --flowtypes=all/all --proto=1 --bytes-per-packet=200- --pass=stdout | rwcut --fields=sIP,dIP,iType,iCode --num-recs=15

is not very convenient. You can use the iSiLK GUI, but it will not make your life much easier: it only solves the visualization task and does not replace the analyst. And that is the second point. Unlike commercial solutions, which already come with a solid analytical base, anomaly detection algorithms, a corresponding workflow, and so on, with SiLK you will have to do all of this yourself, which requires slightly different competencies than using a ready-made toolkit. This is neither good nor bad; it is a feature of almost any free tool: it assumes that you know what to do, and it will only help you do it (commercial tools are less dependent on the competencies of their users, although they too assume that analysts understand at least the basics of network investigation and monitoring). But back to SiLK. The analyst's work cycle with it looks like this:

  • Formulation of a hypothesis. We must understand what we will be looking for inside network telemetry, know the unique attributes by which we will identify certain anomalies or threats.
  • Model building. Having formulated a hypothesis, we program it using Python, shell, or other tools that are not part of SiLK (see the sketch after this list).
  • Testing. It's time to check the correctness of our hypothesis, which is confirmed or refuted using the SiLK utilities starting with 'rw', 'set', 'bag'.
  • Analysis of real data. In production use, SiLK helps us identify something, and the analyst must answer the questions "Did we find what we expected?", "Does this match our hypothesis?", "How do we reduce the number of false positives?", "How do we improve the detection rate?" and so on.
  • Improvement. At the final stage, we improve what was done earlier - we create templates, improve and optimize the code, reformulate and refine the hypothesis, etc.
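
As an example of the "building a model" step, here is a minimal sketch of one hypothesis implemented in Python outside of SiLK: a source that contacts many distinct destination IP:port pairs within the analyzed window is probably scanning. It assumes the flows have already been exported to CSV (for example, via rwfilter piped into rwcut); the threshold and the column names are assumptions.

import csv
from collections import defaultdict

SCAN_THRESHOLD = 100  # illustrative: distinct ip:port pairs per source

def find_scanners(flows_csv: str) -> dict:
    """Return sources that touched at least SCAN_THRESHOLD distinct (dst IP, dst port) pairs."""
    targets = defaultdict(set)
    with open(flows_csv) as f:
        for row in csv.DictReader(f):
            targets[row["sIP"]].add((row["dIP"], row["dPort"]))
    return {src: len(pairs) for src, pairs in targets.items() if len(pairs) >= SCAN_THRESHOLD}

for src, fanout in sorted(find_scanners("flows.csv").items(), key=lambda item: -item[1]):
    print(f"possible scanner: {src} touched {fanout} distinct ip:port pairs")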

This cycle is just as applicable to Cisco Stealthwatch, except that the latter automates these five steps to the maximum, reducing analyst errors and increasing the speed of incident detection. For example, in SiLK you can enrich network statistics with external data on malicious IPs using your own scripts, while in Cisco Stealthwatch this is a built-in function that immediately raises an alarm if network traffic contains interaction with blacklisted IP addresses.

If we move up the "paid" pyramid of flow analysis software, then after the absolutely free SiLK comes the shareware ELK, consisting of three key components: Elasticsearch (indexing, searching and analyzing data), Logstash (data input/output) and Kibana (visualization). Unlike SiLK, where you have to write everything yourself, ELK already has many ready-made libraries/modules (some paid, some not) that automate the analysis of network telemetry. For example, the GeoIP filter in Logstash allows you to map monitored IP addresses to their geographic location (Stealthwatch has this as a built-in function).
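
As a small illustration of what "searching the flow" looks like in ELK, the sketch below queries Elasticsearch over its REST API for flow documents heading to TCP port 445 in the last hour. The index pattern, the field names and the unauthenticated local endpoint are assumptions; they will differ depending on the ingest module you use.

import requests

ES_URL = "http://localhost:9200"   # assumption: local, unauthenticated Elasticsearch
INDEX = "elastiflow-*"             # assumption: ElastiFlow-style index pattern

query = {
    "size": 20,
    "query": {
        "bool": {
            "filter": [
                {"term": {"destination.port": 445}},          # field names are assumptions
                {"range": {"@timestamp": {"gte": "now-1h"}}},
            ]
        }
    },
}

resp = requests.post(f"{ES_URL}/{INDEX}/_search", json=query, timeout=10)
resp.raise_for_status()
for hit in resp.json()["hits"]["hits"]:
    print(hit["_source"])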

ELK also has a fairly large community that adds the missing components to this monitoring solution. For example, to work with Netflow, IPFIX and sFlow you can use the elastiflow module if you are not satisfied with the Logstash Netflow module, which supports only Netflow.

While ELK offers greater efficiency in collecting flow and searching through it, it currently lacks rich built-in analytics for detecting anomalies and threats in network telemetry. That is, following the life cycle described above, you will have to describe the violation models yourself and then use them in the production system (there are no built-in models).

Of course, there are more sophisticated extensions for ELK, which already contain some models for detecting anomalies in network telemetry, but such extensions cost money and the question is whether the game is worth the candle - write a similar model yourself, buy its implementation for your monitoring tool or buy a turnkey solution of the Network Traffic Analysis class.

In general, I do not want to get into a debate about whether it is better to spend money on a ready-made solution for monitoring anomalies and threats in network telemetry (for example, Cisco Stealthwatch) or to figure it out yourself and tune SiLK, ELK, nfdump or OSU Flow Tools for every new threat (I talked about the last two of them last time). Everyone chooses for themselves, and everyone has their own motives for choosing either option. I just wanted to show that network telemetry is a very important tool for ensuring the network security of your internal infrastructure, and you should not neglect it, lest your company join the list of those whose names appear in the media next to the epithets "hacked", "non-compliant with information security requirements", "not thinking about the security of their data and customer data".

Summing up, I would like to list the key tips that you should follow when building information security monitoring of your internal infrastructure:

  1. Don't limit yourself to just the perimeter! Use (and choose) your network infrastructure not only to move traffic from point A to point B, but also to address cybersecurity tasks.
  2. Study the existing information security monitoring mechanisms in your network equipment and use them.
  3. For internal monitoring, give preference to telemetry analysis: it allows you to detect up to 80-90% of all network information security incidents, while doing what is impossible with packet capture and saving storage space for information security events.
  4. To monitor flows, use Netflow v9 or IPFIX - they give more information in the context of security and allow you to monitor not only IPv4, but also IPv6, MPLS, etc.
  5. Use an unsampled flow protocol - it provides more information for detecting threats. For example, Netflow or IPFIX.
  6. Check the load on your network equipment: it may not be able to handle flow processing on top of its other tasks. In that case, consider using virtual sensors or a Netflow Generation Appliance.
  7. Implement collection first of all at the access level: this will give you the ability to see 100% of all traffic.
  8. If you have no choice and are using Russian network equipment, then choose equipment that supports flow protocols or has SPAN/RSPAN ports.
  9. Combine intrusion/attack detection/prevention at the borders and flow analysis systems in the internal network (including in the clouds).

As for the last tip, let me refer to an illustration I have shown before. You can see that while the Cisco information security service used to build its monitoring system almost entirely on intrusion detection systems and signature methods, these now account for only 20% of incidents. Another 20% comes from flow analysis systems, which suggests that such solutions are not a whim but a real tool in the work of a modern enterprise's information security service. Moreover, you already have the most important thing for implementing them: the network infrastructure, and the investment in it can be further protected by assigning information security monitoring functions to the network.

I deliberately did not touch on the topic of responding to anomalies or threats detected in network flows, but I think it is clear that monitoring should not end with the detection of a threat. Detection should be followed by a response, preferably in automatic or automated mode. But that is a topic for a separate article.

PS. If it is easier for you to listen than to read, you can watch the hour-long presentation that formed the basis of this note.



Source: habr.com
