How ELK helps security engineers fight website attacks and sleep peacefully

Our cyber defense center is responsible for the security of our clients' web infrastructure and repels attacks on client sites. To protect against attacks, we use FortiWeb web application firewalls (WAFs). But even the coolest WAF is not a panacea and does not protect against targeted attacks out of the box.

So, in addition to the WAF, we use ELK. It collects all events in one place, accumulates statistics, visualizes them, and lets us spot a targeted attack in time.

Today I’ll tell you in more detail how we crossed the "Christmas tree" (in Russian, ELK sounds like yolka, a Christmas tree) with a WAF and what came of it.

The story of one attack: how everything worked before switching to ELK

In our cloud, the customer has deployed the application behind our WAF. From 10 to 000 users connected to the site per day, the number of connections reached 100 million per day. Of these, 000-20 users were intruders and tried to hack the site. 

Ordinary form brute-forcing from a single IP address was blocked by FortiWeb quite easily: the attacker made more hits per minute than legitimate users did. We simply set activity thresholds per address and repelled the attack.
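
The per-address threshold idea can be sketched in a few lines of Python. This is only an illustration of the principle, not how FortiWeb implements it: the event shape (timestamp, IP) and the threshold value are assumptions.

```python
from collections import Counter
from datetime import datetime


def ips_over_threshold(events, max_hits_per_minute):
    """Return the set of source IPs that exceed the limit in any calendar minute.

    `events` is an iterable of (timestamp, ip) pairs. Both this input shape
    and the threshold are assumptions made for the illustration.
    """
    hits = Counter()
    for ts, ip in events:
        minute = ts.replace(second=0, microsecond=0)  # bucket by calendar minute
        hits[(ip, minute)] += 1
    return {ip for (ip, _minute), n in hits.items() if n > max_hits_per_minute}
```

A check like this only catches a noisy attacker working from one address, which is exactly why it fails against the slow attacks described next.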

It is much harder to deal with "slow attacks", where attackers act slowly and disguise themselves as ordinary clients, using many unique IP addresses. To the WAF, such activity did not look like massive brute force, so it was harder to track automatically, and there was a risk of blocking normal users. We looked for other signs of an attack and set up policies that automatically blocked IP addresses based on those signs. For example, many illegitimate sessions shared common fields in the HTTP request headers. Such fields often had to be found manually in the FortiWeb event logs.
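
The manual search for shared header fields can be approximated as follows. This is a rough sketch under assumptions: the input shape (IP plus a dictionary of headers) and the `min_ips` cutoff are hypothetical, and real FortiWeb logs would first have to be parsed into this form.

```python
from collections import defaultdict


def shared_header_values(requests, min_ips):
    """Find (header, value) pairs seen from at least `min_ips` distinct addresses.

    `requests` is an iterable of (ip, headers_dict) pairs (hypothetical shape).
    Returns a list of ((header, value), ip_count), most widespread first.
    """
    ips_by_pair = defaultdict(set)
    for ip, headers in requests:
        for name, value in headers.items():
            # Header names are case-insensitive, so normalize them.
            ips_by_pair[(name.lower(), value)].add(ip)
    shared = {pair: len(ips) for pair, ips in ips_by_pair.items() if len(ips) >= min_ips}
    return sorted(shared.items(), key=lambda item: item[1], reverse=True)
```

Legitimate values (a popular browser's User-Agent, for instance) will show up in the result too, so an engineer still has to decide which of the shared values actually marks the attack.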

That was slow and inconvenient. Out of the box, FortiWeb writes events as text to 3 different logs: detected attacks, information about requests, and system messages about WAF operation. Dozens or even hundreds of attack events can arrive per minute.

That is not so many, but you have to dig through several logs by hand and go over many lines:

In the attack log, we see user addresses and the nature of the activity. 
 
Simply scanning the log table is not enough. To find the most interesting and useful details about the nature of the attack, you need to look inside a specific event:

The highlighted fields help to detect a "slow attack". Source: screenshot from the Fortinet website

And the main problem is that only a FortiWeb specialist can make sense of it. During business hours we could still track suspicious activity in real time, but the investigation of night incidents could drag on. When the FortiWeb policies failed for some reason, the night-shift duty engineers could not assess the situation without access to the WAF, so they woke up the FortiWeb specialist. We would then look through the logs for several hours to find the moment of the attack.

With such volumes of information, it is hard to grasp the big picture at a glance and act proactively. So we decided to collect the data in one place, analyze everything visually, find the beginning of an attack, and identify its direction and a way to block it.

What we chose from

First of all, we looked at solutions already in use, so as not to multiply entities unnecessarily.

One of the first options was Nagios, which we use to monitor engineering and network infrastructure and to send emergency alerts. The security engineers also use it to notify the duty engineers about suspicious traffic, but it cannot collect disparate logs, so it was ruled out.

There was also the option of aggregating everything in MySQL, PostgreSQL, or another relational database. But to get the data back out, we would have had to build our own application.

Our company also uses FortiAnalyzer from Fortinet as a log collector. But it did not fit this case either. First, it is geared more toward working with the FortiGate firewall. Second, many settings were missing, and working with it required excellent knowledge of SQL queries. Third, using it would have increased the cost of the service for the customer.

That is how we arrived at open source, in the form of ELK.

Why we chose ELK

ELK is a set of open source programs:

  • Elasticsearch - a search engine and document store that suits time-stamped data and was built to work with large volumes of text;
  • Logstash - a data collection pipeline that can convert logs to the desired format;
  • Kibana - a good visualizer, as well as a fairly friendly interface for managing Elasticsearch. You can use it to build graphs that duty engineers can monitor at night.

The entry threshold for ELK is low, and all the basic features are free. What more could you ask for?

How we put it all together in one system

Created indexes and left only the necessary information. We loaded all three FortiWeb logs into ELK, and the output was indexes: files containing all the logs collected over a period, for example a day. If we had visualized them right away, we would have seen only the dynamics of the attacks. For details, you have to "drill down" into each attack and look at specific fields.

We realized that we first needed to set up parsing of the unstructured information. We took long string fields such as "Message" and "URL" and parsed them to get more information for decision making.

For example, parsing let us extract the user's location into a separate field. This immediately highlighted attacks from abroad on sites meant for Russian users. By blocking all connections from other countries, we halved the number of attacks and could calmly deal with the attacks from inside Russia.
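
To give a feel for this kind of parsing, here is a small Python sketch. In ELK this job is normally done by Logstash filters, so this is not our actual configuration. It assumes the log line is a sequence of key=value pairs with optionally quoted values, and the field names `src` and `srccountry` are used here purely as illustrative assumptions.

```python
import re

# key=value or key="quoted value"; the line format is an assumption.
_PAIR = re.compile(r'(\w+)=(?:"([^"]*)"|(\S+))')


def parse_kv_line(line):
    """Split a key=value log line into a dict of string fields."""
    fields = {}
    for key, quoted, bare in _PAIR.findall(line):
        # Exactly one of the two groups matched; the other is an empty string.
        fields[key] = quoted if quoted else bare
    return fields


def is_foreign(fields, home_country="Russia"):
    """True if the event carries a source country other than `home_country`."""
    country = fields.get("srccountry")  # hypothetical field name
    return country is not None and country != home_country
```

Once the country sits in its own field, it can be filtered and aggregated like any other column, which is what makes a "block everything from abroad" decision visible on a chart.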

After parsing, we started deciding what information to store and visualize. Keeping everything from the logs made no sense: a single index was large, 7 GB, and ELK took a long time to process it. Yet not all of the information was useful. Some of it was duplicated and took up extra space, so optimization was needed.
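
The two optimizations mentioned above, keeping only useful fields and dropping duplicates, might look like this in Python. The list of fields to keep is hypothetical; which fields are worth storing depends on what you intend to visualize.

```python
# Hypothetical whitelist of fields worth indexing.
KEEP_FIELDS = ("timestamp", "src", "srccountry", "http_url", "rc", "msg")


def slim_event(event, keep=KEEP_FIELDS):
    """Return a copy of `event` containing only the whitelisted fields."""
    return {key: event[key] for key in keep if key in event}


def drop_duplicates(events):
    """Remove events that are identical field for field, keeping the first.

    Assumes all field values are hashable (plain strings or numbers).
    """
    seen = set()
    unique = []
    for event in events:
        fingerprint = tuple(sorted(event.items()))
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(event)
    return unique
```

Doing this before the data reaches the index, rather than cleaning the index afterwards, avoids the slow manual pruning described in the next paragraph.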

At first, we simply looked through the index and removed unnecessary events. This turned out to be even slower and less convenient than working with the logs on FortiWeb itself. The only advantage of the "Christmas tree" at this stage was that we could visualize a long period of time on one screen.

We did not despair: we kept chewing the cactus, studied ELK, and believed we would manage to extract the information we needed. After cleaning the indexes, we began to visualize what we had. That is how we arrived at big dashboards. We piled on widgets - clear and elegant, a real ЁLKa!

Captured the moment of the attack. Next we had to understand what the beginning of an attack looks like on a chart. To detect it, we looked at the server's responses to the user (return codes). We were interested in server responses with the following codes (rc):

Code (rc)   Name                    Description
0           DROP                    The request to the server is blocked
200         OK                      Request processed successfully
400         Bad Request             Invalid request
403         Forbidden               Access denied
500         Internal Server Error   Service is unavailable

If someone started to attack the site, the ratio of codes changed: 

  • If the number of erroneous requests with code 400 grew while the number of normal requests with code 200 stayed the same, someone was trying to hack the site.
  • If requests with code 0 grew at the same time, the FortiWeb policies had also "seen" the attack and were blocking it.
  • If the number of messages with code 500 increased, the site was unavailable for those IP addresses - also a kind of blocking.

By the third month, we had set up a dashboard to track this activity.

So as not to monitor everything manually, we set up integration with Nagios, which polled ELK at set intervals. If it saw the codes reach their threshold values, it notified the duty engineers about suspicious activity.
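
One way such a check could work is a small Nagios plugin that asks Elasticsearch how many events with a given return code arrived recently. The sketch below is not our production check: the field names `rc` and `@timestamp`, the 5-minute window, and the thresholds are assumptions. It relies on the Elasticsearch `_count` endpoint and on the standard Nagios plugin exit codes.

```python
import json
import urllib.request

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def build_count_query(rc, window="5m"):
    """Query body: events with return code `rc` within the last `window`.

    The field names "rc" and "@timestamp" are assumptions about the index.
    """
    return {"query": {"bool": {"filter": [
        {"term": {"rc": rc}},
        {"range": {"@timestamp": {"gte": f"now-{window}"}}},
    ]}}}


def fetch_count(base_url, index, rc, window="5m"):
    """Ask Elasticsearch's _count API how many matching events exist."""
    body = json.dumps(build_count_query(rc, window)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/{index}/_count",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)["count"]


def nagios_status(count, warn, crit):
    """Map an event count to a Nagios exit code and a status line."""
    if count >= crit:
        return CRITICAL, f"CRITICAL - {count} events"
    if count >= warn:
        return WARNING, f"WARNING - {count} events"
    return OK, f"OK - {count} events"
```

A plugin script would call `fetch_count(...)`, pass the result to `nagios_status(...)`, print the status line, and exit with the returned code (or with `UNKNOWN` if the request fails), leaving the scheduling and notification to Nagios.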

Combined 4 charts in the monitoring system. Now it was important to see on the graphs the moment when an attack is not being blocked and an engineer needs to step in. Across 4 separate graphs, our eyes glazed over, so we combined the charts and began to watch everything on one screen.

On the monitoring screen, we watched how graphs of different colors changed. A burst of red showed that an attack had begun, while the orange and blue graphs showed FortiWeb's reaction:

Everything is fine here: there was a surge of "red" activity, but FortiWeb coped and the attack graph dropped to nothing.

We also drew for ourselves an example of a graph that requires intervention:

Here we can see that FortiWeb's activity has increased, but the red attack graph has not gone down. The WAF settings need to be changed.
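
The difference between the two pictures (attack handled versus intervention needed) can be expressed as a simple comparison of the two series. This is a deliberately naive sketch: the input shape (per-interval counts, oldest first) and the `still_high` fraction are assumptions, and a real check would also need a minimum attack level to avoid false alarms on quiet traffic.

```python
def needs_intervention(attack, blocked, still_high=0.8):
    """Decide whether an engineer should step in.

    `attack` and `blocked` are equal-length lists of per-interval counts,
    oldest first: the "red" attack graph and the WAF's blocking activity.
    Returns True when the WAF has reacted (blocking went up over the window)
    but the attack is still at `still_high` or more of its peak.
    """
    if len(attack) < 2 or len(attack) != len(blocked):
        raise ValueError("need two equal-length series with at least 2 points")
    waf_reacted = blocked[-1] > blocked[0]
    peak = max(attack)
    attack_persists = peak > 0 and attack[-1] >= still_high * peak
    return waf_reacted and attack_persists
```

With `attack=[5, 80, 90, 88]` and `blocked=[0, 10, 40, 45]` the function returns True (the red graph stays near its peak), while `attack=[5, 80, 30, 4]` with `blocked=[0, 60, 70, 5]` returns False because the attack has died down.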

Investigating night incidents has also become easier. The graph immediately shows the moment when it is time to come to the defense of the site. 

This is what sometimes happens at night. Red graph: the attack has begun. Blue: FortiWeb activity. The attack was not completely blocked, so we had to intervene.

Where we are going

We are now training the duty administrators to work with ELK. They learn to assess the situation on the dashboard and make a decision: either it is time to escalate to a FortiWeb specialist, or the WAF policies will be enough to repel the attack automatically. This is how we reduce the night-time workload of the security engineers and separate support roles at the system level. Access to FortiWeb remains with the cyber defense center only, and only its engineers change the WAF settings when urgently needed.

We are also working on reporting for customers. We plan to make data on the dynamics of the WAF's work available in the client's personal account. ELK will make the situation clearer without the need to consult the WAF itself.

If a customer wants to monitor their own protection in real time, ELK will also come in handy. We cannot hand out access to the WAF, since one customer's interference could affect the others. But we can spin up a separate ELK instance and give it to the customer to "play around" with.

These are the scenarios for using the Christmas tree that we have accumulated lately. Share your thoughts on this, and don't forget to set everything up correctly to avoid database leaks.

Source: habr.com