Hello colleagues! Today I would like to discuss a topic that is very relevant for many Check Point administrators: CPU and RAM optimization. It is not uncommon for a gateway and/or management server to consume unexpectedly large amounts of these resources, and one would like to understand where they "leak" and, if possible, use them more competently.
1. Analysis
To analyze processor load, it is useful to run the following commands from expert mode:

- `top` - shows all processes, the CPU and RAM they consume in percent, uptime, process priorities, and more
- `cpwd_admin list` - the Check Point WatchDog daemon list, which shows all appliance modules, their PID, status, and number of starts
- `cpstat -f cpu os` - CPU usage, the number of CPUs, and the distribution of processor time in percent
- `cpstat -f memory os` - virtual RAM usage: how much RAM is active, how much is free, and more

A fair remark: everything the cpstat commands show can also be viewed with the cpview utility. To do this, just enter the `cpview` command from any mode in an SSH session.

- `ps auxwf` - a long list of all processes, their IDs, occupied virtual and resident memory, and CPU
- Another variant of the command, `ps -aF` - shows the most expensive processes
- `fw ctl affinity -l -a` - the distribution of cores among the different firewall instances, that is, CoreXL
- `fw ctl pstat` - RAM analysis and general statistics on connections, cookies, and NAT
- `free -m` - RAM buffers
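As an illustration of what `ps` gives you, here is a small Python sketch that picks the most expensive processes by a chosen column from `ps aux`-style output. The sample output and process names below are hypothetical, for demonstration only.

```python
def top_consumers(ps_output, column="%CPU", n=3):
    """Return the n processes with the highest value in the given column,
    parsed from `ps aux`-style output."""
    lines = ps_output.strip().splitlines()
    header = lines[0].split()
    col = header.index(column)
    cmd_col = header.index("COMMAND")
    rows = []
    for line in lines[1:]:
        fields = line.split(None, cmd_col)  # keep the full command line intact
        rows.append((float(fields[col]), fields[cmd_col]))
    return sorted(rows, reverse=True)[:n]

# Hypothetical `ps aux` snippet from a gateway
sample = """USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
admin 1201 45.0 12.3 900000 500000 ? S 10:00 5:00 fwd
admin 1305  5.2 30.1 1200000 950000 ? S 10:00 1:20 fw_full
admin 1410  0.3  1.0 50000 20000 ? S 10:00 0:01 sshd"""

print(top_consumers(sample, "%CPU", 2))  # heaviest by CPU
print(top_consumers(sample, "%MEM", 2))  # heaviest by memory
```

On a live appliance you would feed this the real output of `ps aux` instead of the sample string.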
The `netstat` command and its variations deserve special attention. For example, `netstat -i` can help with monitoring interface buffers. The RX dropped packets counter (RX-DRP) in the output of this command tends to grow on its own due to drops of illegitimate protocols (IPv6, bad/unintended VLAN tags, and others). However, if drops happen for another reason, then you should use this
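To watch whether RX-DRP keeps climbing, you can sample `netstat -i` twice and diff the counters. Below is a minimal Python sketch; the `netstat -i` output shown is a hypothetical sample, and since column layout can differ between net-tools versions, the parser locates the RX-DRP column by its header rather than by position.

```python
def rx_drp(netstat_output):
    """Return {interface: RX-DRP} parsed from `netstat -i`-style output."""
    lines = [l for l in netstat_output.strip().splitlines() if l.strip()]
    header = next(l for l in lines if "RX-DRP" in l)
    col = header.split().index("RX-DRP")  # find the column by name
    counters = {}
    for line in lines[lines.index(header) + 1:]:
        fields = line.split()
        if len(fields) > col:
            counters[fields[0]] = int(fields[col])
    return counters

# Hypothetical samples taken a few minutes apart
before = """Kernel Interface table
Iface   MTU Met   RX-OK RX-ERR RX-DRP RX-OVR   TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0   1500   0  123456      0    120      0   98765      0      0      0 BMRU
eth1   1500   0   55555      0      3      0   44444      0      0      0 BMRU"""
after = """Kernel Interface table
Iface   MTU Met   RX-OK RX-ERR RX-DRP RX-OVR   TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0   1500   0  130001      0    380      0  101010      0      0      0 BMRU
eth1   1500   0   56000      0      3      0   44500      0      0      0 BMRU"""

b, a = rx_drp(before), rx_drp(after)
growth = {i: a[i] - b[i] for i in a if a[i] > b.get(i, 0)}
print(growth)  # interfaces whose drop counter is still climbing
```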
If the Monitoring blade is enabled, you can view these metrics graphically in the SmartConsole by clicking on an object and selecting Device & License Information.
It is not recommended to enable the Monitoring blade on an ongoing basis, but it is quite possible for a day for a test.
Moreover, you can add more parameters for monitoring; one of them is very useful: Bytes Throughput (appliance bandwidth).
If there is some other monitoring system, for example, free
2. RAM "leaks" over time
A question that often arises: over time, the gateway or management server begins to consume more and more RAM. I want to reassure you: this is normal behavior for Linux-like systems.
Looking at the output of the `free -m` and `cpstat -f memory os` commands on the appliance from expert mode, you can calculate and view all the parameters related to RAM.
The memory available on the gateway at any given moment is Free Memory + Buffers Memory + Cached Memory, usually around 1.5 GB.
As support says, over time the gateway/management server optimizes itself and uses more and more memory, up to about 80% usage, where it stops. You can reboot the device, and then the indicator is reset. 1.5 GB of free RAM is definitely enough for the gateway to perform all its tasks, and a management server rarely reaches such threshold values.
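The Free + Buffers + Cached arithmetic above can be checked by parsing the output of `free -m` directly. The sketch below assumes the old-style `free` output format (with separate buffers and cached columns); the numbers are a hypothetical sample.

```python
def available_mb(free_output):
    """Compute Free + Buffers + Cached (in MB) from old-style `free -m`
    output, i.e. the format with separate buffers and cached columns."""
    for line in free_output.splitlines():
        if line.startswith("Mem:"):
            # columns: total used free shared buffers cached
            _, total, used, free, shared, buffers, cached = line.split()
            return int(free) + int(buffers) + int(cached)
    raise ValueError("no Mem: line found")

# Hypothetical output of `free -m` on a gateway
sample = """             total       used       free     shared    buffers     cached
Mem:          3949       3452        497          0        210        801
-/+ buffers/cache:       2440       1509
Swap:         8191          0       8191"""

print(available_mb(sample))  # 1508 MB, roughly the 1.5 GB mentioned above
```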
Also, the output of the mentioned commands shows how much low memory (RAM used in kernel space) and high memory (RAM used in user space) is in use.
Kernel processes (including active Check Point kernel modules) use only low memory, while user processes can use both low and high memory. Moreover, low memory is approximately equal to total memory.
You should only worry if the logs contain errors about modules restarting or processes being killed to reclaim memory due to OOM (Out of Memory). In that case you should reboot the gateway, and contact support if the reboot does not help.
A full description can be found in
3. Optimization
Below are questions and answers about CPU and RAM optimization. You should answer them honestly to yourself and heed the recommendations.
3.1. Was the appliance chosen correctly? Was there a pilot project?
Despite competent sizing, the network may simply have grown, and this equipment can no longer cope with the load. The second option: there was no sizing as such.
3.2. Is HTTPS inspection enabled? If so, is the technology configured according to Best Practice?
Refer to
The order of the rules in the HTTPS inspection policy is of great importance in optimizing the opening of HTTPS sites.
Recommended order of rules:
- Bypass rules with categories/URLs
- Inspect rules with categories/URLs
- Inspect rules for all other categories
By analogy with the firewall policy, Check Point looks for a packet match from top to bottom, so Bypass rules are best placed at the top: the gateway will not waste resources running through all the rules if the packet should simply be passed through.
3.3. Are address-range objects used?
Objects with an address range, such as 192.168.0.0-192.168.5.0, consume significantly more RAM than five network objects. In general, it is considered good practice to delete unused objects in SmartConsole, since every time a policy is installed, the gateway and management server spend resources and, most importantly, time verifying and applying the policy.
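To see why the range is heavier, it helps to look at what it actually covers. The Python sketch below uses the standard ipaddress module: the literal range 192.168.0.0-192.168.5.0 from the example spans five full /24 networks plus one extra host, and cannot be expressed as fewer than three CIDR blocks.

```python
import ipaddress

# The literal range object from the example above
first = ipaddress.IPv4Address("192.168.0.0")
last = ipaddress.IPv4Address("192.168.5.0")

# Minimal set of CIDR networks covering exactly the same range
nets = list(ipaddress.summarize_address_range(first, last))
print(nets)  # a /22, a /24, and a /32

# The five plain /24 network objects suggested instead
subnets = [ipaddress.ip_network(f"192.168.{i}.0/24") for i in range(5)]
print(subnets)
```

A range object forces the gateway to track arbitrary boundaries instead of simple prefix matches, which is where the extra RAM goes.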
3.4. How is the Threat Prevention policy configured?
First of all, Check Point recommends moving IPS to a separate profile and creating separate rules for this blade.
For example, an administrator thinks that a DMZ segment should only be protected with IPS. Therefore, in order for the gateway not to waste resources on processing packets by other blades, it is necessary to create a rule specifically for this segment with a profile in which only IPS is enabled.
Regarding profile setup, it is recommended to configure them according to the best practices in this
3.5. How many signatures in Detect mode in IPS settings?
It is recommended to work through the signatures: unused signatures should be disabled (for example, signatures for Adobe products require a lot of computing power, and if the customer has no such products, it makes sense to disable them). Then set Prevent instead of Detect wherever possible: in Detect mode the gateway spends resources processing the entire connection, while in Prevent mode it drops the connection immediately and does not waste resources on fully processing the packet.
3.6. What files are processed by the Threat Emulation, Threat Extraction, Anti-Virus blades?
It makes no sense to emulate and analyze file extensions that your users do not download or that you consider unnecessary on your network (for example, bat and exe files can easily be blocked with the Content Awareness blade at the firewall level, so fewer gateway resources are spent). Moreover, in the Threat Emulation settings you can select the Environment (operating system) for emulating threats in the sandbox, and setting the environment to Windows 7 when all users work with version 10 also makes no sense.
3.7. Are the firewall and Application layer rules placed according to best practice?
If a rule gets many hits (matches), it is recommended to place it at the very top, and rules with few hits at the very bottom. The main thing is to make sure they do not intersect or overlap one another. The recommended firewall policy architecture:
Explanation:
- First Rules - the rules with the most matches are placed here
- Noise Rule - a rule for dropping spurious traffic such as NetBIOS
- Stealth Rule - denies access to the gateways and the management server for everyone except the sources specified in the Authentication to Gateway Rules
- Clean-Up, Last and Drop Rules - usually combined into one rule that denies everything that was not allowed above
Best practice data is described in
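The effect of rule order on first-match evaluation can be modeled in a few lines. The sketch below is a toy model, not Check Point's actual matching engine: it simply counts how many rules are inspected per packet for a hypothetical traffic mix, showing why the high-hit rule belongs on top.

```python
# Toy model of top-down, first-match rule evaluation. Each rule is
# (name, predicate); the cost of matching a packet is the number of
# rules inspected before the first match.
def match_cost(rules, packet):
    for cost, (name, pred) in enumerate(rules, start=1):
        if pred(packet):
            return name, cost
    return "implicit-drop", len(rules)

# Hypothetical rule base: a busy HTTPS rule and a rarely hit admin rule
rules_bad = [
    ("admin-ssh", lambda p: p["dport"] == 22),   # few hits, checked first
    ("web-https", lambda p: p["dport"] == 443),  # most hits, checked last
]
rules_good = list(reversed(rules_bad))           # high-hit rule on top

traffic = [{"dport": 443}] * 95 + [{"dport": 22}] * 5  # 95% HTTPS

cost_bad = sum(match_cost(rules_bad, p)[1] for p in traffic)
cost_good = sum(match_cost(rules_good, p)[1] for p in traffic)
print(cost_bad, cost_good)  # 195 vs 105 rule inspections
```

The same traffic costs almost twice as many rule inspections when the busiest rule sits below a rarely used one.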
3.8. What are the settings for the services created by administrators?
For example, suppose a TCP service is created on a specific port; then it makes sense to uncheck Match for Any in the Advanced settings of the service. In this case the service will match only the rules in which it explicitly appears, and will not match rules where Any is in the Services column.
Speaking of services, it is worth mentioning that it is sometimes necessary to tweak timeouts. This setting lets the gateway use its resources more intelligently, so that it does not hold TCP/UDP session state longer than necessary for protocols that do not need a large timeout. For example, in the screenshot below, I changed the domain-udp service timeout from 40 seconds to 30 seconds.
3.9. Is SecureXL used and what is the percentage of acceleration?
You can check how well SecureXL is working with the main commands in expert mode on the gateway: `fwaccel stat` and `fwaccel stats -s`. Next, you need to figure out what kind of traffic is being accelerated and what other templates you can create.
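If you want to track the acceleration percentage over time, you can parse the ratio lines from `fwaccel stats -s`. The sketch below assumes an "X/Y (Z%)" line format; the sample output and exact field names are illustrative and may differ between versions, so check them against your own gateway.

```python
import re

def accel_percentages(stats_output):
    """Parse 'name : X/Y (Z%)' lines from `fwaccel stats -s`-style output
    into {name: percent}. The line format is an assumption; verify it
    against the real output on your gateway."""
    result = {}
    for line in stats_output.splitlines():
        m = re.match(r"\s*(.+?)\s*:\s*(\d+)/(\d+)\s*\((\d+)%\)", line)
        if m:
            result[m.group(1)] = int(m.group(4))
    return result

# Hypothetical sample output
sample = """Accelerated conns/Total conns : 1500/2000 (75%)
Accelerated pkts/Total pkts   : 900000/1000000 (90%)
F2Fed pkts/Total pkts         : 100000/1000000 (10%)"""

stats = accel_percentages(sample)
print(stats)
if stats.get("F2Fed pkts/Total pkts", 0) > 50:
    print("Most traffic takes the slow path; review the policy")
```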
By default, Drop Templates are not enabled, enabling them will have a positive effect on the operation of SecureXL. To do this, go to the gateway settings and the Optimizations tab:
Also, when working with a cluster, you can disable synchronization of non-critical services, such as UDP DNS, ICMP and others, to optimize CPU usage. To do this, go to the service settings → Advanced → Synchronize connections if State Synchronization is enabled on the cluster.
All Best Practices are described in
3.10. How is CoreXL used?
CoreXL, a technology that allows multiple CPUs to be used for firewall instances (firewall modules), definitely helps optimize device performance. First, the command `fw ctl affinity -l -a` will show the firewall instances in use and the processors given to the SND (the module that distributes traffic to the firewall instances). If not all processors are involved, they can be added with the `cpconfig` command on the gateway.
Also a good story is to put
In conclusion, I would like to say that these are far from all the Best Practices for optimizing Check Point, but the most popular. If you would like to request an audit of your security policy or resolve a Check Point issue, please contact [email protected].
Thank you for your attention!
Source: habr.com