Our team is very happy to announce the release of Zabbix 4.2, a free open source monitoring system.
Is version 4.2 the answer to the ultimate question of life, the universe, and monitoring in general? Let's take a look!
Recall that Zabbix is a universal system for monitoring the performance and availability of servers, engineering and network equipment, applications, databases, virtualization systems, containers, IT services, and web services.
Zabbix implements a full cycle: collecting data, processing and transforming it, analyzing it, storing it, visualizing it, and sending alerts using escalation rules. The system also provides flexible options for extending data collection and alerting methods, as well as automation through the API. A single web interface provides centralized management of monitoring configuration and of access rights for different user groups. The project code is freely distributed under a license
Zabbix 4.2 is a new non-LTS version with a shortened official support period. For users who are looking for a long life cycle of software products, we recommend using LTS versions such as 3.0 and 4.0.
So, let's talk about the new features and major improvements in version 4.2:
More official platforms
In addition to the already existing official packages, we also offer new builds for:
- Raspberry Pi, Mac OS X, SUSE Linux Enterprise Server 12
- MSI for Windows Agent
- Docker images
Built-in Prometheus support for application monitoring
Zabbix can collect data in various ways (push/pull) from different data sources. These include JMX, SNMP, WMI, HTTP/HTTPS, REST APIs, XML SOAP, SSH, Telnet, agents, scripts, and other sources. Now meet Prometheus support!
Strictly speaking, collecting data from Prometheus exporters was possible before thanks to the HTTP/HTTPS item type and regular expressions.
However, the new version lets you work with Prometheus as efficiently as possible thanks to built-in support for the PromQL query language. Dependent metrics make data collection and processing efficient as well: the data is requested once and then split into the required metrics.
It is important to note that low-level discovery can now use the collected data to automatically generate metrics. In this case, Zabbix will convert the received data into JSON format, which is very convenient to work with.
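The "request once, then split" idea can be sketched in a few lines of Python. This is only an illustration and not how Zabbix implements it: the exporter output, the `parse_exposition` and `select` helpers, and the metric names are all assumptions made for the example, and the parser is deliberately simplified (it does not handle escaped quotes in label values).

```python
import json
import re

# Hypothetical exporter output, fetched once by a single master item.
EXPORTER_TEXT = """\
# HELP node_cpu_seconds_total Seconds the CPUs spent in each mode.
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="idle"} 1200.5
node_cpu_seconds_total{cpu="1",mode="idle"} 1100.25
node_load1 0.42
"""

# One sample line: metric name, optional {labels}, then the value.
SAMPLE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="([^"]*)"')


def parse_exposition(text):
    """Turn Prometheus text exposition into a list of dicts."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # ignore lines this simplified parser cannot read
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        rows.append({"name": name, "labels": labels, "value": float(value)})
    return rows


def select(rows, metric, **labels):
    """Return the values of samples matching a metric name and label subset."""
    return [
        row["value"]
        for row in rows
        if row["name"] == metric
        and all(row["labels"].get(k) == v for k, v in labels.items())
    ]


rows = parse_exposition(EXPORTER_TEXT)  # a single "request"
cpu0_idle = select(rows, "node_cpu_seconds_total", cpu="0", mode="idle")
load1 = select(rows, "node_load1")
as_json = json.dumps(rows)  # JSON form, convenient as discovery input
```

Every "dependent" value here (`cpu0_idle`, `load1`) is derived from the same parsed payload, so the exporter is contacted only once no matter how many metrics are extracted.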
At the moment there are more
Efficient high frequency monitoring
Do we want to detect problems as quickly as possible? Of course, no doubt! Usually this means polling devices and collecting data very frequently, which puts a greater load on the monitoring system. How can we avoid that?
We have implemented a throttling mechanism in the preprocessing rules. Throttling essentially lets us skip repeated values.
Suppose we are monitoring the state of a critical application. Every second we check whether the application is functioning or not. In this case, Zabbix receives a continuous stream of 1s (working) and 0s (not working). For example: 1111111111110001111111111111…
When everything is in order with our application, Zabbix receives a stream of only ones. Do they need to be processed? In general, no: we are only interested in changes of the application state, and we do not want to collect and store that much data. So, throttling allows you to skip a value if it is identical to the previous one. As a result, we will receive only data about state changes, for example, 01010101 ... Quite enough information to detect problems!
Skipped values are simply ignored by Zabbix: they are not recorded in the history and do not affect triggers in any way. From Zabbix's point of view, the skipped values do not exist.
Great! Now we can poll devices very frequently, while instantly detecting problems without storing unnecessary information in the database.
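As a rough illustration of the idea (a sketch of the concept, not Zabbix's actual implementation), skipping repeated values can be expressed as a small Python generator. The function name `discard_unchanged` is our own choice for the example.

```python
def discard_unchanged(values):
    """Yield a value only when it differs from the previously seen one."""
    previous = object()  # sentinel that never equals a real value
    for value in values:
        if value != previous:
            yield value
            previous = value


# The example stream from the text: twelve 1s, three 0s, thirteen 1s.
stream = [int(ch) for ch in "1111111111110001111111111111"]
kept = list(discard_unchanged(stream))  # [1, 0, 1]
```

Out of 28 collected values, only the three state changes survive.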
But what about graphs? They will be empty due to lack of data! And how can we tell whether Zabbix is collecting data at all if most of it is skipped?
We thought about that too! Zabbix offers another type of throttling: throttling with heartbeat.
In this case, even if the incoming values keep repeating, Zabbix will store at least one value per specified time interval. If data is collected once per second and the interval is set to one minute, Zabbix will turn the per-second stream of ones into a per-minute stream. It is easy to see that this leads to a 60-fold compression of the received data.
Now we are sure that the data is being collected, the nodata() trigger function is working and everything is fine with the graphs!
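The 60-fold figure is easy to check with a small Python sketch. The logic below is our simplified reading of the description above, not Zabbix source code, and the function name and the `(timestamp, value)` input format are assumptions made for the example.

```python
def discard_unchanged_with_heartbeat(samples, heartbeat):
    """Keep a sample if its value changed, or if at least `heartbeat`
    seconds have passed since the last kept sample.

    `samples` is an iterable of (timestamp_in_seconds, value) pairs.
    """
    kept = []
    last_value = None
    last_kept_at = None
    for timestamp, value in samples:
        first = last_kept_at is None
        changed = not first and value != last_value
        expired = not first and timestamp - last_kept_at >= heartbeat
        if first or changed or expired:
            kept.append((timestamp, value))
            last_value = value
            last_kept_at = timestamp
    return kept


# One hour of per-second "1" values with a one-minute heartbeat.
samples = [(t, 1) for t in range(3600)]
kept = discard_unchanged_with_heartbeat(samples, heartbeat=60)
ratio = len(samples) / len(kept)  # 3600 / 60 = 60.0
```

A state change is still stored immediately, so problem detection is not delayed by the heartbeat interval.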
Collected data validation and error handling
None of us wants to collect erroneous or unreliable data. For example, we know that a temperature sensor should return data between 0°C and 100°C and any other value should be treated as erroneous and/or ignored.
Now this is possible with the data validation rules built into preprocessing: matching or not matching a regular expression, value ranges, and JSONPath and XPath checks.
We can also control the reaction to the error. If the temperature is out of range, we can simply ignore such a value, set a default value (for example, 0°C), or define our own error message, for example, "Sensor damaged" or "Replace battery."
A good use case for validation is to be able to check the input for the presence of an error message and set that error for the entire metric. This is a very useful feature when getting data from external APIs.
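The three reactions described above (ignore the value, substitute a default, or report a custom error) can be sketched in Python. This is an illustrative model only: `validate_range`, its `on_fail` parameter, and `ValidationError` are names we made up for the example, not part of Zabbix.

```python
class ValidationError(Exception):
    """Raised when a value fails validation and no fallback is configured."""


_RAISE = object()  # sentinel meaning "no custom handling, report the error"


def validate_range(raw, low, high, on_fail=_RAISE, error_message=None):
    """Check low <= value <= high and apply an error-handling policy.

    on_fail="discard" -> return None (the value is dropped)
    on_fail=<number>  -> return that number as the default value
    not given         -> raise ValidationError, optionally with a custom message
    """
    value = float(raw)
    if low <= value <= high:
        return value
    if on_fail == "discard":
        return None
    if on_fail is not _RAISE:
        return on_fail
    raise ValidationError(
        error_message or f"value {value} is outside [{low}, {high}]"
    )


ok = validate_range("42.5", 0, 100)                         # in range
dropped = validate_range("250", 0, 100, on_fail="discard")  # ignored
defaulted = validate_range("-40", 0, 100, on_fail=0.0)      # default value
try:
    validate_range("250", 0, 100, error_message="Sensor damaged")
    message = None
except ValidationError as exc:
    message = str(exc)                                      # custom error text
```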
Any data transformation with JavaScript
If the built-in preprocessing rules are not enough for us, now we offer complete freedom using arbitrary JavaScript scripts!
This opens up unlimited possibilities for processing incoming data. The practical benefit is that we no longer need the external scripts we used to rely on for data manipulation. Now all of this can be done with JavaScript.
Data transformation, aggregation, filters, arithmetic and logical operations and much more are now possible!
Testing preprocessing
Now we don't have to guess how our complex preprocessing scripts work. Preprocessing can now be conveniently tested directly from the interface!
We process millions of metrics per second!
Prior to Zabbix 4.2, preprocessing was done exclusively by the Zabbix server, which limited the ability to use a proxy for load balancing.
Starting with Zabbix 4.2, we get incredibly efficient load scaling due to preprocessing support on the proxy side. Now proxies are doing it!
In combination with throttling, this approach allows you to perform high-frequency, large-scale monitoring with millions of checks per second without loading the central Zabbix server. Proxies process huge amounts of data, while thanks to throttling only a small part of it, one or two orders of magnitude less, reaches the Zabbix server.
Easier low-level discovery
Recall that low-level discovery (LLD) is a very powerful mechanism for automatically discovering any kind of monitored resources (file systems, processes, applications, services, etc.) and automatically creating items, triggers, hosts, and other objects based on them. It saves an incredible amount of time, simplifies configuration, and allows you to use one template for hosts with different monitored resources.
Previously, low-level discovery required specially formatted JSON as input. Not anymore!
Zabbix 4.2 allows low-level discovery (LLD) to use free-form data in JSON format. Why is this important? It makes it possible, without resorting to scripts, to communicate with external APIs, for example, and to use the information received to automatically create hosts, items, and triggers.
Together with JavaScript support, this creates fantastic opportunities for building templates that work with various data sources, such as cloud APIs, application APIs, data in XML or CSV formats, and so on.
The possibilities are truly endless!
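To make this more concrete, here is a small Python sketch of the idea of mapping free-form JSON to discovery macros and then expanding a prototype. It only models the concept: the API response, the macro names, and the `discover` and `expand` helpers are assumptions for the example, and paths are given as plain key lists rather than real JSONPath expressions.

```python
import json

# Hypothetical response from some external API (not in any special format).
API_RESPONSE = json.dumps({
    "volumes": [
        {"id": "vol-1", "mount": {"path": "/"}, "fs": "ext4"},
        {"id": "vol-2", "mount": {"path": "/data"}, "fs": "xfs"},
    ]
})

# Which discovery macro takes its value from which path inside each object.
MACRO_PATHS = {
    "{#VOLUME_ID}": ["id"],
    "{#MOUNT}": ["mount", "path"],
    "{#FSTYPE}": ["fs"],
}


def extract(obj, path):
    """Walk a list of keys into nested dictionaries."""
    for key in path:
        obj = obj[key]
    return obj


def discover(raw_json, list_key, macro_paths):
    """Build one macro-to-value mapping per object found under `list_key`."""
    data = json.loads(raw_json)
    return [
        {macro: extract(entry, path) for macro, path in macro_paths.items()}
        for entry in data[list_key]
    ]


def expand(prototype, entity):
    """Substitute discovery macros in a prototype string."""
    for macro, value in entity.items():
        prototype = prototype.replace(macro, str(value))
    return prototype


entities = discover(API_RESPONSE, "volumes", MACRO_PATHS)
item_keys = [expand("vfs.fs.size[{#MOUNT},used]", e) for e in entities]
```

Each discovered entity yields its own item key, here `vfs.fs.size[/,used]` and `vfs.fs.size[/data,used]`, without any external script reshaping the API response first.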
TimescaleDB support
What is TimescaleDB? It is regular PostgreSQL plus an extension module from the TimescaleDB team. TimescaleDB promises better performance through more efficient algorithms and data structures.
Another advantage of TimescaleDB is the automatic partitioning of history tables. TimescaleDB means speed and ease of maintenance! That said, I should note that our team has not yet done a serious performance comparison with regular PostgreSQL.
At the moment, TimescaleDB is a fairly young and rapidly developing product. Use with caution!
Easy tag management
If earlier tags could only be managed at the trigger level, now tag management is much more flexible. Zabbix supports tags for templates and hosts!
All detected problems receive tags not only of the trigger, but also of the host, as well as templates of this host.
More flexible auto-registration
Zabbix 4.2 allows you to filter hosts by name using regular expressions. This makes it possible to create different discovery scenarios for different groups of hosts. It is especially convenient if we use complex device naming rules.
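Conceptually, this kind of filtering boils down to matching the host name against a list of patterns and picking the first rule that fits. The Python sketch below only illustrates that logic. The patterns, group names, and the `pick_group` function are hypothetical and do not reflect how auto-registration actions are stored in Zabbix.

```python
import re

# Hypothetical rules: the first matching pattern decides the host group.
RULES = [
    (re.compile(r"^db-\d+\.prod\."), "Production databases"),
    (re.compile(r"^web-\d+\."), "Web servers"),
]


def pick_group(host_name, rules=RULES, default=None):
    """Return the group of the first rule whose pattern matches the name."""
    for pattern, group in rules:
        if pattern.search(host_name):
            return group
    return default


db_group = pick_group("db-01.prod.example.com")
web_group = pick_group("web-7.example.com")
unknown = pick_group("printer-3")  # no rule matches, so the default is used
```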
More flexible network discovery
Another improvement is related to the naming of hosts. Now you can manage device names in network discovery and get the device name from the metric value.
This is a very useful feature, especially for network discovery with SNMP and Zabbix agent.
Checking the functionality of notification methods
Now, right from the web interface, you can send yourself a test message and check whether the notification method works. This functionality is especially useful for testing scripts that integrate Zabbix with various alert systems, task systems, and other external programs and APIs.
Remote monitoring of Zabbix infrastructure components
Now you can remotely monitor the internal metrics of Zabbix server and proxy (performance and health metrics of Zabbix components).
What is it for? This functionality lets you monitor the internal metrics of servers and proxies from the outside, so that problems can be detected and reported quickly even if the components themselves are overloaded or, for example, a large amount of unsent data has accumulated on a proxy.
HTML format support for email messages
Now we are not limited to plain text and can create beautiful e-mail messages thanks to the support of the HTML format. Time to learn HTML + CSS!
Access to external systems from network maps
A whole set of new macros is now supported in custom URLs for better integration of maps with external systems. This lets you open, for example, a ticket in the task system with one or two clicks on the host icon.
The discovery rule can be a dependent item
Why is this necessary, you ask? It allows the data of the master item to be used for both discovery and direct data collection. For example, when collecting data from a Prometheus exporter, Zabbix makes one HTTP request and immediately uses the received information for all dependent items: metric values and low-level discovery rules.
A new way to visualize issues on maps
Added support for animated GIF images on maps for more visible visualization of problems.
Extracting data from HTTP headers in Web Monitoring
Added the ability to select data from the received HTTP header in Web Monitoring.
This allows you to create multi-step web monitoring or third-party API monitoring scripts using the authorization token obtained in one of the steps.
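The token hand-off between steps can be pictured with a short Python sketch. It is only a model of the idea: the header names, the sample response, the `Bearer` scheme, and the `header_value` helper are assumptions for the example and say nothing about how a particular API or Zabbix itself formats these values.

```python
import re

# Hypothetical response headers returned by a login step.
LOGIN_HEADERS = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: application/json\r\n"
    "X-Auth-Token: abc123\r\n"
    "\r\n"
)


def header_value(raw_headers, name):
    """Return the value of the first header called `name`, or None."""
    pattern = re.compile(
        r"^" + re.escape(name) + r":[ \t]*(.*?)[ \t]*\r?$",
        re.IGNORECASE | re.MULTILINE,
    )
    match = pattern.search(raw_headers)
    return match.group(1) if match else None


# Step 1: pull the token out of the login response headers.
token = header_value(LOGIN_HEADERS, "X-Auth-Token")
# Step 2: reuse it in the headers of the next request.
next_step_headers = {"Authorization": "Bearer " + token}
```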
Zabbix Sender uses all IP addresses
Zabbix Sender now sends data to all IP addresses from the ServerActive parameter of the agent configuration file.
Convenient new filter in trigger configuration
The trigger configuration page now has an advanced filter for quick and convenient selection of triggers by the specified criteria.
Showing the exact time
Everything is simple here: Zabbix now shows the exact time when you hover over a graph with the mouse.
Other innovations
- Implemented a more predictable algorithm for changing the order of widgets in the dashboard
- Possibility of mass change of parameters of item prototypes
- IPv6 support for DNS checks: "net.dns" and "net.dns.record"
- Added "skip" parameter for "vmware.eventlog" checks
- Preprocessing step execution error includes step number
How do I upgrade?
To upgrade from earlier versions, you only need to install
We are hosting free webinars for those who want to learn more about Zabbix 4.2 and have the opportunity to ask questions to the Zabbix team.
Don't forget the popular
Source: habr.com