Real-time statistics and monitoring of PHP scripts. ClickHouse and Grafana come to the aid of Pinba

In this article I will show you how to use pinba with clickhouse and grafana instead of pinba_engine and pinboard.

On a php project, pinba is perhaps the only reliable way to understand what is happening with performance. True, pinba is usually introduced only when problems are already visible and it is not clear “where to dig”.

Often, no one has any idea how many times per second or per minute a particular script is called, so people start optimizing "by touch", beginning with the places that seem most likely.

Some analyze nginx logs, others analyze slow queries in the database.

Pinba would certainly help here, but there are several reasons why not every project has it.

The first reason is installation and setup.

To get any real benefit from introducing pinba, it is highly desirable to see metrics not only for the last few minutes but also over a long period (from days to months).

To do this, you need to:

  • install a php extension (and you might want an nginx module)
  • compile the extension for mysql
  • install pinboard and set up cron

Because there is so little information about pinba, many people have the impression that it only worked on php5 and is long gone, but as we will see below, this is not the case.

The first step is the easiest, all you need to do is run the command:

apt install php-pinba

The repositories carry this extension for php versions up to and including 7.3, so you don't need to compile anything.

After running the installation command, we immediately get a working extension that collects the metrics of each script (duration, memory, etc.) and sends them in protobuf format over udp to 127.0.0.1:30002.

Nobody catches and processes these udp packets yet, but this does not adversely affect the speed or stability of your php scripts.
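To check that the extension really is sending packets, a tiny listener is enough. The following is a minimal sketch, not part of any of the projects discussed here: the function names openUdpListener and receivePacket are made up for illustration, and it assumes nothing else is bound to the chosen address. It only receives raw bytes and does not decode the protobuf payload.

```php
<?php
// Minimal sketch: bind a UDP socket and read one datagram at a time.
// Function names are hypothetical; the payload is NOT decoded here.

/** Bind a UDP socket to "ip:port" and return the stream resource. */
function openUdpListener(string $address)
{
    $socket = stream_socket_server("udp://$address", $errno, $errstr, STREAM_SERVER_BIND);
    if ($socket === false) {
        throw new RuntimeException("Cannot bind $address: $errstr ($errno)");
    }
    return $socket;
}

/** Wait up to $timeoutSec for one datagram; return its raw bytes or null on timeout. */
function receivePacket($socket, int $timeoutSec = 1): ?string
{
    $read = [$socket];
    $write = null;
    $except = null;
    if (stream_select($read, $write, $except, $timeoutSec) < 1) {
        return null; // nothing arrived in time
    }
    $data = stream_socket_recvfrom($socket, 65535);
    return $data === false ? null : $data;
}
```

Calling openUdpListener('127.0.0.1:30002') and then receivePacket() in a loop while a php page is being requested should show one datagram per request, provided pinba.server points at that address.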

Until recently, the only application that could catch and process these udp packets was pinba_engine. The description of its "simple and concise" installation kills any desire to ever read it or dig into it again. The kilometer-long dependency lists mix package names, program names and links to separate pages describing their installation, and those pages link to further dependencies. No one has the time or the inclination to deal with this mess.

The pinba2 installation process is not much easier.

Perhaps someday it will be possible to install pinba10 with one or two commands and not read a lot of material to understand how to do it, but so far this is not the case.

If you did install pinba_engine, that is only half the battle. Without pinboard you are limited to just the last few minutes of data, or you have to aggregate, store and visualize the data yourself. Fortunately, pinboard is fairly easy to install.

It would seem: why suffer like this if all the metrics from php already go to a udp port in protobuf format, and all you need is an application that catches them and puts them into some kind of storage? Apparently, the developers who had this idea immediately sat down to reinvent the wheel, and some of the results ended up on github.

What follows is an overview of four open-source projects that save metrics to a storage backend from which the data is easy to retrieve and visualize, for example with grafana.

olegfedoseev/pinba-server (November 2017)

udp server in go that stores metrics in OpenTSDB. If OpenTSDB is already used on your project, this solution may suit you; otherwise I recommend passing it by.

olegfedoseev/pinba-influxdb (June 2018)

udp server in go, by the same Habr user, which this time stores the metrics in InfluxDB. Many projects already use InfluxDB for monitoring, so this solution can be perfect for them.

Pros:

  • InfluxDB allows you to aggregate the received metrics and delete the originals after a specified time.

Cons:

ClickHouse-Ninja/Proton (January 2019)

udp server in go that saves metrics in ClickHouse. This is my friend's solution. It was after getting acquainted with it that I decided it was time to take on pinba and clickhouse.

Pros:

  • clickhouse is ideal for such tasks, it allows you to compress the data so much that you can store all the raw data even without aggregations
  • if required, you can easily aggregate the received metrics
  • ready-made template for grafana
  • saves timer information

Cons:

  • fatal flaw
  • there is no config in which you could set the name of the database and tables, the address and port of the server.
  • when saving raw data, an auxiliary dictionary table is used to store page and domain addresses, which complicates queries later
  • other little things that follow from the first minus

pinba-server/pinba-server (April 2019)

udp server in php that saves metrics in ClickHouse. This is my solution, which is the result of getting familiar with pinba, ClickHouse and protobuf. While I was working through this whole stack, I wrote a “proof of concept” which, unexpectedly for me, did not consume significant resources (30 MB of RAM and less than 1% of one of the eight processor cores), so I decided to share it with the public.

Pros: the same as the previous solution, plus I kept the familiar field names from the original pinba_engine. I also added a config that lets you run several pinba-server instances at once and save metrics to different tables, which is useful if you want to collect data not only from php but also from nginx.
Cons: the "fatal flaw" and whatever small things do not suit you personally. However, my solution is dead simple and consists of only about 100 lines of code, so any php developer can change what they dislike in a couple of minutes.

Principle of operation

The server listens on udp port 30002. All incoming packets are decoded according to the protobuf schema and accumulated. Once a minute, the accumulated batch is inserted into the clickhouse table pinba.requests. (All of these parameters are set in the config.)
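The "collect, then insert once a minute" idea can be sketched as follows. This is an illustration of the principle only, not the actual pinba-server code: the class name BatchBuffer and its methods are hypothetical, and the flush callback stands in for whatever performs the real batch insert.

```php
<?php
// Sketch of interval-based batching. BatchBuffer is a made-up name;
// the flush callback is a placeholder for the real clickhouse insert.

final class BatchBuffer
{
    private $rows = [];
    private $lastFlush;
    private $interval;
    private $flush;

    public function __construct(int $intervalSec, callable $flush, int $now)
    {
        $this->interval = $intervalSec;
        $this->flush = $flush;
        $this->lastFlush = $now;
    }

    /** Store one decoded request; flush if the interval has elapsed. */
    public function add(array $row, int $now): void
    {
        $this->rows[] = $row;
        if ($now - $this->lastFlush >= $this->interval) {
            $this->flushNow($now);
        }
    }

    /** Hand all buffered rows to the callback in one batch and reset the buffer. */
    public function flushNow(int $now): void
    {
        if ($this->rows !== []) {
            ($this->flush)($this->rows);
            $this->rows = [];
        }
        $this->lastFlush = $now;
    }
}
```

Passing the current time in explicitly keeps the class easy to test. Note that in this sketch a flush is only triggered when a packet arrives, so a real server would also want a timer that calls flushNow() during quiet periods.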

A little about clickhouse

Clickhouse supports different storage engines. The most commonly used is MergeTree.

If at some point you decide to keep aggregated data for all time and raw data only for a recent period, you can create a materialized view with grouping and periodically clean the main pinba.requests table; all data will remain in the materialized view. Moreover, when creating the pinba.requests table you can specify “engine = Null”. The raw data is then not saved to disk at all, yet it still flows into the materialized view and is stored in aggregated form. I use this scheme for nginx metrics, because I have 50 times more requests on nginx than on php.

So, you have come a long way and I would not like to leave you halfway. What follows is a detailed description of the installation and configuration of my solution and everything it needs, as well as the pitfalls on which more than one ship has run aground. The entire installation process is described for Ubuntu 18.04 LTS and Centos 7; on other distributions and versions the process may differ slightly.
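To make the Null engine plus materialized view scheme more concrete, here is a sketch that builds the two DDL statements as strings. Everything specific in it is an assumption made for illustration: the three columns, the view name pinba.requests_by_minute and the choice of SummingMergeTree are not taken from the pinba-server schema, which has more fields.

```php
<?php
// Sketch only: column names, the view name and the aggregate engine are
// illustrative assumptions, not the schema shipped with pinba-server.

/** DDL for the raw table: MergeTree keeps rows on disk, Null discards them. */
function rawTableDdl(bool $keepRawData): string
{
    $engine = $keepRawData
        ? 'MergeTree() PARTITION BY toYYYYMM(timestamp) ORDER BY (script_name, timestamp)'
        : 'Null';

    return "CREATE TABLE IF NOT EXISTS pinba.requests (
    timestamp DateTime,
    script_name String,
    req_time Float32
) ENGINE = $engine";
}

/** DDL for a per-minute aggregate that is filled on every insert into the raw table. */
function aggregateViewDdl(): string
{
    return "CREATE MATERIALIZED VIEW IF NOT EXISTS pinba.requests_by_minute
ENGINE = SummingMergeTree()
ORDER BY (minute, script_name)
AS SELECT
    toStartOfMinute(timestamp) AS minute,
    script_name,
    count() AS req_count,
    sum(req_time) AS req_time_total
FROM pinba.requests
GROUP BY minute, script_name";
}
```

The statements can be executed with clickhouse-client or any other client. Because SummingMergeTree collapses rows only during background merges, queries against such a view should still use sum() with GROUP BY.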

Installation

I put all the necessary commands in a Dockerfile to make the instructions easy to reproduce. Only the pitfalls are described below.

php pinba

After installation, make sure that all options are uncommented in the /etc/php/7.2/fpm/conf.d/20-pinba.ini file. In some distributions (e.g. centos) they may be commented out.

extension=pinba.so
pinba.enabled=1
pinba.server=127.0.0.1:30002

clickhouse

During installation, clickhouse will ask you to set a password for the default user. By default, this user can connect from any IP address, so if you do not have a firewall on the server, be sure to set a password. This can also be done after installation in the /etc/clickhouse-server/users.xml file.

It is also worth noting that clickhouse uses several ports, including 9000. This port is also used for php-fpm in some distributions (for example, centos). If you already use this port, then you can change it to another one in the /etc/clickhouse-server/config.xml file.

grafana with clickhouse plugin

After installing grafana, use the login admin and password admin. When you first log in, Grafana will ask you to set a new password.

Next, go to the menu "+" -> import and enter the dashboard ID 10011. I prepared and uploaded this dashboard so that you do not have to build it yourself.

Grafana supports clickhouse through a third-party plugin, but grafana does not provide alerts for third-party plugins (the ticket for this has been open for several years).

pinba-server

Installing protobuf and libevent is optional, but it improves pinba-server performance. If you install pinba-server in a folder other than /opt, you will also need to adjust the systemd unit file.

pinba module under nginx

To compile the module, you need the sources of the same nginx version that is already installed on your server, as well as the same compilation options. Otherwise the build will succeed, but when you load the module you will get an error saying that "the module is not binary compatible". You can view the compilation options with the nginx -V command.

Life hacks

All my sites work only on https. The schema field becomes meaningless, so I use it to separate the web/console.

In scripts that are available from the web, I use:

if (ini_get('pinba.enabled')) {
    pinba_schema_set('web');
}

And in console scripts (for example, cron jobs):

if (ini_get('pinba.enabled')) {
    pinba_schema_set('console');
}

My grafana dashboard has a web/console switch to view statistics separately.
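If web and console scripts share a common bootstrap file, the two snippets above can be merged into one by looking at the PHP_SAPI constant. This is an alternative sketch rather than what I described above; the helper name pinbaSchemaFor is hypothetical, and it assumes that every non-cli SAPI (php-fpm, for example) should count as web.

```php
<?php
// Sketch: pick the schema value from the SAPI name.
// pinbaSchemaFor is a made-up helper; any SAPI other than "cli" is treated as web.

function pinbaSchemaFor(string $sapi): string
{
    return $sapi === 'cli' ? 'console' : 'web';
}

// Guarded so the file also loads on machines without the pinba extension.
if (function_exists('pinba_schema_set') && ini_get('pinba.enabled')) {
    pinba_schema_set(pinbaSchemaFor(PHP_SAPI));
}
```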

You can also send your own tags to pinba, for example:

pinba_tag_set('country', $countryCode);
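Besides request tags, the pinba extension also provides timers (pinba_timer_start and pinba_timer_stop) for measuring individual parts of a script. A minimal sketch of a wrapper follows; the helper name withPinbaTimer and the tag values in the usage note are made up for illustration, and the wrapper simply runs the callback untimed when the extension is not loaded.

```php
<?php
// Sketch: time an arbitrary piece of work with a pinba timer.
// withPinbaTimer is a hypothetical helper, not part of the pinba extension.

function withPinbaTimer(array $tags, callable $work)
{
    // Without the extension the work still runs, just untimed.
    $timer = function_exists('pinba_timer_start') ? pinba_timer_start($tags) : null;
    try {
        return $work();
    } finally {
        if ($timer !== null) {
            pinba_timer_stop($timer);
        }
    }
}
```

For example, withPinbaTimer(['group' => 'db', 'operation' => 'select'], $callback) would attach a timer with those two tags to the current request.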

That's all.

Please take a moment to answer the polls below the article.

As usual, I should warn that I do not give advice or help via Habr private messages or social networks.

Create a ticket on github instead.

Also, please support the English version of this article on reddit with upvotes.

What OS are you using on the server?

  • Ubuntu

  • CentOS

  • Debian

  • Gentoo

  • Red Hat

  • Fedora

  • OpenSUSE

  • SuSE

  • Unix

  • Windows

  • other

114 users voted. 11 users abstained.

What version of php are you using on the server?

  • 7.3

  • 7.2

  • 7.1

  • 7.0

  • 5

  • other

105 users voted. 17 users abstained.

Have you ever used pinba?

  • Yes

  • no, but would like to

  • no and don't want to

  • no, never heard of it

100 users voted. 14 users abstained.

Which version of the pinba server would you like to try?

  • pinba_engine (mysql engine)

  • pinba2 (mysql engine)

  • pinboard (php + mysql)

  • olegfedoseev/pinba-server (go + OpenTSDB)

  • olegfedoseev/pinba-influxdb (go + influxdb)

  • ClickHouse-Ninja/Proton (go + clickhouse)

  • pinba-server/pinba-server (php + clickhouse)

  • write my own

  • other

39 users voted. 47 users abstained.

Source: habr.com