DIY: how we automate warehouse monitoring

X5 manages 43 distribution centers and 4,029 of its own trucks, ensuring an uninterrupted supply of products to its stores. In this article I will share my experience of building an interactive warehouse event monitoring system from scratch. The information will be useful for logisticians at retail companies that run several dozen distribution centers handling a wide range of products.


As a rule, building a system for monitoring and managing business processes starts with the handling of messages and incidents. This skips an important technological step: automating the detection of the business events themselves and the registration of incidents. Most business systems of the WMS, TMS and similar classes have built-in tools for monitoring their own processes, but when the systems come from different vendors, or the monitoring functionality is not mature enough, you have to order expensive customizations or bring in specialized consultants for additional configuration.

Let's consider an approach that requires only a small amount of consulting, limited to identifying the sources (tables) from which the indicators are read from the system.

A specific feature of our warehouses is that several warehouse management systems (WMS Exceed) operate within a single logistics complex. The warehouses are separated by product storage category (dry, alcohol, frozen, etc.) not only logically: within one logistics complex there are several separate warehouse buildings, and operations in each of them are controlled by its own WMS.


To form an overall picture of the processes taking place in the warehouse, managers analyze the reports of each WMS several times a day, process the messages of the warehouse operators (receivers, pickers, stackers) and consolidate the actual operational indicators for display on the information board.

To save the managers' time, we decided to develop an inexpensive system for operational control of warehouse events. In addition to displaying the "hot" indicators of warehouse processes, the new system should also help managers register incidents and track the tasks aimed at eliminating the causes that affect those indicators. After a general audit of the company's IT architecture, we realized that individual parts of the required system already exist in our landscape in one form or another, and that both configuration expertise and the necessary support services are available for them. All that remained was to bring the whole concept together into a single architectural solution and estimate the scope of development.

After assessing the amount of work required to build the new system, we decided to break the project into several stages:

  1. Collection of indicators for warehouse processes, visualization and control of indicators and deviations.
  2. Automation of process regulations and registration of requests for deviations in the business service desk system.
  3. Proactive monitoring with load forecasting and recommendations for managers.

At the first stage, the system must collect prepared slices of operational data from all WMS instances of the complex. Reading takes place in near real time (at intervals of less than 5 minutes). The catch is that, once the system is rolled out to the entire network, the data has to be obtained from the DBMSs of several dozen warehouses. The collected operational data is then processed by the system core to calculate deviations from the planned indicators and to compute statistics. The processed data must be displayed on the manager's tablet or on the warehouse information board in the form of clear graphs and diagrams.


When choosing a suitable system for the pilot implementation of the first stage, we settled on Zabbix. This system is already used to monitor the IT performance of the warehouse systems. By adding a separate installation to collect warehouse business metrics, we can get an overall picture of warehouse health.

The resulting general architecture of the system is shown in the figure.


Each WMS instance is defined as a host in the monitoring system. Metrics are collected by a central server in the data center network by running a script with a prepared SQL query. If you need to monitor a system for which direct database access is not recommended (for example, SAP EWM), you can use script calls to documented API functions to obtain the indicators, or write a simple Python/VBScript program.

A Zabbix proxy instance is deployed in the warehouse network to offload the main server. The proxy handles all local WMS instances: when the Zabbix server next requests the parameters, a script is executed on the Zabbix proxy host that queries the metrics from the WMS database.
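For reference, a minimal sketch of the proxy-side configuration (the server address and proxy name are illustrative); the Hostname value must match the proxy name that is later created in the server's web interface:

# /etc/zabbix/zabbix_proxy.conf (fragment, illustrative values)
ProxyMode=0                       # 0 - active proxy, pushes collected data to the server
Server=zabbix.dc.example.local    # central Zabbix server in the data center
Hostname=zbx-proxy-wh1            # proxy name as registered on the Zabbix server
ConfigFrequency=300               # how often the proxy reloads its configuration from the server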

Grafana is deployed on the central Zabbix server to display graphs and warehouse indicators. In addition to showing prepared dashboards with warehouse operation infographics, Grafana will be used to monitor deviations of the indicators and to forward automatic alerts to the warehouse service desk system for handling business incidents.

As an example, let's consider the implementation of workload control for the warehouse receiving area. The following were chosen as the main indicators of the processes in this section of the warehouse:

  • the number of vehicles in the receiving area, broken down by status (planned, arrived, documents, unloading, departed);
  • the workload of the putaway and replenishment areas (by storage conditions).

Configuration

Installation and configuration of the main components of the system (SQLcl, Zabbix, Grafana) are described in many other sources, so we will not repeat them here. SQLcl (the Oracle DBMS command-line interface written in Java) is used instead of SQL*Plus because it does not require an additional Oracle Client installation and works out of the box.

I will describe the main points to pay attention to when using Zabbix to monitor warehouse business process indicators, and one possible way of implementing them. Note that this post is not about security: the security of the connections and of the methods presented here needs additional work when moving the pilot solution into production.

The main point is that such a system can be implemented without programming, using the configuration options the system already provides.

The Zabbix monitoring system offers several ways to collect metrics from the monitored system: either by polling the monitored hosts directly, or by the more advanced method of pushing data to the server via zabbix_sender on the host, including methods for setting low-level discovery parameters. For our task, direct polling of the hosts by the central server is quite sufficient, because it gives full control over the order in which the metrics are obtained and lets us keep a single package of settings and scripts without having to distribute them to every monitored host.
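We did not take the push route, but for comparison, sending a value from a monitored host with zabbix_sender looks roughly like this (the server address, host name and item key are illustrative):

zabbix_sender -z zabbix.dc.example.local -s "WH1_1" -k wms.cars.arrived -o 2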

As "guinea pigs" for debugging and configuring the system, we use WMS worksheets to manage acceptance:

  1. Vehicles on acceptance, all that have arrived: All vehicles with statuses for the period "- 72 hours from the current time" - identifier of the SQL query: getCars.
  2. History of all vehicle statuses: Statuses of all vehicles with arrival within 72 hours — SQL query identifier: carsHistory.
  3. Scheduled vehicles for acceptance: Statuses of all vehicles with arrival in the "Scheduled" status, time interval "- 24 hours" and "+24 hours" from the current time - SQL query identifier: carsIn.

So, after deciding on the set of warehouse operation metrics, we prepare SQL queries against the WMS database. It is advisable to run the queries not against the main database, but against its "hot" copy, the standby.
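As an illustration, a simplified sketch of what the getCars query file might contain; the table and column names here are hypothetical and must be adapted to the actual WMS schema:

# cat /etc/zabbix/sql/sqlScript_getCars.sql
-- vehicles that arrived within the last 72 hours, grouped by status
-- (TRAILER / STATUS / ARRIVAL_DATE are placeholder names)
select status, count(*)
  from trailer
 where arrival_date > sysdate - 3
 group by status
 order by status;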

We connect to the standby Oracle DBMS to get the data; the IP address of the test database is 192.168.1.106. The connection parameters are saved on the Zabbix server in TNSNames.ORA in the SQLcl working folder:

# cat  /opt/sqlcl/bin/TNSNames.ORA
WH1_1=
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.1.106)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME =  WH1_1)
    )
  )

This allows us to run SQL queries against each host by specifying only the login/password and the TNS alias of the database:

# sql znew/Zabmon1@WH1_1

Prepared SQL queries are saved in the working folder on the Zabbix server:

/etc/zabbix/sql

and made accessible to the zabbix user of our server:

# chown zabbix:zabbix -R /etc/zabbix/sql

Each query file is given a unique identifier-name by which the Zabbix server refers to it. Each query to the database via SQLcl returns several parameters at once. Since Zabbix can handle only one metric per request, we use additional scripts to parse the query results into separate metrics.

We prepare the main script, let's call it wh_Metrics.sh, which runs an SQL query against the database, saves the result and returns a technical metric indicating whether the data was obtained successfully:

#!/bin/sh
## set up the environment
export ORACLE_HOME=/usr/lib/oracle/11.2/client64
export PATH=$PATH:$ORACLE_HOME/bin
export LD_LIBRARY_PATH=$ORACLE_HOME/lib:/usr/lib64:/usr/lib:$ORACLE_HOME/bin
export TNS_ADMIN=$ORACLE_HOME/network/admin
export JAVA_HOME=/
sql="/opt/sqlcl/bin/sql"
## path to the file with the SQL query, parameterized by the query name
scriptLocation=/etc/zabbix/sql
sqlFile=$scriptLocation/sqlScript_"$2".sql
## path to the file where the result is stored
resultFile=/etc/zabbix/sql/mon_"$1"_main.log
## database connection parameters
username="$3"
password="$4"
tnsname="$1"
## run the query against the database
var=$($sql -s $username/$password@$tnsname < $sqlFile)
## format the query result and write it to the file
echo $var | cut -f5-18 -d " " > $resultFile
## check for errors (ORA-xxxxx in the output)
if grep -qi "ora-" "$resultFile"; then
    echo null > $resultFile
    echo 0
else
    echo 1
fi

We place the finished script in the external scripts folder in accordance with the Zabbix proxy configuration (by default, /usr/local/share/zabbix/externalscripts).
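Assuming the default paths, the relevant fragment of the proxy configuration and the script permissions look like this:

# grep ExternalScripts /etc/zabbix/zabbix_proxy.conf
ExternalScripts=/usr/local/share/zabbix/externalscripts
# cp wh_Metrics.sh getMetrica.sh /usr/local/share/zabbix/externalscripts/
# chown zabbix:zabbix /usr/local/share/zabbix/externalscripts/*.sh
# chmod 750 /usr/local/share/zabbix/externalscripts/*.sh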

The identifier of the database from which the script obtains its results is passed as a script parameter. This identifier must match the alias defined in the TNSNames.ORA file.

The result of the SQL query is saved in a file of the form mon_base_id_main.log, where base_id is the database identifier received as a script parameter. Splitting the result files by database identifier covers the case when the server queries several databases at the same time. The query returns a sorted two-dimensional array of values.
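For example, for the getCars query the result file might look like this (the status codes are purely illustrative):

# cat /etc/zabbix/sql/mon_WH1_1_main.log
PLAN 4 ARRIVED 2 DOCS 1 UNLOAD 3 DEPART 5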

The next script, let's call it getMetrica.sh, extracts a given metric from the file with the query result:

#!/bin/bash
## determine the name of the file with the query result
resultFile=/etc/zabbix/sql/mon_"$1"_main.log
## parse the array of result values in the script itself:
## when working with statuses, the query returns a two-dimensional array (RSLT) of the form
## {status1 value1 status2 value2 ...} separated by spaces (the IFS value)
## the status code is passed as a parameter and the script returns its value
IFS=' '
str=$(cat $resultFile)
status_id=null
read -ra RSLT <<< "$str"
for i in "${RSLT[@]}"; do
    if [[ "$status_id" == null ]]; then
        status_id="$i"
    elif [[ "$status_id" == "$2" ]]; then
        echo "$i"
        break
    else
        status_id=null
    fi
done
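Before wiring the scripts into Zabbix items, they can be checked manually; the database name, credentials and the ARRIVED status code below are the illustrative values used earlier:

# cd /usr/local/share/zabbix/externalscripts
# ./wh_Metrics.sh WH1_1 getCars znew Zabmon1
1
# ./getMetrica.sh WH1_1 ARRIVED
2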

Now we are ready to set up Zabbix and start monitoring the indicators of the warehouse receiving processes.

Each database node has a Zabbix agent installed and configured.

On the main server, we define all the Zabbix proxy servers. The settings are found at:

Administration → Proxy → Create proxy


Define the monitored hosts:

Setup → Hosts → Create Host


The hostname must match the hostname specified in the agent's configuration file.

Specify the host group, as well as the IP address or DNS name of the database host.

Create metrics and specify their properties:

Setup → Hosts → 'host name' → Items → Create item

1) Create a main metric to query all parameters from the database


Set the name of the data item and specify the type "External check". In the "Key" field, we specify the script and pass it the name of the Oracle database, the name of the SQL query, and the login and password for connecting to the database as parameters. Set the update interval to 5 minutes (300 seconds).
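With this setup, the key of the main metric looks roughly like this (standard Zabbix external-check key syntax; the database name and credentials are those from the example above):

wh_Metrics.sh["WH1_1","getCars","znew","Zabmon1"]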

2) Create additional metrics for each vehicle status. The values of these metrics will be derived from the result of the main metric's check.


Set the name of the data item and specify the type "External check". In the "Key" field, we specify the script and pass it the name of the Oracle database and the status code whose value we want to track as parameters. We set the update interval 10 seconds longer than that of the main metric (310 seconds) so that the results have time to be written to the file.
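The key of such a status metric might look like this (the ARRIVED status code is illustrative):

getMetrica.sh["WH1_1","ARRIVED"]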

For the metrics to be collected correctly, the order in which the checks are activated matters. To avoid conflicts when receiving data, we first activate the main metric GetCarsByStatus, which calls the wh_Metrics.sh script.

Setup → Hosts → 'host name' → Items → Subfilter "External checks". Mark the required check and click "Activate".


Next, we activate the remaining metrics in one operation, selecting them all together:


Now Zabbix has started collecting warehouse business metrics.

In the following articles, we will take a closer look at connecting Grafana and building information dashboards of warehouse operations for different categories of users. Grafana is also used to monitor deviations in warehouse operations and, depending on the thresholds and frequency of the deviations, to register incidents in the warehouse service desk system via its API or simply send notifications to the manager by e-mail.


Source: habr.com
