This article describes how, when implementing a WMS system, we faced the need to solve a non-standard clustering problem, and which algorithms we used to solve it. We will explain how we applied a systematic, scientific approach to the problem, what difficulties we encountered, and what lessons we learned.
This publication opens a series of articles in which we share our experience of implementing optimization algorithms in warehouse processes. The goal of the series is to acquaint the audience with the types of warehouse-operation optimization problems that arise in almost any medium or large warehouse, to describe our experience in solving such problems, and to point out the pitfalls encountered along the way. The articles will be useful to those who work in warehouse logistics or implement WMS systems, as well as to programmers interested in applications of mathematics in business and in optimizing enterprise processes.
Process bottleneck
In 2018, we carried out a project to implement a WMS system in the warehouse of the Trading House LD company in Chelyabinsk. We deployed the product "1C-Logistics: Warehouse Management 3" for 20 workstations: WMS operators, storekeepers, and forklift drivers. The warehouse is of average size, about 4,000 m2, with 5,000 cells and 4,500 SKUs. The warehouse stores ball valves of the company's own production, in various sizes from 1 kg to 400 kg. Stock in the warehouse is tracked by batch, since goods must be picked according to FIFO.
When designing schemes for automating the warehouse processes, we ran into an existing problem of non-optimal inventory storage. The specifics of storing and stacking the valves are such that a piece-storage cell may hold items of only one batch. Products arrive at the warehouse daily, and each arrival is a separate batch. Over one month of warehouse operation, about 30 separate batches are created, and each must be stored in a separate cell. Goods are often picked not in whole pallets but in pieces, and as a result, in the piece-picking zone many cells show the following picture: a cell with a volume of more than 1 m3 holds a few valves occupying less than 5-10% of the cell's volume.
Fig 1. Photo of several items in a cell
The non-optimal use of storage capacity is evident. To convey the scale of the disaster, I can cite figures: in various periods of warehouse operation, there were on average from 100 to 300 such cells with a volume of more than 1 m3 holding "negligible" remnants. Since the warehouse is relatively small, during peak seasons this factor becomes a bottleneck and greatly slows down the warehouse processes.
Problem solving idea
An idea arose: to merge the batches of remnants with the closest dates into one unified batch, and to place the remnants under such a unified batch compactly together in one cell, or in several cells if one cell does not have enough space to hold the entire quantity.
Fig.2. Cell Residue Compression Scheme
This significantly reduces the occupied storage space, which can then be used for newly placed goods. When warehouse capacity is overloaded, such a measure is extremely necessary; otherwise there may simply not be enough free space for placing new goods, which would halt the warehouse processes of placement and replenishment. Before the WMS implementation, this operation was performed manually, which was inefficient, since searching for suitable remnants in the cells took quite a long time. With the introduction of the WMS system, we decided to automate the process, speed it up, and make it intelligent.
The process of solving this problem is divided into 2 stages:
- at the first stage, we find groups of batches with close dates to compress;
- at the second stage, for each group of batches, we compute the most compact placement of the product remnants in the cells.
In the current article, we will focus on the first stage of the algorithm, and leave the coverage of the second stage for the next article.
Search for a mathematical model of the problem
Before sitting down to write code and reinvent our own wheel, we decided to approach the problem scientifically, namely: formulate it mathematically, reduce it to a well-known discrete optimization problem, and apply effective existing algorithms, or take those existing algorithms as a basis and modify them to the specifics of the practical problem being solved.
Since it clearly follows from the business statement of the problem that we are dealing with sets, we will formulate such a problem in terms of set theory.
Let S be the set of all batches of remnants of some good in the warehouse. Let K be a given constant number of days. Let D be a subset of batches of S in which the date difference for every pair of batches does not exceed the constant K. It is required to find the minimum number of non-overlapping subsets D such that together they give the entire set S.
In other words, we need to find groups, or clusters, of similar batches, where the similarity criterion is determined by the constant K. This task is reminiscent of the well-known clustering problem. It is important to note that our problem differs from the clustering problem in that it has a hard constraint on the similarity criterion of the cluster elements, determined by the constant K, while the clustering problem has no such constraint. The statement of the clustering problem and background on it can be found in the literature.
So, we managed to formulate the problem and find a classical problem with a similar statement. Now it was necessary to review well-known algorithms for solving it, so as not to reinvent the wheel but to take the best practices and apply them. For the clustering problem we considered the most popular algorithms, namely: k-means, c-means, the connected-components algorithm, and the minimum spanning tree algorithm. Descriptions and analyses of such algorithms can be found in the literature.
For our problem, the k-means and c-means clustering algorithms are not applicable at all, since the number of clusters is never known in advance, and such algorithms do not account for the constraint imposed by the constant K of days. These algorithms were discarded from consideration at the outset.
The connected-components algorithm and the minimum spanning tree algorithm are better suited to our problem, but, as it turned out, they cannot be applied "head-on" to the problem at hand and still produce a good solution. To clarify this, let us consider the logic of these algorithms as applied to our problem.
Consider a graph G in which the vertices are the batches of the set S, and the edge between vertices i and j has a weight equal to the difference in days between batches i and j. In the connected-components algorithm, an input parameter R with R <= K is set, and all edges with weight greater than R are removed from the graph. Only the closest pairs of objects remain connected. The goal of the algorithm is to find a value of R at which the graph "falls apart" into several connected components, where the batches belonging to each component satisfy our similarity criterion determined by the constant K. The resulting components are the clusters.
The minimum spanning tree algorithm first builds a minimum spanning tree on the graph G, and then sequentially removes the edges with the largest weights until the graph "falls apart" into several connected components, where the batches belonging to each component again satisfy our similarity criterion. The resulting components are the clusters.
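The connected-components idea can be sketched as follows. This is a minimal illustration in Python rather than the 1C used in the project; the union-find helper, the function name, and the sample dates are all illustrative assumptions, not the project's actual code:

```python
from datetime import date
from itertools import combinations

def cluster_by_threshold(batch_dates, r_days):
    """Group batches into connected components of the graph whose
    edges link every pair of batches at most r_days apart."""
    n = len(batch_dates)
    parent = list(range(n))            # union-find forest over batch indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # keep only the edges whose weight (date difference) is <= r_days
    for i, j in combinations(range(n), 2):
        if abs((batch_dates[i] - batch_dates[j]).days) <= r_days:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())

# hypothetical batch dates: two close batches and one distant batch
dates = [date(2018, 1, 1), date(2018, 1, 10), date(2018, 2, 20)]
print(cluster_by_threshold(dates, 20))  # batches 0 and 1 fall into one component
```

As the article notes below, such threshold-based clustering applied "head-on" can leave clusters that could still be merged, which is exactly why the approach had to be reconsidered.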
When using such algorithms to solve the problem under consideration, a situation may arise as in Figure 3.
Fig 3. Application of clustering algorithms to the problem being solved
Suppose the constant K of the allowed date difference between batches equals 20 days. The graph G is drawn in spatial form for ease of visual perception. Both algorithms produced a solution with 3 clusters, which can easily be improved by merging the batches placed in separate clusters! Obviously, such algorithms must be modified for the specifics of the problem being solved; applying them in their pure form would give poor results.
So, before starting to write graph algorithms modified for our task and inventing our own bicycle (in whose silhouette the outlines of square wheels could already be guessed), we again decided to approach the task scientifically, namely: to try to reduce it to another discrete optimization problem, in the hope that existing algorithms for that problem could be applied without modification.
Another search for a similar classical problem was crowned with success! We managed to find a discrete optimization problem whose statement coincides exactly with ours. It turned out to be the set covering problem. Let us state it in terms of our specifics.
There is a finite set S and a family F of its subsets of batches such that the date difference for every pair of batches in each subset of the family does not exceed the constant K. A cover is a subfamily C of F of minimum cardinality, such that the union of the sets in C gives the set S of all batches.
A detailed analysis of this problem can be found in the literature.
Algorithm for solving the problem
We have settled on a mathematical model of the problem. Now let us consider an algorithm for solving it. The subsets of the family F can easily be found by the following procedure.
- Order the batches of the set S by their dates.
- Find the minimum and maximum batch dates.
- For every day d from the minimum date to the maximum, find all batches whose dates differ from d by no more than K/2 (so it is better to take K as an even number).
The logic of this procedure for forming the family of subsets for a given K is shown in Figure 4.
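The procedure above can be sketched in Python (the project's code is in 1C; the function name, the two-pointer window bookkeeping, and the sample dates are illustrative assumptions):

```python
from datetime import date, timedelta

def build_candidate_subsets(batch_dates, k_days):
    """For every day d between the minimum and maximum batch dates,
    collect the batches whose dates differ from d by at most k_days/2.
    Batches are pre-sorted, so each day's window is a contiguous slice
    that two pointers can maintain without rescanning the whole array."""
    order = sorted(range(len(batch_dates)), key=lambda i: batch_dates[i])
    half = timedelta(days=k_days // 2)   # k_days is assumed even
    lo = hi = 0
    family = set()                        # dedupe identical windows
    d = min(batch_dates)
    while d <= max(batch_dates):
        while lo < len(order) and batch_dates[order[lo]] < d - half:
            lo += 1                       # drop batches too far in the past
        hi = max(hi, lo)
        while hi < len(order) and batch_dates[order[hi]] <= d + half:
            hi += 1                       # extend to batches inside the window
        if lo < hi:
            family.add(tuple(order[lo:hi]))
        d += timedelta(days=1)
    return [list(s) for s in family]

# hypothetical data: two close batches and one distant batch, K = 10 days
dates = [date(2018, 1, 1), date(2018, 1, 5), date(2018, 2, 1)]
print(build_candidate_subsets(dates, 10))
```

Every pair of batches inside a window [d - K/2, d + K/2] differs by at most K days, so each emitted subset is a valid member of the family F.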
Fig.4. Formation of subsets of parties
In this procedure, it is not necessary for each day d to iterate over all batches and check their date differences: it suffices to move left or right from the current position until a batch is found whose date differs from d by more than half of the constant K. All subsequent elements, when moving further right or left, are of no interest, since for them the difference in days only grows, the array having been ordered in advance. This approach saves significant time when the number of batches and the spread of their dates are large.
The set covering problem is NP-hard, which means there is no fast (running in time polynomial in the input size) exact algorithm for solving it. Therefore, to solve the set covering problem we chose a fast greedy algorithm, which is of course not exact but has the following advantages:
- For problems of small dimension (which is exactly our case), it computes solutions close enough to the optimum. As the problem size grows, the quality of the solution deteriorates, but rather slowly;
- It is very easy to implement;
- It is fast: its running time is bounded by a low-order polynomial in the sizes of S and F.
The greedy algorithm chooses sets according to the following rule: at each step, it picks the set that covers the maximum number of elements not yet covered. A detailed description of the algorithm and its pseudocode can be found in the literature.
We did not compare the accuracy of this greedy algorithm on our problem's test data against other known algorithms, such as the probabilistic greedy algorithm or the ant colony algorithm. Results of comparing such algorithms on randomly generated data can be found in the literature.
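The greedy rule can be sketched as follows. This is a minimal Python illustration, not the production 1C code; the function name and the toy data are assumptions:

```python
def greedy_set_cover(universe, subsets):
    """Greedy set cover: repeatedly pick the subset that covers the
    largest number of still-uncovered elements."""
    uncovered = set(universe)
    cover = []
    while uncovered:
        # the subset with the maximum gain over the uncovered elements
        best = max(subsets, key=lambda s: len(uncovered & set(s)))
        gain = uncovered & set(best)
        if not gain:                      # remaining elements are uncoverable
            raise ValueError("subsets do not cover the universe")
        cover.append(best)
        uncovered -= gain
    return cover

# hypothetical batches and candidate groups
batches = {1, 2, 3, 4, 5}
family = [[1, 2], [2, 3, 4], [4, 5], [1, 2, 3]]
print(greedy_set_cover(batches, family))  # picks [2, 3, 4], then [1, 2], then [4, 5]
```

Note that on this toy input the greedy algorithm uses 3 subsets while the optimum cover ([1, 2, 3] and [4, 5]) needs only 2, which illustrates the inexactness mentioned above.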
Implementation and deployment of the algorithm
The algorithm was implemented in the 1C language as an external data processor called "Residue Compression", which was connected to the WMS system. We did not implement the algorithm in C++ and call it from an external Native API component, which would have been more correct, since C++ code runs several times, and in some examples even tens of times, faster than equivalent 1C code. The algorithm was implemented in 1C to save development time and to simplify debugging on the customer's production base. The result of the algorithm is shown in Figure 5.
Fig.5. Processing by "compression" of residues
Figure 5 shows that in the given warehouse, the current remnants of goods in the storage cells were divided into clusters within which the batch dates differ from each other by no more than 30 days. Since the customer manufactures and stores metal ball valves, which have a shelf life of years, such a date difference can be neglected. Note that this processing is now used regularly in production, and the WMS operators confirm the good quality of the batch clustering.
Conclusions and continuation
The main lesson we drew from solving this practical problem is confirmation of the effectiveness of the paradigm: mathematical problem statement → known mathematical model → known algorithm → algorithm adapted to the specifics of the problem. Discrete optimization has existed for more than 300 years, and over this time people have managed to study a great many problems and accumulate a great deal of experience in solving them. It is wiser to turn to this experience first, and only then begin to reinvent your own wheel.
In the next article, we will continue the story of optimization algorithms and consider the most interesting and much more complex one: the algorithm for optimal "compression" of remnants across cells, which takes as input the data produced by the batch clustering algorithm.
The article was prepared by Roman Shangin, programmer of the project department, First BIT company, Chelyabinsk.
Source: habr.com