FairMOT, a system for fast tracking of multiple objects in video

Researchers from Microsoft and Huazhong (Central China) University of Science and Technology have developed FairMOT (Fair Multi-Object Tracking), a new high-performance machine learning method for tracking multiple objects in video. The PyTorch-based implementation and the trained models are published on GitHub.

Most existing object tracking methods work in two stages, each implemented by a separate neural network. In the first stage, a detection model locates the objects of interest; in the second stage, an association model re-identifies those objects and binds anchors to them.
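The second, association stage of such a pipeline can be illustrated with a minimal sketch. It assumes each known track and each new detection is already described by an appearance embedding, and it matches them greedily by cosine similarity. The function names, the `min_similarity` threshold and the greedy strategy are illustrative assumptions, not taken from any particular tracker; real systems often use optimal assignment (for example, the Hungarian algorithm) and also take motion into account.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def associate(tracks, detections, min_similarity=0.5):
    """Greedily match detections to existing tracks by embedding similarity.

    tracks:      dict mapping track_id -> embedding (list of floats)
    detections:  list of embeddings, one per detected object in the new frame
    Returns a dict mapping detection index -> track_id.
    Detections with no sufficiently similar track are left out
    (a tracker would normally start new tracks for them).
    """
    # Collect every (similarity, track, detection) pair above the threshold.
    candidates = []
    for track_id, track_emb in tracks.items():
        for det_idx, det_emb in enumerate(detections):
            sim = cosine_similarity(track_emb, det_emb)
            if sim >= min_similarity:
                candidates.append((sim, track_id, det_idx))

    # Take the most similar pairs first; each track and detection is used once.
    candidates.sort(key=lambda item: item[0], reverse=True)
    matches = {}
    used_tracks = set()
    for _sim, track_id, det_idx in candidates:
        if track_id in used_tracks or det_idx in matches:
            continue
        matches[det_idx] = track_id
        used_tracks.add(track_id)
    return matches
```

Calling `associate({1: [1.0, 0.0], 2: [0.0, 1.0]}, [[0.1, 0.9], [0.9, 0.1]])` assigns the first detection to track 2 and the second to track 1.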

FairMOT uses a single-stage implementation based on a deformable convolutional neural network (DCNv2, Deformable Convolutional Network), which gives a noticeable increase in tracking speed. FairMOT works without anchors: it locates objects by estimating their centers, and the offsets of those centers, on a high-resolution object map. In parallel, a branch runs that extracts the individual features of each object, which can be used to predict its identity, while the main module fuses features from different levels to handle objects of different scales.
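As a rough illustration of anchor-free, center-based decoding with a parallel identity branch, the sketch below takes three per-cell outputs of a hypothetical network (a center heatmap, center offsets and identity embeddings) and turns them into detections. This is a simplified pure-Python sketch, not the FairMOT code: the function name, the `threshold` value, the `stride` of 4 and the 3x3 peak test are all assumptions made for the example.

```python
def decode_centers(heatmap, offsets, embeddings, threshold=0.4, stride=4):
    """Turn per-cell network outputs into a list of detected object centers.

    heatmap:    H x W nested list of center scores in [0, 1]
    offsets:    H x W nested list of (dx, dy) sub-cell corrections
    embeddings: H x W nested list of identity feature vectors
    threshold:  minimum score for a cell to count as a center (assumed value)
    stride:     input pixels per heatmap cell (assumed value)
    """
    height = len(heatmap)
    width = len(heatmap[0]) if height else 0
    detections = []
    for row in range(height):
        for col in range(width):
            score = heatmap[row][col]
            if score < threshold:
                continue
            # Keep the cell only if no neighbor in its 3x3 window scores higher.
            # Cells that tie with a neighbor are all kept in this simple version.
            is_peak = True
            for nr in range(max(0, row - 1), min(height, row + 2)):
                for nc in range(max(0, col - 1), min(width, col + 2)):
                    if heatmap[nr][nc] > score:
                        is_peak = False
            if not is_peak:
                continue
            dx, dy = offsets[row][col]
            detections.append({
                # Refine the cell position with the offset, then map to pixels.
                "x": (col + dx) * stride,
                "y": (row + dy) * stride,
                "score": score,
                # The identity features are read at the same cell as the center.
                "embedding": embeddings[row][col],
            })
    return detections
```

Because the position and the identity features are both read from the same cell, no anchor boxes are involved, and the embeddings can then be compared between frames to keep object identities consistent.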


A combination of six public datasets for person detection and search (including ETH, CityPerson, CalTech, MOT17 and CUHK-SYSU) was used to train the FairMOT model. The model was tested on the 2DMOT15, MOT16, MOT17 and MOT20 test video sets, which are provided by the MOT Challenge project and cover different situations, camera movement or rotation, and different viewing angles. Testing showed that FairMOT is ahead of the fastest competing models, TrackRCNN and JDE, when tested on video streams at 30 frames per second, demonstrating performance sufficient to analyze ordinary video streams on the fly.

Source: opennet.ru
