The Gentoo project introduced the Portage 3.0 package management system

The Gentoo project has stabilized the release of the Portage 3.0 package management system, which is used in the Gentoo Linux distribution. This release concludes the long effort to move to Python 3 and end support for Python 2.7.

Besides the end of support for Python 2.7, the other major change is a set of optimizations that speed up dependency resolution by 50-60%. Notably, some developers had suggested rewriting the dependency resolution code in C/C++ or Go to make it faster, but the problem was solved with far less effort.

Profiling the existing code showed that most of the computation time was spent calling the use_reduce and catpkgsplit functions with a repeating set of arguments (for example, catpkgsplit was called between 1 and 5 million times). To speed this up, the results of these functions were cached in dictionaries. The lru_cache decorator from Python's standard functools module was the best fit for storing the cache, but it is only available in Python 3.2 and later. For compatibility with earlier versions, a stub replacing lru_cache was added, but the decision to end support for Python 2.7 in Portage 3.0 greatly simplified the task and made this layer unnecessary.

Using the cache reduced the execution time of the "emerge -uDvpU --with-bdeps=y @world" operation on a ThinkPad X220 laptop from 5 minutes 20 seconds to 3 minutes 16 seconds (a 63% speedup). Tests on other systems showed a performance increase of at least 48%.
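The source does not say how the 63% figure was calculated. As a rough check, and assuming it means the ratio of the old run time to the new one, the arithmetic works out as follows:

```python
# Timings quoted in the article, converted to seconds.
before = 5 * 60 + 20  # 320 seconds
after = 3 * 60 + 16   # 196 seconds

# Assumed interpretation: how much faster the new run is.
speedup = before / after - 1

# Alternative view: the share of wall-clock time saved.
time_saved = 1 - after / before

print(f"{speedup:.0%} faster, {time_saved:.0%} less time")
# prints "63% faster, 39% less time"
```

Under this reading the same measurement can be described either as a 63% speedup or as a 39% reduction in run time.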

The developer who prepared the change also tried to prototype the dependency resolution code in C++ or Rust, but the task proved too difficult because it required porting a large amount of code, and it was doubtful that the result would be worth the effort.

Source: opennet.ru
