Experiment using SQLite as a container for archiving files

The Pack project is an attempt to build a file archive format on top of the SQLite library and the Zstandard (zstd) compression algorithm. The prototype, written in Pascal and distributed under the Apache 2.0 license, creates archives faster than the most common archivers, even though its work comes down to reading the data, compressing it with the libzstd library, and running SQL operations that add the compressed data to a SQLite database file.
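The read, compress, insert pipeline described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `files` table and the helper names `pack_directory` and `read_file` are hypothetical and do not reflect Pack's actual on-disk layout, and zlib from the Python standard library stands in for libzstd so the example stays self-contained.

```python
import os
import sqlite3
import zlib


def pack_directory(src_dir: str, archive_path: str) -> int:
    """Store every regular file under src_dir as a compressed blob; return the file count."""
    con = sqlite3.connect(archive_path)
    # Hypothetical schema for illustration, not Pack's real format.
    con.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        "path TEXT PRIMARY KEY, size INTEGER, data BLOB)"
    )
    count = 0
    with con:  # a single transaction for the whole batch of inserts
        for root, _dirs, names in os.walk(src_dir):
            for name in names:
                full = os.path.join(root, name)
                with open(full, "rb") as fh:
                    raw = fh.read()
                rel = os.path.relpath(full, src_dir).replace(os.sep, "/")
                # zlib is a stand-in here; Pack itself uses libzstd.
                con.execute(
                    "INSERT OR REPLACE INTO files VALUES (?, ?, ?)",
                    (rel, len(raw), zlib.compress(raw)),
                )
                count += 1
    con.close()
    return count


def read_file(archive_path: str, rel_path: str) -> bytes:
    """Random access: fetch and decompress one file by its stored path."""
    con = sqlite3.connect(archive_path)
    row = con.execute(
        "SELECT data FROM files WHERE path = ?", (rel_path,)
    ).fetchone()
    con.close()
    if row is None:
        raise KeyError(rel_path)
    return zlib.decompress(row[0])
```

Because the path is the primary key, fetching a single file is an indexed lookup and does not require scanning the rest of the archive, which is one plausible reason a database container suits random access.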

When compressing a directory of 81 thousand files with a total size of 1.25 GB, Pack was 112 times faster than the ZIP utility, completing the operation in 1.3 seconds versus 146 seconds for ZIP. The Pack archive was also 23% smaller (194 MB for Pack versus 253 MB for ZIP). For comparison, the tar utility finished in 4.7 seconds without compression and in 28.5 seconds with gzip compression, the RAR archiver completed the test in 27.5 seconds, and 7z in 54.2 seconds. The archive sizes were 214 MB for tar.gz, 235 MB for RAR, and 135 MB for 7z. Pack is also reported to be ahead of other archivers in unpacking speed and random access to files, while consuming less RAM.

Archive size, time, and speedup relative to ZIP:

ZIP: 253 MB, 146 s
7z: 135 MB, 54.2 s (2.7 times faster than ZIP)
tar.gz: 214 MB, 28.5 s (5.1 times)
RAR: 235 MB, 27.5 s (5.3 times)
tar: 1345 MB, 4.7 s (31 times)
Pack: 194 MB, 1.3 s (112 times)
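The speedup factors and the 23% size difference follow directly from the quoted sizes and times. A short script, using only the figures from the article (the function names are illustrative), makes the arithmetic explicit:

```python
# (archive size in MB, wall-clock seconds) as quoted in the article
results = {
    "ZIP": (253, 146.0),
    "7z": (135, 54.2),
    "tar.gz": (214, 28.5),
    "RAR": (235, 27.5),
    "tar": (1345, 4.7),
    "Pack": (194, 1.3),
}

zip_size, zip_time = results["ZIP"]


def speedup_vs_zip(tool: str) -> float:
    """How many times faster than ZIP the tool finished."""
    return zip_time / results[tool][1]


def size_saving_vs_zip(tool: str) -> float:
    """Fraction by which the archive is smaller than the ZIP archive."""
    return 1 - results[tool][0] / zip_size


for tool in results:
    print(f"{tool:7s} {speedup_vs_zip(tool):6.1f}x  "
          f"{size_saving_vs_zip(tool):+.0%} size vs ZIP")
```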

The effect of the file cache on the test results is not mentioned. The low speed of ZIP is probably explained by the order in which the tests were run, without accounting for data cached in memory: the ZIP test ran with a cold cache, and the remaining tests with a warm cache. Under normal conditions, Zstandard compresses 3-5 times faster than zlib and decompresses 10 times faster, with a compression ratio roughly 15% higher.

Addition: A similar idea of storing compressed files as blobs in a SQLite database was implemented in 2014 in the sqlar archiver, created by the SQLite developers as an experiment to evaluate the efficiency of storing blobs in SQLite. sqlar uses zlib for compression, and its archives are about 2% larger than those produced by the ZIP utility.
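For illustration, the sqlar layout is simple enough to reproduce with the Python standard library. The sketch below assumes the table definition given in the SQLite Archive documentation (a single `sqlar` table in which a blob whose length equals `sz` is stored uncompressed); the helper names `sqlar_put` and `sqlar_get` are hypothetical and not part of any sqlar tool.

```python
import sqlite3
import zlib

# Table layout as described in the SQLite Archive documentation (assumed here).
SQLAR_SCHEMA = """
CREATE TABLE IF NOT EXISTS sqlar(
  name TEXT PRIMARY KEY,  -- name of the file
  mode INT,               -- access permissions
  mtime INT,              -- last modification time
  sz INT,                 -- original file size
  data BLOB               -- compressed content
);
"""


def sqlar_put(con, name, content, mode=0o644, mtime=0):
    """Insert one file, compressing it with zlib only when that saves space."""
    packed = zlib.compress(content)
    # Keep the raw bytes when compression does not shrink them.
    blob = packed if len(packed) < len(content) else content
    con.execute(
        "INSERT OR REPLACE INTO sqlar VALUES (?, ?, ?, ?, ?)",
        (name, mode, mtime, len(content), blob),
    )


def sqlar_get(con, name):
    """Return the original bytes of one archived file."""
    row = con.execute(
        "SELECT sz, data FROM sqlar WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    sz, blob = row
    # A blob whose length equals sz was stored without compression.
    return blob if len(blob) == sz else zlib.decompress(blob)
```

A typical session would open a connection with `sqlite3.connect`, run `con.executescript(SQLAR_SCHEMA)`, and then call `sqlar_put` and `sqlar_get` inside a transaction.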

Source: opennet.ru
