OpenZL compression system outperforms Zstd and XZ in speed and compression ratio for structured data

Meta has introduced OpenZL, a data compression and decompression toolkit that offers higher compression ratios and speeds than the Zstd and XZ formats. OpenZL is designed for efficient compression of structured datasets, such as those used in machine learning, as well as data stores whose fields hold distinct, repeating types of information. OpenZL is written in C/C++ and is open source under the BSD license.

When compressing a dataset containing the SAO astronomical star catalogue, OpenZL reduced the data size by a factor of 2.06, whereas the zstd algorithm compressed the data by a factor of 1.31 and XZ by a factor of 1.64. In addition, OpenZL outpaced zstd in compression speed by roughly a factor of 2 (203 MB/s versus 115 MB/s) and XZ by a factor of 65 (203 MB/s versus 3.1 MB/s). Decompression in OpenZL was slightly slower than in zstd (822 MB/s versus 890 MB/s) and 27 times faster than in XZ.
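The factors quoted above can be rechecked from the raw figures. The short Python sketch below only restates the numbers given in the text. The XZ decompression speed is not stated directly, so the value computed here is merely what the "27 times" claim implies, and the 100 MB input size is an arbitrary illustration.

```python
# Benchmark figures as quoted in the article (SAO star catalogue).
RESULTS = {
    "OpenZL": {"ratio": 2.06, "comp_mbps": 203.0, "decomp_mbps": 822.0},
    "zstd": {"ratio": 1.31, "comp_mbps": 115.0, "decomp_mbps": 890.0},
    "XZ": {"ratio": 1.64, "comp_mbps": 3.1},  # decompression speed not quoted
}


def speedup(a_mbps: float, b_mbps: float) -> float:
    """How many times faster a is than b."""
    return a_mbps / b_mbps


def compressed_size_mb(original_mb: float, ratio: float) -> float:
    """Size after compression for a given compression ratio."""
    return original_mb / ratio


openzl = RESULTS["OpenZL"]
comp_vs_zstd = speedup(openzl["comp_mbps"], RESULTS["zstd"]["comp_mbps"])
comp_vs_xz = speedup(openzl["comp_mbps"], RESULTS["XZ"]["comp_mbps"])
decomp_vs_zstd = speedup(openzl["decomp_mbps"], RESULTS["zstd"]["decomp_mbps"])
# Implied by "27 times faster than XZ"; not a measured figure from the text.
implied_xz_decomp_mbps = openzl["decomp_mbps"] / 27

print(f"compression speed vs zstd: {comp_vs_zstd:.2f}x")
print(f"compression speed vs XZ:   {comp_vs_xz:.1f}x")
print(f"decompression vs zstd:     {decomp_vs_zstd:.2f}x")
print(f"implied XZ decompression:  {implied_xz_decomp_mbps:.1f} MB/s")
for name, row in RESULTS.items():
    size = compressed_size_mb(100.0, row["ratio"])
    print(f"{name}: 100 MB of input -> {size:.1f} MB")
```

The compression advantage over zstd comes out somewhat below 2, so the article's "factor of 2" is best read as a rounded figure.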

OpenZL is not a general-purpose algorithm and only shows good results on data with a known structure. OpenZL works by generating a specialized compressor from a supplied description of the data. The result is compression code tuned to a specific data format. A universal decompressor, compatible with all generated compressors, is used for decompression.

Compression and decompression are performed with a single utility, "zli", or with the libopenzl library. The structure of the data is described in the form of profiles. A set of predefined profiles describing common storage formats is included, for example a profile for the CSV format or for data stored as an array of 64-bit values. Compressing is as simple as choosing a profile with the "zli list-profiles" command and starting the compression process with the "zli compress --profile profile_name" command. To decompress, simply run "zli decompress".

For specific formats, a custom profile must be created with the "zli train" command, which identifies patterns in the data and produces a profile containing a compression plan. With the "--pareto-frontier" option, the generated profile can be tuned for faster compression or decompression at the cost of compression ratio. The Simple Data Description Language (SDDL) can be used to describe complex formats with nested structures and to specify the layout of data fields within those structures.
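To illustrate what a Pareto frontier means in this trade-off, the sketch below filters a list of candidate configurations down to those that are not beaten on both speed and compression ratio at once. It is a generic illustration, not how "zli train" is implemented: the `Candidate` type, the candidate names, and all numbers are hypothetical.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """A hypothetical compressor configuration and its measured trade-off."""
    name: str
    speed_mbps: float  # higher is better
    ratio: float       # compression ratio, higher is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is at least as good as b on both axes and better on one."""
    at_least_as_good = a.speed_mbps >= b.speed_mbps and a.ratio >= b.ratio
    strictly_better = a.speed_mbps > b.speed_mbps or a.ratio > b.ratio
    return at_least_as_good and strictly_better


def pareto_frontier(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only candidates that no other candidate dominates."""
    frontier = [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]
    # Fastest first, so the list reads from "speed" towards "ratio".
    return sorted(frontier, key=lambda c: c.speed_mbps, reverse=True)


# Made-up measurements, for illustration only.
candidates = [
    Candidate("A", 400.0, 1.6),
    Candidate("B", 250.0, 1.9),
    Candidate("C", 200.0, 1.8),  # slower and weaker than B
    Candidate("D", 90.0, 2.2),
    Candidate("E", 80.0, 2.0),   # slower and weaker than D
]

for c in pareto_frontier(candidates):
    print(f"{c.name}: {c.speed_mbps} MB/s at ratio {c.ratio}")
```

Every configuration left on the frontier is a defensible choice: moving along it buys speed only by giving up compression ratio, which is the trade the article describes.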

The method for generating optimal compressors is built on a set of primitive encoders (codecs), each of which works well for a particular data type and sequence. For compression, a directed acyclic data-processing graph is formed, with codecs as its nodes and the data in its various processed forms as its edges. Depending on the type of input data, a chain of codecs is selected that compresses the incoming data element most effectively. With this arrangement, the file header is compressed with one codec, numeric data fields with a second codec, an incrementing counter field with a third codec, and string data fields with a fourth codec.
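The idea of routing each field through its own codec can be shown with a toy example. The Python sketch below does not use OpenZL or its API. It assumes a made-up 10-byte record layout (an incrementing counter, a small numeric reading, a 4-byte tag), splits the records into per-field streams, delta-encodes the counter, and uses the standard zlib module as a stand-in for the real codecs. It then compares the result with running zlib over the raw records.

```python
import random
import struct
import zlib

# Hypothetical record layout: uint32 counter, uint16 reading, 4-byte tag.
RECORD = struct.Struct("<IH4s")
HEADER = struct.Struct("<4I")  # record count + three compressed stream sizes


def make_records(n, seed=0):
    """Synthetic structured data: a counter, a noisy reading, a repeated tag."""
    rng = random.Random(seed)
    tags = [b"STAR", b"GALX", b"NEBL"]
    return [(1000 + i, 500 + rng.randint(-3, 3), rng.choice(tags))
            for i in range(n)]


def serialize(records):
    return b"".join(RECORD.pack(*r) for r in records)


def delta_encode(values):
    prev, out = 0, []
    for v in values:
        out.append(v - prev)
        prev = v
    return out


def delta_decode(deltas):
    total, out = 0, []
    for d in deltas:
        total += d
        out.append(total)
    return out


def compress_structured(raw):
    """Split records into field streams and compress each one separately.

    Assumes at least one record.
    """
    records = list(RECORD.iter_unpack(raw))
    n = len(records)
    counters, readings, tags = zip(*records)
    streams = [
        struct.pack(f"<{n}q", *delta_encode(counters)),  # counter: delta first
        struct.pack(f"<{n}H", *readings),                # numeric field
        b"".join(tags),                                  # string field
    ]
    packed = [zlib.compress(s, 9) for s in streams]
    return HEADER.pack(n, *(len(p) for p in packed)) + b"".join(packed)


def decompress_structured(blob):
    """Reverse compress_structured and rebuild the original record bytes."""
    n, *sizes = HEADER.unpack_from(blob)
    offset, streams = HEADER.size, []
    for size in sizes:
        streams.append(zlib.decompress(blob[offset:offset + size]))
        offset += size
    counters = delta_decode(struct.unpack(f"<{n}q", streams[0]))
    readings = struct.unpack(f"<{n}H", streams[1])
    tags = [streams[2][i * 4:(i + 1) * 4] for i in range(n)]
    return serialize(zip(counters, readings, tags))


raw = serialize(make_records(5000))
generic = zlib.compress(raw, 9)
structured = compress_structured(raw)
assert decompress_structured(structured) == raw  # lossless round trip
print(f"raw: {len(raw)} bytes")
print(f"zlib on raw records: {len(generic)} bytes")
print(f"per-field streams:   {len(structured)} bytes")
```

On this synthetic data the per-field version should come out clearly smaller, because the delta-encoded counter becomes a run of identical values and each remaining stream contains only one kind of data. Real formats and real codecs will behave differently, so the sizes here say nothing about OpenZL's own results.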

Source: opennet.ru
