High-Speed Fail-Safe Compression (Continued)

This article is the second on the topic of high-speed data compression. The first article described a compressor that runs at 10 GB/sec per processor core (minimal compression, RTT-Min).

This compressor is already used in forensic duplicator hardware for high-speed compression of storage media dumps and for strengthening cryptography; it can also be used to compress virtual machine images and RAM swap files when saving them to high-speed SSD drives.

The first article also announced the development of a compression algorithm for backup copies of HDD and SSD drives (medium compression, RTT-Mid) with significantly improved compression parameters. That compressor is now complete, and this article is about it.

A compressor that implements the RTT-Mid algorithm provides a compression ratio comparable to standard archivers such as WinRar and 7-Zip running in fast mode. At the same time, it is at least an order of magnitude faster.

The speed of packing and unpacking data is a key parameter that determines where compression technologies can be applied. Hardly anyone would consider compressing a terabyte of data at 10-15 megabytes per second (which is exactly the speed of archivers in standard compression mode), because it would take almost twenty hours at full processor load.

On the other hand, the same terabyte can be copied at a speed on the order of 2-3 gigabytes per second in about ten minutes.
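The arithmetic behind these two estimates can be checked with a short sketch. The function name and the use of decimal units (1 TB = 10^12 bytes) are assumptions made for illustration only.

```python
def transfer_time_hours(size_bytes: float, speed_bytes_per_sec: float) -> float:
    """Return the hours needed to process size_bytes at a constant speed."""
    return size_bytes / speed_bytes_per_sec / 3600


TB = 10**12  # decimal terabyte (assumption)
GB = 10**9
MB = 10**6

# Archiver in standard compression mode: 10-15 MB/s.
slow_hours = [transfer_time_hours(TB, s * MB) for s in (10, 15)]
# Plain copy: 2-3 GB/s.
fast_minutes = [transfer_time_hours(TB, s * GB) * 60 for s in (2, 3)]

print([round(h, 1) for h in slow_hours])    # [27.8, 18.5]
print([round(m, 1) for m in fast_minutes])  # [8.3, 5.6]
```

Under these assumptions the "almost twenty hours" figure corresponds to the upper end of the range (15 MB/s), and the copy takes between roughly six and eight minutes.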

Compressing large volumes of data is therefore worthwhile only if it runs no slower than real input/output. For modern systems this means at least 100 megabytes per second.

Modern compressors can deliver such speeds only in "fast" mode. It is in this mode that we will compare the RTT-Mid algorithm with traditional compressors.

Comparative testing of the new compression algorithm

The RTT-Mid compressor ran as part of a test program. In a real "working" application it runs much faster, uses multithreading sensibly, and is built with a "normal" compiler rather than C#.

Since the compressors in the comparative test are built on different principles, and different types of data compress differently, the test uses the method of measuring the "average temperature in the hospital" for the sake of objectivity...

A sector-by-sector dump file of a logical disk with the Windows 10 operating system was created; this is the most natural mix of the various data structures found on practically every computer. Compressing this file makes it possible to compare the speed and compression ratio of the new algorithm with the most advanced compressors used in modern archivers.

Here is the dump file:

[Screenshot: the dump file]

The dump file was compressed with the RTT-Mid, 7-zip, and WinRar compressors. WinRar and 7-zip were set to maximum speed.

The 7-zip compressor running:

[Screenshot: system load while 7-zip is running]

It loads the processor at 100%, while the average read speed of the source dump is about 60 megabytes/sec.

The WinRar compressor running:

[Screenshot: system load while WinRar is running]

The situation is similar: the processor load is almost 100%, and the average dump read speed is about 125 megabytes/sec.

As in the previous case, the archiver's speed is limited by the processor's capabilities.

Now the RTT-Mid test compressor running:

[Screenshot: system load while the RTT-Mid test program is running]

The screenshot shows that the processor is loaded at 50% and idles the rest of the time, because there is nowhere to offload the compressed data. The output disk (Disk 0) is almost fully loaded. The read speed (Disk 1) varies widely, but averages more than 200 megabytes/sec.

In this case the compressor's speed is limited by how fast the compressed data can be written to Disk 0.

Now the compression ratio of the resulting archives:

[Screenshots: sizes of the three resulting archives]

The RTT-Mid compressor clearly did the best job of compression; the archive it produced is 1.3 gigabytes smaller than the WinRar archive and 2.1 gigabytes smaller than the 7z archive.

Time taken to create the archive:

  • 7-zip: 26 minutes 10 seconds;
  • WinRar: 17 minutes 40 seconds;
  • RTT-Mid: 7 minutes 30 seconds.

So even a test, unoptimized implementation of the RTT-Mid algorithm created the archive about 2.4 times faster than WinRar and about 3.5 times faster than 7-zip, while the archive itself came out significantly smaller than those of its competitors...
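The speed-up can be derived directly from the timings listed above. The sketch below only restates that arithmetic; the helper name is an assumption.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a minutes/seconds pair to seconds."""
    return minutes * 60 + seconds


# Archive creation times reported in the test.
times = {
    "7-zip": to_seconds(26, 10),
    "WinRar": to_seconds(17, 40),
    "RTT-Mid": to_seconds(7, 30),
}

# How many times faster RTT-Mid finished compared with each competitor.
speedups = {
    name: t / times["RTT-Mid"] for name, t in times.items() if name != "RTT-Mid"
}

for name, ratio in speedups.items():
    print(f"RTT-Mid vs {name}: {ratio:.2f}x faster")
# RTT-Mid vs 7-zip: 3.49x faster
# RTT-Mid vs WinRar: 2.36x faster
```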

Those who do not trust screenshots can verify them for themselves. The test program is available at the link; download it and check.

But only on processors with AVX-2 support; without these instructions the compressor does not work. And do not test the algorithm on older AMD processors, which are slow at executing AVX instructions...

Compression method used

The algorithm indexes repeated text fragments at byte granularity. This compression method has been known for a long time, but it was not used because the match-search operation was very expensive in terms of resources and took far more time than building a dictionary. So the RTT-Mid algorithm is a classic example of moving "back to the future"...

The RTT compressor uses a unique high-speed match-search scanner, which is what makes it possible to speed up the compression process. The scanner is home-made; it is "my precious...", "rather expensive, because it is entirely hand-made" (written in assembler).

The match-search scanner follows a two-level probabilistic scheme: first it checks for the presence of a match "signature", and only after a "signature" has been identified at that location does the procedure for detecting the actual match begin.
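The two-level idea can be illustrated with a minimal sketch. This is not the author's assembler scanner: the 16-bit multiplicative hash used as a "signature", the table layout, and the names `signature` and `find_matches` are all assumptions chosen for illustration. Stage one is a cheap table lookup that may give false positives; stage two compares the actual bytes.

```python
TABLE_BITS = 16  # assumed size of the signature table: 2**16 slots


def signature(data: bytes, pos: int) -> int:
    """Cheap 16-bit signature of the 4 bytes at pos (hypothetical hash)."""
    word = int.from_bytes(data[pos:pos + 4], "little")
    return ((word * 2654435761) & 0xFFFFFFFF) >> (32 - TABLE_BITS)


def find_matches(data: bytes, min_len: int = 8) -> list[tuple[int, int, int]]:
    """Return (position, source, length) for repeats of at least min_len bytes."""
    table = [-1] * (1 << TABLE_BITS)  # signature -> last position seen
    matches = []
    pos = 0
    while pos + 4 <= len(data):
        sig = signature(data, pos)
        candidate = table[sig]
        table[sig] = pos
        if candidate >= 0:
            # Stage 2: the signature was seen before, so verify real bytes.
            length = 0
            while (pos + length < len(data)
                   and data[candidate + length] == data[pos + length]):
                length += 1
            if length >= min_len:
                matches.append((pos, candidate, length))
                pos += length
                continue
        pos += 1
    return matches


sample = b"The quick brown fox. " * 3
for position, source, length in find_matches(sample):
    print(f"bytes at {position} repeat {length} bytes from {source}")
```

Most positions fail the signature check and cost a single table lookup; the byte-by-byte comparison runs only for the few candidates that pass it.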

The match-search window has an unpredictable size that depends on the degree of entropy in the data block being processed. For completely random (incompressible) data it is measured in megabytes; for data with repetitions it is always larger than a megabyte.

But many modern data formats are incompressible, and running a resource-intensive scanner over them is useless and wasteful, so the scanner uses two operating modes. First, it looks for sections of the source text that may contain repetitions; this operation is also performed with a probabilistic method and runs very quickly (at 4-6 gigabytes/sec). The regions with possible matches are then processed by the main scanner.
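A probabilistic pre-check of this kind might look like the sketch below. It is an illustration only, not the RTT pre-scanner: the content-defined sampling rule, the window width, the threshold, and the name `may_contain_repeats` are assumptions. The idea is to sample a fraction of the windows in a block and hand the block to the expensive scanner only if enough of the sampled windows have already been seen.

```python
import random


def may_contain_repeats(block: bytes, sample_rate: int = 16, width: int = 4,
                        threshold: float = 0.02) -> bool:
    """Cheap pre-check: do enough sampled windows repeat within the block?"""
    seen = set()
    samples = 0
    repeats = 0
    for pos in range(len(block) - width + 1):
        # Content-defined sampling: a window is picked by its own first byte,
        # so both copies of a repeated fragment are sampled at the same offsets.
        if block[pos] % sample_rate:
            continue
        window = block[pos:pos + width]
        samples += 1
        if window in seen:
            repeats += 1
        else:
            seen.add(window)
    return samples > 0 and repeats / samples >= threshold


rng = random.Random(0)
noise = bytes(rng.getrandbits(8) for _ in range(64 * 1024))  # incompressible
text = b"header: value\r\n" * 4096                           # highly repetitive
print(may_contain_repeats(noise), may_contain_repeats(text))
```

With these parameters the random block is rejected and the repetitive block is passed on, so the main scanner would never be run on the incompressible data.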

Index compression is not very efficient on its own: repeated fragments have to be replaced with indexes, and the index array noticeably reduces the compression ratio.

To increase the compression ratio, the algorithm indexes not only complete matches of byte strings but also partial ones, where a string contains both matching and non-matching bytes. To do this, the index format includes a match mask field that marks the matching bytes of the two blocks. For even greater compression, indexing is used to overlay several partially matching blocks onto the current block.
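A match mask for a partial match can be sketched as follows. The layout here (one mask bit per byte, mismatching bytes stored separately as literals) and the function names are assumptions for illustration; the real RTT index format is not described in the article.

```python
def partial_match(reference: bytes, block: bytes) -> tuple[int, bytes]:
    """Return (mask, literals); bit i of mask is set when byte i matches."""
    if len(reference) != len(block):
        raise ValueError("blocks must have equal length")
    mask = 0
    literals = bytearray()
    for i, (ref_byte, cur_byte) in enumerate(zip(reference, block)):
        if ref_byte == cur_byte:
            mask |= 1 << i
        else:
            literals.append(cur_byte)  # only mismatching bytes are stored
    return mask, bytes(literals)


def restore(reference: bytes, mask: int, literals: bytes) -> bytes:
    """Rebuild the block from the reference block, the mask and the literals."""
    out = bytearray()
    remaining = iter(literals)
    for i, ref_byte in enumerate(reference):
        out.append(ref_byte if (mask >> i) & 1 else next(remaining))
    return bytes(out)


reference = b"2024-01-15 INFO"
current = b"2024-01-16 WARN"
mask, literals = partial_match(reference, current)
print(hex(mask), literals)  # 0x5ff b'6WARN'
assert restore(reference, mask, literals) == current
```

In this example ten of the fifteen bytes are taken from the earlier block, and only five literal bytes plus the mask need to be stored.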

All of this made it possible for the RTT-Mid compressor to reach a compression ratio comparable to compressors built on the dictionary method, while working much faster.

Speed of the new compression algorithm

If the compressor has exclusive use of the cache memory (4 megabytes are required per thread), its speed ranges from 700 to 2000 megabytes/sec per processor core, depending on the type of data being compressed and only slightly on the processor's clock frequency.

In a multithreaded implementation of the compressor, effective scalability is determined by the size of the third-level cache. For example, with 9 megabytes of cache "on board" there is no point in launching more than two compression threads; the speed will not increase. But with a 20-megabyte cache you can already run five compression threads.
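Both examples are consistent with a simple rule of thumb: divide the third-level cache size by the 4 megabytes each thread needs. The sketch below assumes that rule; the function name and the optional cap by core count are illustrative assumptions, not part of the compressor.

```python
from typing import Optional

CACHE_PER_THREAD_MB = 4  # from the text: 4 megabytes of cache per thread


def max_useful_threads(l3_cache_mb: int, cores: Optional[int] = None) -> int:
    """Estimate how many compression threads the L3 cache can feed."""
    by_cache = max(1, l3_cache_mb // CACHE_PER_THREAD_MB)
    # Optionally cap the estimate by the number of available cores.
    return by_cache if cores is None else min(by_cache, cores)


print(max_useful_threads(9))   # 2
print(max_useful_threads(20))  # 5
```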

RAM latency is another significant parameter that determines the compressor's speed. The algorithm accesses RAM randomly; some of these accesses (about 10%) miss the cache, and the processor has to idle while waiting for data from RAM, which reduces the operating speed.

The data input/output system also significantly affects the compressor's speed. RAM requests from I/O block the CPU's requests for data, which also reduces compression speed. This problem is significant for laptops and desktops; for servers it matters less because of a more advanced system bus access controller and multi-channel RAM.

Throughout this article we talk about compression; decompression is left out of scope because "everything is covered in chocolate" there. Decompression is much faster and is limited by I/O speed. One physical core in a single thread easily delivers unpacking speeds of 3-4 GB/sec.

This is because decompression involves no match search, the operation that "eats up" most of the processor and cache resources during compression.

Reliability of compressed data storage

As the name of the whole class of software that uses data compression (archivers) suggests, they are designed for long-term storage of information, not for years, but for centuries and millennia...

During storage, storage media lose some of their data; here is an example:

[Image: an "analog" storage medium]

This "analog" storage medium is a thousand years old; some fragments have been lost, but on the whole the information is still "readable"...

No responsible manufacturer of modern digital data storage systems, or of the digital media for them, guarantees complete data safety for more than 75 years.
And this is a problem, but a deferred one; our descendants will solve it...

Digital storage systems can lose data not only after 75 years: errors can appear at any time, even while the data is being written. These distortions are minimized by adding redundancy and corrected by error-correction systems. Redundancy and correction cannot always restore lost information, and when they do, there is no guarantee that the restoration was performed correctly.

This is also a big problem, and not a deferred one but a current one.

Modern compressors used for archiving digital data are built on various modifications of the dictionary method, and for such archives the loss of a piece of information is fatal; there is even an established term for this situation: a "broken" archive...

The low reliability of archives with dictionary compression stems from the structure of the compressed data. Such an archive does not contain the source text; it stores the numbers of dictionary entries, and the dictionary itself is modified dynamically by the text currently being compressed. If an archive fragment is lost or corrupted, none of the subsequent archive entries can be identified, either by content or by the length of the dictionary entry, because it is unclear what each dictionary entry number corresponds to.

It is impossible to restore information from such a "broken" archive.

The RTT algorithm is based on a more reliable method of storing compressed data. It uses the index method of accounting for repeated fragments. This approach to compression minimizes the consequences of data distortion on the storage medium, and in many cases makes it possible to automatically correct distortions that arose during storage.
This is because an archive file produced by index compression contains two fields:

  • a source text field with the repeated sections removed from it;
  • an index field.

The index field, which is critical for recovering the information, is small and can be duplicated for reliable storage. Therefore, even if a fragment of the source text or of the index array is lost, all the remaining information can be restored without problems, as in the picture of the "analog" storage medium.
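The two-field layout, and the way damage stays local, can be shown with a toy format. This is a sketch under stated assumptions, not the RTT archive format: `pack` and `unpack` are hypothetical helpers, the index entries are plain (position, source, length) tuples, and the match search is a naive dictionary lookup.

```python
def pack(data: bytes, min_len: int = 8) -> tuple[bytes, list[tuple[int, int, int]]]:
    """Split data into a literal field and an index of (pos, src, length)."""
    seen = {}  # first 4 bytes of a fragment -> earliest position
    literals = bytearray()
    index = []
    pos = 0
    while pos < len(data):
        key = data[pos:pos + 4]
        src = seen.get(key)
        length = 0
        if src is not None:
            while pos + length < len(data) and data[src + length] == data[pos + length]:
                length += 1
        if length >= min_len:
            index.append((pos, src, length))  # repeat removed from the text
            pos += length
        else:
            seen.setdefault(key, pos)
            literals.append(data[pos])
            pos += 1
    return bytes(literals), index


def unpack(literals: bytes, index: list[tuple[int, int, int]]) -> bytes:
    """Rebuild the text by re-inserting the indexed repeats."""
    out = bytearray()
    used = 0
    for pos, src, length in index:
        gap = pos - len(out)  # literal bytes that precede this repeat
        out += literals[used:used + gap]
        used += gap
        for k in range(length):
            out.append(out[src + k])
    out += literals[used:]
    return bytes(out)


data = b"ABCDEFGHIJKLMNOP" + b"--" + b"ABCDEFGHIJKLMNOP" + b"!!"
literals, index = pack(data)
assert unpack(literals, index) == data

# Damage one byte of the literal field; the index is kept intact.
damaged = bytearray(literals)
damaged[5] = ord("?")
restored = unpack(bytes(damaged), index)
print([i for i, (a, b) in enumerate(zip(data, restored)) if a != b])  # [5, 23]
```

In this toy format the damaged byte affects only itself and the one place that copies it; everything else is restored. Because the index is small compared with the literal field, storing a second copy of it is cheap.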

Disadvantages of the algorithm

There are no advantages without disadvantages. The index compression method does not compress short repeated sequences. This is a limitation of the index method itself. Indexes are at least 3 bytes in size and can be as large as 12 bytes. If a repetition is smaller than the index that would describe it, it is ignored, no matter how often such repetitions occur in the file being compressed.
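The break-even condition can be written down explicitly. The sketch below assumes the simplest possible cost model (one index replaces one repeat, nothing else is stored); the function names are illustrative.

```python
MIN_INDEX_BYTES = 3   # smallest index size given in the text
MAX_INDEX_BYTES = 12  # largest index size given in the text


def bytes_saved(repeat_len: int, index_bytes: int) -> int:
    """Net saving from replacing one repeat with an index; negative is a loss."""
    if not MIN_INDEX_BYTES <= index_bytes <= MAX_INDEX_BYTES:
        raise ValueError("index size must be between 3 and 12 bytes")
    return repeat_len - index_bytes


def worth_indexing(repeat_len: int, index_bytes: int) -> bool:
    """A repeat is worth indexing only if it is longer than its index."""
    return bytes_saved(repeat_len, index_bytes) > 0


print(worth_indexing(2, 3))    # False: a 2-byte repeat is shorter than any index
print(worth_indexing(16, 12))  # True: saves 4 bytes even with the largest index
```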

The traditional dictionary method compresses multiple short repetitions effectively and therefore achieves a higher compression ratio than index compression. True, this comes at the cost of a high load on the central processor: for the dictionary method to start compressing data better than the index method, it has to reduce its processing speed to 10-20 megabytes per second on real computing hardware at full CPU load.

Such low speeds are unacceptable for modern data storage systems and are of more "academic" than practical interest.

The compression ratio will be increased significantly in the next modification of the RTT algorithm (RTT-Max), which is already in development.

So, as always, to be continued...

Source: www.hab.com
