On a strange method of saving hard disk space

Yet another user wants to write a new chunk of data to the hard drive, but there is not enough free space for it. They do not want to delete anything either, because "everything is very important and necessary." So what do we do with them?

They are not the only one with this problem. Terabytes of information rest on our hard drives, and that amount shows no tendency to shrink. But how unique is it? In the end, all files are just sets of bits of a certain length and, most likely, a new one does not differ much from one that is already stored.

Clearly, searching the hard drive for chunks of information that are already stored is, if not a doomed task, then at least an inefficient one. On the other hand, if the difference is small, you could just adjust things a little...


TL;DR: a second attempt to describe a strange method of optimizing data using JPEG files, this time in a more understandable form.

On bits and difference

If you take two completely random chunks of data, on average half of the bits in them coincide. Indeed, among the possible combinations for each pair of bits ('00, 01, 10, 11'), exactly half have equal values; everything is simple here.
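The claim is easy to check empirically. Below is a minimal sketch that uses only the Python standard library; the function name, the seed and the sample size are arbitrary choices made for illustration.

```python
import random

def matching_fraction(n_bits, seed=0):
    """Fraction of bit positions where two random n_bits-long strings agree."""
    rng = random.Random(seed)
    a = rng.getrandbits(n_bits)
    b = rng.getrandbits(n_bits)
    differing = bin(a ^ b).count("1")  # XOR marks the positions that differ
    return (n_bits - differing) / n_bits

# Exhaustive check for a single pair of bits: 00 and 11 match, 01 and 10 do not.
pairs = [(x, y) for x in (0, 1) for y in (0, 1)]
exact = sum(x == y for x, y in pairs) / len(pairs)

print(exact)
print(matching_fraction(1_000_000))
```

The first number is the exact value for one pair of bits; the second is a sampled estimate that should land very close to it.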

But of course, if we simply take two files and fit one to the other, we lose one of them. If we store the changes, we just reinvent delta encoding, which exists perfectly well without us, although it is not usually used for the same purpose. We could try embedding a smaller sequence into a larger one, but even then we risk losing critical segments of data if we apply this thoughtlessly to everything.

Between what and what, then, can the difference be eliminated? The new file the user is writing is just a sequence of bits, and we cannot do anything with it by itself. So we need to find bits on the hard drive that can be changed without having to store the difference, bits whose loss we can survive without serious consequences. And it makes sense to change not the file on the file system as such, but some less sensitive information inside it. But which information, and how?

Fitting methods

Lossy-compressed files come to the rescue. All those JPEGs, MP3s and the like, even though they are already compressed with loss, contain a lot of bits that can be changed safely. We can use advanced techniques that imperceptibly modify their components at various stages of encoding. Wait. Advanced techniques... imperceptible modification... some bits into others... this is almost steganography!

Indeed, embedding one piece of information into another resembles its methods like nothing else. The imperceptibility of the changes to human senses is also appealing. Where the paths diverge is secrecy: our task comes down to the user adding extra information to their own hard drive, and secrecy would only hurt them. They would just forget about it.

So although we can use these methods, some modifications are needed. Below I will describe and show them using one of the existing methods and a common file format as an example.

On jackals

If we are going to compress, then let it be the most compressible thing in the world. We are talking, of course, about JPEG files. Not only is there a ton of tools and existing methods for embedding data into them, but JPEG is also the most popular image format on the planet.


Nevertheless, to avoid getting into dog breeding, we need to narrow our field of activity within files of this format. Nobody likes the monochrome squares that appear from excessive compression, so we have to limit ourselves to working with an already compressed file and avoid re-encoding. More specifically, we work with the integer coefficients that remain after the operations responsible for data loss, the DCT and quantization, which is shown nicely in the encoding scheme (thanks to the wiki of the Bauman National Library):
[Image: JPEG encoding scheme]

There are many possible methods of optimizing JPEG files. There is lossless optimization (jpegtran), and there is "lossless" optimization that in fact introduces quite a bit of loss, but neither concerns us. After all, if the user is ready to embed one piece of information into another to gain free disk space, then they either optimized their images long ago or do not want to do so at all for fear of losing quality.

F5

A whole family of algorithms fits these conditions; you can get acquainted with it in this good presentation. The most advanced of them is the F5 algorithm by Andreas Westfeld, which works with the coefficients of the luminance component, since the human eye barely notices small changes in them. Moreover, it uses an embedding technique based on matrix coding, so the larger the container used, the fewer changes it needs to embed the same amount of information.
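To make the matrix coding idea concrete, here is a minimal sketch of the (1, n, k) scheme that F5 builds on: k message bits go into a block of n = 2^k - 1 carrier bits, and at most one carrier bit is flipped. Working on plain bit lists rather than on coefficient values is a simplification, and all function names are illustrative; none of this is taken from the f5ar sources.

```python
def syndrome(carrier):
    """XOR of the 1-based positions of all set bits in the carrier block."""
    h = 0
    for position, bit in enumerate(carrier, start=1):
        if bit:
            h ^= position
    return h

def matrix_embed(carrier, message, k):
    """Embed a k-bit message into 2**k - 1 carrier bits, flipping at most one."""
    n = (1 << k) - 1
    if len(carrier) != n or not 0 <= message < (1 << k):
        raise ValueError("need 2**k - 1 carrier bits and a k-bit message")
    out = list(carrier)
    position = syndrome(out) ^ message  # 0 means the block already encodes the message
    if position:
        out[position - 1] ^= 1
    return out

def matrix_extract(carrier):
    """The receiver only recomputes the syndrome of the block."""
    return syndrome(carrier)

def changes_per_embedded_bit(k):
    """Expected flips per message bit, assuming uniformly random messages."""
    return (1 - 2 ** -k) / k

block = [1, 0, 1, 1, 0, 0, 1]
stego = matrix_embed(block, 0b101, k=3)
print(stego, matrix_extract(stego))
for k in range(1, 6):
    print(k, (1 << k) - 1, round(changes_per_embedded_bit(k), 3))
```

The loop at the end prints k, the block length n and the expected number of changes per embedded bit. A larger container allows a larger k for the same payload, which is why the number of changes per bit goes down.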

The changes themselves come down to decreasing the absolute value of coefficients by one under certain conditions (that is, not always), which is what lets us use F5 to optimize data storage on a hard drive. The point is that after such a change a coefficient will most likely take fewer bits after Huffman coding because of the statistical distribution of values in JPEG, and the new zeros will give a gain when they are encoded with RLE.
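The decrement rule can be sketched as follows. This is a simplified illustration, not the f5ar implementation: it works on a flat list of already quantized coefficients, embeds one bit per coefficient without matrix coding, and uses the bit convention from Westfeld's F5 description (positive odd and negative even values carry 1). The helper `magnitude_bits` returns the JPEG size category of a value, that is, the bit length of its absolute value, which determines how many extra bits follow the Huffman code.

```python
def carried_bit(c):
    """Bit represented by a non-zero coefficient: positive odd or negative even means 1."""
    return (c & 1) if c > 0 else 1 - (c & 1)

def magnitude_bits(c):
    """JPEG size category of a coefficient: the bit length of its absolute value."""
    return abs(c).bit_length()

def embed(coeffs, bits):
    """Embed bits by decrementing absolute values only where the carried bit is wrong."""
    out = list(coeffs)
    i = 0
    for bit in bits:
        while True:
            while i < len(out) and out[i] == 0:  # zeros carry no information
                i += 1
            if i == len(out):
                raise ValueError("not enough non-zero coefficients for the message")
            if carried_bit(out[i]) != bit:
                out[i] -= 1 if out[i] > 0 else -1  # move one step towards zero
            i += 1
            if out[i - 1] != 0:
                break  # bit stored; a new zero (shrinkage) means retry on the next coefficient
    return out

def extract(coeffs, n_bits):
    """Read the first n_bits bits from the non-zero coefficients, in order."""
    return [carried_bit(c) for c in coeffs if c != 0][:n_bits]

coeffs = [5, -1, 0, 2, 1, -3, 4, 1, -2, 7]
message = [0, 1, 1, 0]
stego = embed(coeffs, message)
print(stego, extract(stego, len(message)))
print(sum(map(magnitude_bits, coeffs)), sum(map(magnitude_bits, stego)))
```

The last line compares the total number of magnitude bits before and after embedding. Values only ever move towards zero, so this total cannot grow, and every coefficient that shrinks to zero extends a zero run for RLE.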

The required modifications come down to removing the part responsible for secrecy (the password-based permutation), which saves resources and execution time, and adding a mechanism for working with many files instead of one at a time. The reader is unlikely to be interested in the details of the changes, so let us move on to the implementation.

High Tech

To demonstrate the approach, I implemented the method in pure C and made a number of optimizations for both execution speed and memory (you cannot imagine how much these pictures weigh without compression, even before the DCT). Cross-platform support is achieved with a combination of the libraries libjpeg, pcre and tinydir, for which I thank them. All of this is built with 'make', so Windows users who want to try it will need to install something like Cygwin or sort out Visual Studio and the libraries on their own.

The implementation is available as a console utility and as a library. Those interested can read more about using the latter in the readme in the GitHub repository; I will attach the link at the end of the post.

How to use it?

With care. The images used for packing are selected by searching with a regular expression in a given root directory. Once packing is done, the files can be moved, renamed and copied at will within that directory, the file system and operating system can be changed, and so on. However, you must be extremely careful not to change their actual contents in any way. Losing the value of even a single bit can make it impossible to recover the information.

When it finishes, the utility leaves a special archive file containing all the information needed for unpacking, including data about the images used. By itself it weighs on the order of a couple of kilobytes and has no significant effect on the occupied disk space.

You can analyze the possible capacity with the '-a' flag: './f5ar -a [search folder] [Perl-compatible regular expression]'. Packing is done with './f5ar -p [search folder] [Perl-compatible regular expression] [file to pack] [archive name]', and unpacking with './f5ar -u [archive file] [name of the restored file]'.
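To preview which files an expression would pick up before packing, something like the following sketch can help. It is only an approximation under stated assumptions: how f5ar applies the expression (to the file name or to the path, as a search or as a full match) is not described here, so the sketch assumes a full match against the path relative to the root, and Python's `re` is close to but not identical to PCRE. The function name `find_images` is hypothetical.

```python
import os
import re

def find_images(root, pattern):
    """Return paths under root, relative to it, that fully match the pattern."""
    regex = re.compile(pattern)
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            rel = rel.replace(os.sep, "/")  # normalize separators across platforms
            if regex.fullmatch(rel):        # assumption: whole relative path must match
                matches.append(rel)
    return sorted(matches)
```

For example, `find_images("dogs/", r".*jpg")` lists every file below `dogs/` whose relative path ends in `jpg`.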

Demonstration

To show how effective the method is, I downloaded a collection of 225 completely free photos of dogs from Unsplash and dug up in my documents a large 45 MB PDF of the second volume of Knuth's The Art of Computer Programming.

The sequence is quite simple:

$ du -sh knuth.pdf dogs/
44M knuth.pdf
633M dogs/

$ ./f5ar -p dogs/ .*jpg knuth.pdf dogs.f5ar
Reading compressing file... ok
Initializing the archive... ok
Analysing library capacity... done in 17.0s
Detected somewhat guaranteed capacity of 48439359 bytes
Detected possible capacity of upto 102618787 bytes
Compressing... done in 39.4s
Saving the archive... ok

$ ./f5ar -u dogs/dogs.f5ar knuth_unpacked.pdf
Initializing the archive... ok
Reading the archive file... ok
Filling the archive with files... done in 1.4s
Decompressing... done in 21.0s
Writing extracted data... ok

$ sha1sum knuth.pdf knuth_unpacked.pdf
5bd1f496d2e45e382f33959eae5ab15da12cd666 knuth.pdf
5bd1f496d2e45e382f33959eae5ab15da12cd666 knuth_unpacked.pdf

$ du -sh dogs/
551M dogs/

Screenshots for fans

[Screenshot]

The unpacked file can, and should, still be read:

[Screenshot of the unpacked file]

As you can see, from the original 633 + 36 == 669 megabytes of data on the hard drive we arrived at a much nicer 551. Such a radical difference comes from that same decrease in coefficient values, which affects their subsequent lossless compression: a decrease by just one can easily "cut" a couple of bytes off the final file. Nevertheless, this is still data loss, however small, and you will have to put up with it.

Fortunately, the changes are completely invisible to the eye. Under the spoiler (since habrastorage cannot handle large files), the reader can judge the difference both by eye and by its intensity, obtained by subtracting the values of the modified component from the original: the original, the image with information inside, and the difference (the dimmer the color, the smaller the difference in the block).

In lieu of a conclusion

Looking at all these difficulties, buying a hard drive or uploading everything to the cloud may seem like a much simpler solution. But even though we live in such a wonderful time, there is no guarantee that tomorrow you will still be able to go online and upload all your extra data somewhere, or walk into a store and buy yet another thousand-terabyte hard drive. The drives you already have at home, on the other hand, you can always use.

-> GitHub

Source: www.habr.com
