Recovering data from XtraDB tables without a structure file using byte-by-byte analysis of the ibd file


Background

It so happened that a server was attacked by a ransomware virus which, by a "lucky accident", left some of the .ibd files (the raw data files of InnoDB tables) untouched, but at the same time fully encrypted the .frm files (the structure files). In this situation, the .ibd files could be divided into:

  • files that can be restored with standard tools and guides. An excellent article exists for such cases;
  • partially encrypted tables. These are mostly large tables, for which (as I understand it) the attackers did not have enough RAM for full encryption;
  • and, of course, fully encrypted tables, which cannot be restored.

You could determine which category a table belongs to simply by opening it in any text editor with the right encoding (UTF-8 in my case) and looking through the file for readable text fields.


Also, at the beginning of an intact file you can see a large number of zero bytes, and viruses that use a block encryption algorithm (the most common kind) usually affect those bytes as well.
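
The zero-byte observation can be turned into a rough automated check. The sketch below is only an illustration: the function name `zero_fraction` and both sample buffers are made up for this example, and the article does not give a threshold that separates intact files from encrypted ones.

```python
def zero_fraction(data: bytes) -> float:
    """Return the share of zero bytes in `data` (0.0 for empty input)."""
    if not data:
        return 0.0
    return data.count(0) / len(data)


# Synthetic stand-ins for the first 1000 bytes of two files (not real .ibd data).
plain_head = bytes(900) + bytes(range(100))                        # mostly zeros
scrambled_head = bytes((i * 37 + 11) % 256 for i in range(1000))   # no long zero runs

print(zero_fraction(plain_head))      # high share of zeros
print(zero_fraction(scrambled_head))  # very low share of zeros
```

For a real file, the same function could be applied to the first few kilobytes read in binary mode; a visual check in an editor remains the more reliable test.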

In my case, the attackers left a 4-byte sequence (1, 0, 0, 0) at the end of every encrypted file, which simplified the task. To find the uninfected files, a short script was enough:

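import os

MARKER = bytes((1, 0, 0, 0))  # trailer left at the end of every encrypted file

def opened(path):
    # Yield the full path of every regular file directly inside `path`.
    for name in os.listdir(path):
        full_path = os.path.join(path, name)
        if os.path.isfile(full_path):
            yield full_path

for full_path in opened(r"C:\somepath"):  # placeholder directory
    with open(full_path, "rb") as file:
        # Read only the last 4 bytes instead of the whole file.
        file.seek(0, os.SEEK_END)
        file.seek(max(file.tell() - 4, 0))
        tail = file.read()
    # Files that do NOT end with the marker were left untouched.
    if tail != MARKER:
        print(full_path)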

This is how the files of the first type were found. The second type involves a lot of manual work, but what had been found was already enough. Everything would have been fine, but you need to know the exact structure, and (of course) it turned out that I had to work with a table that was changed often. Nobody remembered whether a field type had been changed or a new column had been added.

Unfortunately, Wilds City could not help with a case like this, which is why this article is being written.

Getting to the point

There is a table structure from 3 months ago that does not match the current one (possibly by one field, possibly by more). The table structure:

CREATE TABLE `table_1` (
    `id` INT (11),
    `date` DATETIME ,
    `description` TEXT ,
    `id_point` INT (11),
    `id_user` INT (11),
    `date_start` DATETIME ,
    `date_finish` DATETIME ,
    `photo` INT (1),
    `id_client` INT (11),
    `status` INT (1),
    `lead__time` TIME ,
    `sendstatus` TINYINT (4)
); 

From this table, the following needs to be extracted:

  • id_point int(11);
  • id_user int(11);
  • date_start DATETIME;
  • date_finish DATETIME.

For the recovery, a byte-by-byte analysis of the .ibd file is used, followed by converting the bytes into a more readable form. Since finding what we need only requires analyzing the int and datetime data types, the article describes only those, though it occasionally refers to other data types, which may help in similar incidents.

Problem 1: fields of the DATETIME and TEXT types had NULL values, and those are simply skipped in the file, so in my case it was not possible to determine the structure to restore. The new columns had NULL as their default value, and part of a transaction may be lost because of the setting innodb_flush_log_at_trx_commit = 0, so extra time would have to be spent on determining the structure.

Problem 2: keep in mind that rows deleted with DELETE all remain in the ibd file, but their structure is not updated by ALTER TABLE. As a result, the data structure can vary from the beginning of the file to its end. If you use OPTIMIZE TABLE often, you are unlikely to run into this problem.

Note that the DBMS version affects how the data is stored, and this example may not work for other major versions. In my case, the Windows build of MariaDB 10.1.24 was used. Also, although in MariaDB you work with InnoDB tables, they are in fact XtraDB, which rules out applying this method to MySQL's InnoDB.

File analysis

In Python, the bytes() data type displays printable byte values as characters instead of an ordinary set of numbers. Although you can view the file in this form, for convenience you can turn the bytes into numbers by converting the byte array into an ordinary list (list(example_byte_array)). Either way, both forms are suitable for analysis.
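
A small illustration of the two views of the same data. The sample value is made up for this example; `as_text` is an extra helper form (not from the original) that matches the comma-separated strings used by the functions later in the article.

```python
# A made-up byte sequence: the ASCII word "infimum" followed by four raw bytes.
example_byte_array = b"infimum\x80\x00\x00\x01"

# bytes shows printable values as characters and the rest as \x escapes.
print(example_byte_array)

# Converting to a list gives one integer (0..255) per byte.
as_numbers = list(example_byte_array)
print(as_numbers)

# A comma-separated string of the same numbers.
as_text = ", ".join(str(b) for b in example_byte_array)
print(as_text)
```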

After looking through several ibd files, you can notice two keywords that keep recurring: infimum and supremum.

Moreover, if you split the file on these keywords, you get mostly even blocks of data. We will use infimum as the delimiter.

table = table.split("infimum".encode())
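
A minimal sketch of this step in context, assuming the whole file fits in memory. The helper names `split_blocks` and `read_blocks` and the sample bytes are illustrative and do not come from the original script.

```python
def split_blocks(data: bytes, marker: bytes = b"infimum") -> list:
    """Split raw file contents into blocks on every occurrence of `marker`."""
    return data.split(marker)


def read_blocks(path: str) -> list:
    """Read a file in binary mode and split it into blocks."""
    with open(path, "rb") as f:
        return split_blocks(f.read())


# Synthetic stand-in for file contents (not real InnoDB page data).
raw = b"HEADER" + b"infimum" + b"\x01supremumROW-A" + b"infimum" + b"\x02supremumROW-B"

blocks = split_blocks(raw)
print(len(blocks))   # number of blocks, including the part before the first marker
print(blocks[0])     # everything before the first "infimum"
```

With a real table, `read_blocks` would be called with the path to the .ibd file, and the resulting list plays the role of `table` in the line above.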

An interesting observation: for tables with a small amount of data, there is an indicator of the number of rows in the block between infimum and supremum.

Figure: test table with 1 row

Figure: test table with 2 rows

The list element table[0] can be skipped. After looking through it, I still could not find any raw table data. Most likely, this block is used to store indexes and keys.
Starting with table[1] and converting it into an array of numbers, you can already notice some patterns.

One such pattern is the int values stored in a row. Each value takes 4 bytes. The first byte indicates whether the number is positive or negative; in my case, all numbers are positive. The number itself can be determined from the remaining 3 bytes with the following function. Script:

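def find_int(val: str) -> int:  # example: '128, 1, 2, 3'
    # The first byte (128 here) only marks the sign and is skipped;
    # the remaining three bytes are read as a big-endian number.
    val = [int(v) for v in val.split(", ")]
    result_int = val[1] * 256**2 + val[2] * 256 + val[3]
    return result_int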

For example, 128, 0, 0, 1 = 1, and 128, 0, 75, 108 = 19308.
The table had a primary key with auto-increment, and it can be found in the data as well.
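
The conversion above skips the first byte, which is fine for small positive values. A more general sketch is shown below; it assumes that the column is a signed 4-byte INT stored big-endian with the sign bit inverted, which is consistent with the leading 128 in the examples. The function name `decode_int4` is introduced here for illustration only.

```python
def decode_int4(raw: bytes) -> int:
    """Decode 4 bytes, assuming big-endian storage with the sign bit inverted."""
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes")
    # Subtracting 2**31 undoes the inverted sign bit.
    return int.from_bytes(raw, "big") - 2**31


print(decode_int4(bytes([128, 0, 0, 1])))         # 1
print(decode_int4(bytes([128, 0, 75, 108])))      # 19308
print(decode_int4(bytes([127, 255, 255, 255])))   # -1
```

Under that assumption, the same function also covers negative numbers and values of 2**24 and above, where the first byte is no longer exactly 128.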


After comparing data from test tables, it turned out that a DATETIME value consists of 5 bytes and starts with 153 (most likely this indicates an interval of years). Since the DATETIME range is '1000-01-01' to '9999-12-31', I think the number of bytes may vary, but in my case the data falls within the period from 2016 to 2019, so we will assume that 5 bytes are enough.

To determine the time without seconds, the following functions were written. Script:

# The helpers take byte values of the 5-byte DATETIME {b0, b1, b2, b3, b4}.

def day_(x):  # x = b2
    return x % 64 // 2

def hour_(x1, x2):  # x1 = b2, x2 = b3
    # The lowest bit of b2 is the highest bit of the hour.
    return x2 // 16 + 16 * (x1 % 2)

def min_(x1, x2):  # x1 = b3, x2 = b4
    return (x1 % 16) * 4 + (x2 // 64)

I did not manage to write a working function for the year and month, so I had to hardcode them. Script:

ym_list = {'2016, 1': '153, 152, 64', '2016, 2': '153, 152, 128', 
           '2016, 3': '153, 152, 192', '2016, 4': '153, 153, 0',
           '2016, 5': '153, 153, 64', '2016, 6': '153, 153, 128', 
           '2016, 7': '153, 153, 192', '2016, 8': '153, 154, 0', 
           '2016, 9': '153, 154, 64', '2016, 10': '153, 154, 128', 
           '2016, 11': '153, 154, 192', '2016, 12': '153, 155, 0',
           '2017, 1': '153, 155, 128', '2017, 2': '153, 155, 192', 
           '2017, 3': '153, 156, 0', '2017, 4': '153, 156, 64',
           '2017, 5': '153, 156, 128', '2017, 6': '153, 156, 192',
           '2017, 7': '153, 157, 0', '2017, 8': '153, 157, 64',
           '2017, 9': '153, 157, 128', '2017, 10': '153, 157, 192', 
           '2017, 11': '153, 158, 0', '2017, 12': '153, 158, 64', 
           '2018, 1': '153, 158, 192', '2018, 2': '153, 159, 0',
           '2018, 3': '153, 159, 64', '2018, 4': '153, 159, 128', 
           '2018, 5': '153, 159, 192', '2018, 6': '153, 160, 0',
           '2018, 7': '153, 160, 64', '2018, 8': '153, 160, 128',
           '2018, 9': '153, 160, 192', '2018, 10': '153, 161, 0', 
           '2018, 11': '153, 161, 64', '2018, 12': '153, 161, 128',
           '2019, 1': '153, 162, 0', '2019, 2': '153, 162, 64', 
           '2019, 3': '153, 162, 128', '2019, 4': '153, 162, 192', 
           '2019, 5': '153, 163, 0', '2019, 6': '153, 163, 64',
           '2019, 7': '153, 163, 128', '2019, 8': '153, 163, 192',
           '2019, 9': '153, 164, 0', '2019, 10': '153, 164, 64', 
           '2019, 11': '153, 164, 128', '2019, 12': '153, 164, 192',
           '2020, 1': '153, 165, 64', '2020, 2': '153, 165, 128',
           '2020, 3': '153, 165, 192','2020, 4': '153, 166, 0', 
           '2020, 5': '153, 166, 64', '2020, 6': '153, 166, 128',
           '2020, 7': '153, 166, 192', '2020, 8': '153, 167, 0', 
           '2020, 9': '153, 167, 64','2020, 10': '153, 167, 128',
           '2020, 11': '153, 167, 192', '2020, 12': '153, 168, 0'}

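def year_month(x1, x2):  # x1, x2 = second and third bytes: {x, X, X, x, x}
    for key, value in ym_list.items():
        key = [int(k) for k in key.split(", ")]
        value = [int(v) for v in value.split(", ")]
        # Only the top two bits of the third byte belong to the year and month.
        if x1 == value[1] and x2 // 64 == value[2] // 64:
            return key
    return 0, 0  # not found; datetime() will reject this value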

I am sure that with some time spent on it, this shortcoming can be fixed.
Next, a function that returns a datetime object from a string. Script:

from datetime import datetime

def find_data_time(val: str) -> datetime:
    # val is a string of five byte values separated by ", ".
    val = [int(v) for v in val.split(", ")]
    day = day_(val[2])
    hour = hour_(val[2], val[3])
    minutes = min_(val[3], val[4])
    year, month = year_month(val[1], val[2])
    return datetime(year, month, day, hour, minutes)
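
As an alternative to the hardcoded year and month table, the whole value can be decoded in closed form. The sketch below assumes the packed 5-byte DATETIME layout documented for MySQL 5.6.4 and later (1 sign bit, 17 bits of year * 13 + month, 5 bits of day, 5 bits of hour, 6 bits of minute, 6 bits of second). Whether a particular table uses this layout is an assumption, although the byte values in the lookup table and the day, hour and minute helpers are consistent with it. The name `decode_datetime5` and the sample bytes are illustrative.

```python
from datetime import datetime


def decode_datetime5(raw: bytes) -> datetime:
    """Decode a 5-byte packed DATETIME without fractional seconds."""
    if len(raw) != 5:
        raise ValueError("expected exactly 5 bytes")
    packed = int.from_bytes(raw, "big")
    second = packed & 0x3F                 # lowest 6 bits
    minute = (packed >> 6) & 0x3F          # next 6 bits
    hour = (packed >> 12) & 0x1F           # next 5 bits
    day = (packed >> 17) & 0x1F            # next 5 bits
    year_month = (packed >> 22) & 0x1FFFF  # next 17 bits: year * 13 + month
    year, month = divmod(year_month, 13)
    return datetime(year, month, day, hour, minute, second)


# Made-up sample: the '2018, 6' prefix 153, 160 from the table, plus day and time bits.
print(decode_datetime5(bytes([153, 160, 30, 219, 64])))  # 2018-06-15 13:45:00
```

A value with a zero day or month, or one that does not follow this layout, makes `datetime()` raise ValueError, which is a useful signal that the bytes are not a date.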

I managed to detect frequently repeating values of the form int, int, datetime, datetime, which looks like exactly what is needed. Moreover, such a sequence does not occur twice within one row.

Using a regular expression, we find the required data:

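import re

# int_array must be a string such as "128, 0, 0, 1, ...", e.g. ", ".join(map(str, table[1]))
fined = re.findall(r'128, \d+, \d+, \d+, 128, \d+, \d+, \d+, 153, 1[3-6]\d, \d+, \d+, \d+, 153, 1[3-6]\d, \d+, \d+, \d+', int_array)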

Note that when searching with this expression, it is not possible to detect NULL values in the required fields, but in my case this is not critical. Then we go through what was found in a loop. Script:

result = []
for val in fined:
    pre_result = []
    bd_int = re.findall(r"128, \d+, \d+, \d+", val)
    bd_date = re.findall(r"(153, 1[3-6]\d, \d+, \d+, \d+)", val)
    for it in bd_int:
        # `it` is already the matched string, not an index.
        pre_result.append(find_int(it))
    for bd in bd_date:
        pre_result.append(find_data_time(bd))
    result.append(pre_result)
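
The search and the loop can be tried end to end on synthetic data. The sketch below is self-contained and makes two choices that are not in the original: it uses capture groups, so each field is taken from its own position instead of being searched for a second time inside the match, and it decodes dates in closed form under the assumed packed 5-byte DATETIME layout (year * 13 + month, day, hour, minute) instead of the lookup table. All sample bytes are made up.

```python
import re
from datetime import datetime

ROW = re.compile(
    r"(128, \d+, \d+, \d+), (128, \d+, \d+, \d+), "
    r"(153, 1[3-6]\d, \d+, \d+, \d+), (153, 1[3-6]\d, \d+, \d+, \d+)"
)


def find_int(val: str) -> int:
    b = [int(v) for v in val.split(", ")]
    return b[1] * 256**2 + b[2] * 256 + b[3]


def find_datetime(val: str) -> datetime:
    # Assumed layout: 17 bits year*13+month, 5 bits day, 5 bits hour, 6 bits minute.
    packed = int.from_bytes(bytes(int(v) for v in val.split(", ")), "big")
    year, month = divmod((packed >> 22) & 0x1FFFF, 13)
    return datetime(year, month, (packed >> 17) & 0x1F,
                    (packed >> 12) & 0x1F, (packed >> 6) & 0x3F)


# Synthetic block: junk, int 7, int 19308, two dates, junk.
block = bytes([3, 0, 16,
               128, 0, 0, 7,
               128, 0, 75, 108,
               153, 160, 30, 219, 64,
               153, 160, 30, 224, 0,
               1, 2])
int_array = ", ".join(str(b) for b in block)

result = []
for id_point, id_user, date_start, date_finish in ROW.findall(int_array):
    result.append([find_int(id_point), find_int(id_user),
                   find_datetime(date_start), find_datetime(date_finish)])

print(result)
```

Taking the fields from capture groups avoids a false match when a date byte happens to equal 128 or an int byte happens to equal 153.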

That is, in effect, all: the result list contains the data we need.

P.S. I understand that this method is not suitable for everyone, but the main goal of the article is to prompt action rather than to solve all of your problems. I think the most correct solution would be to start studying the MariaDB source code yourself, but given the limited time, the current method seemed the fastest.

In some cases, after analyzing the file, you will be able to determine the approximate structure and restore it with one of the standard methods from the links above. That will be much more correct and cause fewer problems.

Source: www.habr.com
