Recovering data from XtraDB tables without a structure file using byte-by-byte analysis of the ibd file


Background

It so happened that a server was attacked by a ransomware virus which, by a "lucky accident", left some of the .ibd files (the raw data files of InnoDB tables) untouched while fully encrypting the .frm files (the structure files). In this situation, the .ibd files can be divided into:

  • those that can be restored with standard tools and guides. An excellent article exists for such cases;
  • partially encrypted tables. These are mostly large tables for which (as I understand it) the attackers did not have enough RAM to encrypt them completely;
  • and, well, fully encrypted tables that cannot be recovered.

You can determine which category a table belongs to simply by opening it in any text editor with the required encoding (UTF8 in my case) and looking through the file for readable text fields.

Also, at the beginning of the file you can observe a large number of 0 bytes, and viruses that use a block encryption algorithm (the most common kind) usually affect those bytes as well.

In my case, the attackers left a 4-byte sequence (1, 0, 0, 0) at the end of every encrypted file, which simplified the task. A short script was enough to find the uninfected files:

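import os

def opened(path):
    # Yield the full path of every regular file directly inside `path`.
    for name in os.listdir(path):
        full_path = os.path.join(path, name)
        if os.path.isfile(full_path):
            yield full_path

for full_path in opened(r"C:\some\path"):
    with open(full_path, "rb") as file:
        # Read only the last 4 bytes (or fewer, if the file is shorter).
        file.seek(0, os.SEEK_END)
        file.seek(max(file.tell() - 4, 0))
        tail = file.read()
    # Encrypted files end with the marker (1, 0, 0, 0); print the rest.
    if tail != bytes((1, 0, 0, 0)):
        print(full_path)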

This is how the files belonging to the first category were found. The second category involves a lot of manual work, but what had been found was already enough. Everything would have been fine, but you need to know the exact structure, and (of course) a case came up where I had to work with a table that had been changed frequently. Nobody remembered whether a field type had been changed or a new column had been added.

The depths of the internet, unfortunately, could not help with such a case, which is why this article was written.

Getting to the point

There is a table structure from 3 months ago that does not match the current one (possibly by one field, possibly by more). Table structure:

CREATE TABLE `table_1` (
    `id` INT (11),
    `date` DATETIME ,
    `description` TEXT ,
    `id_point` INT (11),
    `id_user` INT (11),
    `date_start` DATETIME ,
    `date_finish` DATETIME ,
    `photo` INT (1),
    `id_client` INT (11),
    `status` INT (1),
    `lead__time` TIME ,
    `sendstatus` TINYINT (4)
); 

In this case, you need to extract:

  • id_point INT(11);
  • id_user INT(11);
  • date_start DATETIME;
  • date_finish DATETIME.

For recovery, a byte-by-byte analysis of the .ibd file is used, followed by converting the bytes into a more readable form. Since finding what we need only requires analyzing the int and datetime data types, the article describes only those, but it occasionally refers to other data types as well, which may help in other similar incidents.

Problem 1: fields of type DATETIME and TEXT contained NULL values, and these are simply skipped in the file; because of this, it was not possible to determine the structure to restore in my case. In the new columns the default value was NULL, and part of a transaction could have been lost because of the setting innodb_flush_log_at_trx_commit = 0, so additional time would have had to be spent determining the structure.

Problem 2: keep in mind that rows deleted with DELETE all remain in the ibd file, but their structure is not updated by ALTER TABLE. As a result, the data structure can vary from the beginning of the file to its end. If you use OPTIMIZE TABLE often, you are unlikely to run into this problem.

Note: the DBMS version affects how data is stored, and this example may not work for other major versions. In my case, the Windows version of MariaDB 10.1.24 was used. Also, although in MariaDB you work with InnoDB tables, they are in fact XtraDB, which rules out using this method as-is with MySQL's InnoDB.

File analysis

In Python, the bytes() data type displays printable ASCII characters in place of the corresponding numeric values. Although you can view the file in this form, for convenience you can convert the bytes to numeric form by turning the byte array into a regular list (list(example_byte_array)). In any case, both views are suitable for analysis.
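As a small illustration of the two views described above, the sketch below uses a made-up byte string (not taken from a real .ibd file) and shows it both in its bytes form and as a list of integers.

```python
# A made-up sample; real data would come from open(path, "rb").read().
example_byte_array = bytes([128, 0, 75, 108, 105, 110, 102])

# Printable ASCII bytes are shown as characters, the rest as \x escapes.
as_bytes_repr = repr(example_byte_array)

# list() turns the same data into plain integers in the range 0..255.
as_numbers = list(example_byte_array)

print(as_bytes_repr)
print(as_numbers)
```

The numeric view makes fixed-width fields easier to spot, while the bytes view makes embedded text easier to spot.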

After looking through several ibd files, you can see that the keywords infimum and supremum recur throughout them.

Moreover, if you split the file on these keywords, you get mostly even blocks of data. We will use infimum as the divisor.

table = table.split("infimum".encode())
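To make the splitting step concrete, here is a minimal sketch on synthetic data. The helper name split_pages and the sample bytes are assumptions made for illustration; a real run would read the contents of an actual .ibd file instead.

```python
def split_pages(raw: bytes, marker: bytes = b"infimum") -> list:
    """Split raw file contents on the marker and return the chunks."""
    return raw.split(marker)

# Synthetic stand-in for file contents (not real .ibd data).
raw = (
    b"\x00\x00header"
    + b"infimum" + b"\x01supremum\x80\x00\x00\x01"
    + b"infimum" + b"\x02supremum\x80\x00\x00\x02"
)

table = split_pages(raw)
print(len(table))       # number of chunks, including the leading chunk
print(list(table[1]))   # second chunk shown as plain integers
```

Note that bytes.split removes the marker itself, so every chunk after the first starts with whatever followed infimum in the file.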

An interesting observation: for tables with a small amount of data, a pointer to the number of rows in the block sits between infimum and supremum.

[Figure: test table with 1 row]

[Figure: test table with 2 rows]

The array element table[0] can be skipped. After looking through it, I still could not find any raw table data. Most likely, this block is used to store indexes and keys.
Starting with table[1] and converting it into a numeric array, you can already notice some patterns, namely recurring groups of four numbers that begin with 128.

These are int values stored in the row. The first byte indicates whether the number is positive or negative. In my case, all the numbers were positive. From the remaining 3 bytes, you can determine the number using the following function. Script:

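def find_int(val: str) -> int:  # example: '128, 1, 2, 3'
    # The first number (128) marks a positive value; the remaining
    # three bytes hold the value with the most significant byte first.
    val = [int(v) for v in val.split(", ")]
    return val[1] * 256**2 + val[2] * 256 + val[3]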

For example, 128, 0, 0, 1 = 1, and 128, 0, 75, 108 = 19308.
The table has a primary key with auto-increment, and it can also be found in this data.
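The same arithmetic can also be written with int.from_bytes. The sketch below additionally assumes that a 4-byte INT is stored big-endian with its sign bit inverted. That is consistent with the leading 128 seen in the examples, but the article does not check it against negative values. The name decode_int is hypothetical.

```python
def decode_int(raw: bytes) -> int:
    """Decode 4 bytes, assuming big-endian order with a flipped sign bit."""
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes")
    unsigned = int.from_bytes(raw, byteorder="big")
    # Undo the assumed sign-bit flip: 0x80000000 maps to 0.
    return unsigned - 0x80000000

print(decode_int(bytes([128, 0, 0, 1])))
print(decode_int(bytes([128, 0, 75, 108])))
```

Unlike the three-byte formula, this variant also uses the low seven bits of the first byte, so under the same assumption it would still work for values of 2**24 and above.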


Comparing the data from the test tables showed that a DATETIME value consists of 5 bytes and begins with 153 (most likely a marker of the year range). Since the DATETIME range is '1000-01-01' to '9999-12-31', I think the number of bytes may vary, but in my case the data falls in the period from 2016 to 2019, so we will assume that 5 bytes are enough.

To determine the time without seconds, the following functions were written. Script:

day_ = lambda x: x % 64 // 2  # {x,x,X,x,x }

def hour_(x1, x2):  # {x,x,X1,X2,x}
    # The lowest bit of x1 is the high bit of the hour (worth 16);
    # the upper four bits of x2 hold the rest of the hour.
    return x2 // 16 + 16 * (x1 % 2)

min_ = lambda x1, x2: (x1 % 16) * 4 + (x2 // 64)  # {x,x,x,X1,X2}

I did not manage to write a working function for the year and month, so I had to hack it with a lookup table. Script:

ym_list = {'2016, 1': '153, 152, 64', '2016, 2': '153, 152, 128', 
           '2016, 3': '153, 152, 192', '2016, 4': '153, 153, 0',
           '2016, 5': '153, 153, 64', '2016, 6': '153, 153, 128', 
           '2016, 7': '153, 153, 192', '2016, 8': '153, 154, 0', 
           '2016, 9': '153, 154, 64', '2016, 10': '153, 154, 128', 
           '2016, 11': '153, 154, 192', '2016, 12': '153, 155, 0',
           '2017, 1': '153, 155, 128', '2017, 2': '153, 155, 192', 
           '2017, 3': '153, 156, 0', '2017, 4': '153, 156, 64',
           '2017, 5': '153, 156, 128', '2017, 6': '153, 156, 192',
           '2017, 7': '153, 157, 0', '2017, 8': '153, 157, 64',
           '2017, 9': '153, 157, 128', '2017, 10': '153, 157, 192', 
           '2017, 11': '153, 158, 0', '2017, 12': '153, 158, 64', 
           '2018, 1': '153, 158, 192', '2018, 2': '153, 159, 0',
           '2018, 3': '153, 159, 64', '2018, 4': '153, 159, 128', 
           '2018, 5': '153, 159, 192', '2018, 6': '153, 160, 0',
           '2018, 7': '153, 160, 64', '2018, 8': '153, 160, 128',
           '2018, 9': '153, 160, 192', '2018, 10': '153, 161, 0', 
           '2018, 11': '153, 161, 64', '2018, 12': '153, 161, 128',
           '2019, 1': '153, 162, 0', '2019, 2': '153, 162, 64', 
           '2019, 3': '153, 162, 128', '2019, 4': '153, 162, 192', 
           '2019, 5': '153, 163, 0', '2019, 6': '153, 163, 64',
           '2019, 7': '153, 163, 128', '2019, 8': '153, 163, 192',
           '2019, 9': '153, 164, 0', '2019, 10': '153, 164, 64', 
           '2019, 11': '153, 164, 128', '2019, 12': '153, 164, 192',
           '2020, 1': '153, 165, 64', '2020, 2': '153, 165, 128',
           '2020, 3': '153, 165, 192','2020, 4': '153, 166, 0', 
           '2020, 5': '153, 166, 64', '2020, 6': '153, 166, 128',
           '2020, 7': '153, 166, 192', '2020, 8': '153, 167, 0', 
           '2020, 9': '153, 167, 64','2020, 10': '153, 167, 128',
           '2020, 11': '153, 167, 192', '2020, 12': '153, 168, 0'}

def year_month(x1, x2):  # {x,X,X,x,x }
    # Look up (year, month) by the second byte and the top two bits of the third.
    for key, value in ym_list.items():
        year, month = (int(k) for k in key.split(", "))
        value = [int(v) for v in value.split(", ")]
        if x1 == value[1] and x2 // 64 == value[2] // 64:
            return year, month
    return 0, 0

I am sure that, with more time, this workaround could be replaced with a proper calculation.
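One possible way to replace the lookup table with arithmetic is offered here as a sketch, not as a verified description of the storage format. The values in ym_list are consistent with a layout in which, after one leading flag bit, the next 17 bits hold year * 13 + month, followed by 5 bits for the day, 5 for the hour, 6 for the minute and 6 for the second. That layout is an assumption here (it matches the packed DATETIME format described for MySQL 5.6.4 and later), and decode_datetime is a hypothetical name.

```python
from datetime import datetime

def decode_datetime(raw: bytes) -> datetime:
    """Decode 5 bytes under the assumed packed layout described above."""
    if len(raw) != 5:
        raise ValueError("expected exactly 5 bytes")
    packed = int.from_bytes(raw, byteorder="big")   # 40 bits in total
    second = packed & 0x3F                          # lowest 6 bits
    minute = (packed >> 6) & 0x3F                   # next 6 bits
    hour = (packed >> 12) & 0x1F                    # next 5 bits
    day = (packed >> 17) & 0x1F                     # next 5 bits
    year_month = (packed >> 22) & 0x1FFFF           # next 17 bits
    year, month = divmod(year_month, 13)
    return datetime(year, month, day, hour, minute, second)

# First bytes follow the '2016, 1' entry of ym_list (153, 152, 64),
# with day 1 and time 00:00:00 encoded in the remaining bits.
print(decode_datetime(bytes([153, 152, 66, 0, 0])))
```

If the assumption holds, this also recovers the seconds and is not limited to the years listed in the table. It is worth checking against a few known rows before relying on it.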
Next comes a function that returns a datetime object from a string. Script:

from datetime import datetime

def find_data_time(val: str):
    # val is a string of five comma-separated byte values.
    val = [int(v) for v in val.split(", ")]
    day = day_(val[2])
    hour = hour_(val[2], val[3])
    minutes = min_(val[3], val[4])
    year, month = year_month(val[1], val[2])
    return datetime(year, month, day, hour, minutes)

We managed to detect frequently repeating values in the order int, int, datetime, datetime, which looks like exactly what we need. Moreover, such a sequence does not occur twice within one row.

Using a regular expression over the numeric data rendered as a string of comma-separated numbers (int_array), we find the necessary data:

import re
fined = re.findall(r'128, \d+, \d+, \d+, 128, \d+, \d+, \d+, 153, 1[3-6]\d, \d+, \d+, \d+, 153, 1[3-6]\d, \d+, \d+, \d+', int_array)

Note that when searching with this expression, it is not possible to detect NULL values in the required fields, but in my case this is not critical. Then we go through what was found in a loop. Script:

result = []
for val in fined:
    pre_result = []
    bd_int = re.findall(r"128, \d+, \d+, \d+", val)
    bd_date = re.findall(r"(153, 1[3-6]\d, \d+, \d+, \d+)", val)
    for it in bd_int:
        # `it` is already the matched string, so pass it directly.
        pre_result.append(find_int(it))
    for bd in bd_date:
        pre_result.append(find_data_time(bd))
    result.append(pre_result)
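A second pass with re.findall over each match could, in principle, pick up a 128 or 153 that happens to sit inside a neighbouring field. One alternative is to let a single pattern with capture groups separate the fields directly. The sketch below runs on a synthetic numeric string. ROW_PATTERN and extract_fields are hypothetical names, and the fields are returned as raw strings so they can still be passed to find_int and find_data_time.

```python
import re

INT_FIELD = r"(128, \d+, \d+, \d+)"
DATE_FIELD = r"(153, 1[3-6]\d, \d+, \d+, \d+)"
ROW_PATTERN = re.compile(", ".join([INT_FIELD, INT_FIELD, DATE_FIELD, DATE_FIELD]))

def extract_fields(int_array: str) -> list:
    """Return one (int, int, datetime, datetime) tuple of raw strings per match."""
    return ROW_PATTERN.findall(int_array)

# Synthetic data: the bytes of one made-up row surrounded by filler values.
row = [128, 0, 0, 7, 128, 0, 0, 9, 153, 160, 66, 0, 0, 153, 160, 68, 16, 64]
int_array = ", ".join(str(b) for b in [0, 5] + row + [255, 1])

for fields in extract_fields(int_array):
    print(fields)
```

With four groups in the pattern, re.findall returns a list of 4-tuples, so each field keeps its position even when its bytes resemble another field's prefix.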

In fact, that is all: the data in the result array is the data we need.

P.S.
I understand that this method will not suit everyone, but the main purpose of the article is to prompt action rather than to solve all of your problems. I think the most correct solution would be to study the MariaDB source code yourself, but given the limited time, the current method seemed the fastest.

In some cases, after analyzing the file, you will be able to determine the approximate structure and restore the data using one of the standard methods from the links above. This will be much more correct and cause fewer problems.

Source: www.hab.com
