In my work I often come across new technical solutions and software products about which there is very little information on the Russian-speaking Internet. With this article I will try to fill one such gap with an example from my recent practice, when I needed to set up delivery of CDC events from two popular DBMSs (PostgreSQL and MongoDB) to a Kafka cluster using Debezium. I hope this overview, which grew out of that work, will be useful to others.
What are Debezium and CDC?
Debezium is a representative of the CDC (Change Data Capture) software category or, more precisely, a set of connectors for various DBMSs that are compatible with the Apache Kafka Connect framework.
It is an open source project, licensed under the Apache License v2.0 and sponsored by Red Hat. Development has been under way since 2016, and at the moment it officially supports the following DBMSs: MySQL, PostgreSQL, MongoDB, SQL Server. There are also connectors for Cassandra and Oracle, but for now they are in "early access" status, and new releases do not guarantee backward compatibility.
Compared with the traditional approach (where the application reads data from the DBMS directly), the main advantages of CDC are row-level streaming of data changes with low latency, high reliability, and high availability. The last two points are achieved by using a Kafka cluster as the store for CDC events.
In practice, there is more to it: populating a Data Lake (the final link in the diagram above) is not the only way to use Debezium. Events sent to Apache Kafka can be consumed by your own applications to handle a variety of situations. For example:
removing stale data from a cache;
sending notifications;
updating search indexes;
keeping some kind of audit log;
...
If you have a Java application and there is no need or opportunity to use a Kafka cluster, you can also work through the embedded connector. The obvious advantage is that you can do without additional infrastructure (the connector and Kafka). However, this solution has been deprecated since version 1.1 and is no longer recommended (it may be removed in future releases).
This article covers the architecture recommended by the developers, which provides fault tolerance and scalability.
Here and below, all configuration examples are considered in the context of the Docker image distributed by the Debezium developers. It contains all the necessary plugin files (connectors) and lets you configure Kafka Connect through environment variables.
If you intend to use Kafka Connect from Confluent, you will need to add the plugins for the required connectors yourself, either to the directory specified in plugin.path or to the path set through the CLASSPATH environment variable. Settings for the Kafka Connect worker and for the connectors are defined in configuration files that are passed as arguments to the worker start command. See the documentation for details.
By default, Debezium writes data in JSON format, which is acceptable for sandboxes and small amounts of data but can become a problem with heavily loaded databases. An alternative to the JSON converter is to serialize messages with Avro into a binary format, which reduces the load on the I/O subsystem in Apache Kafka.
Once configured, the connector works on a fairly simple principle:
On first start, it connects to the database specified in the configuration and runs in initial snapshot mode, sending to Kafka the initial data set obtained with what is essentially SELECT * FROM table_name.
Once initialization is complete, the connector switches to reading changes from the PostgreSQL WAL files.
About the options used:
name - the name of the connector to which the configuration described below applies; later this name is used to work with the connector (i.e. to check its status, restart it, or update its configuration) through the Kafka Connect REST API;
connector.class - the DBMS connector class that the configured connector will use;
plugin.name - the name of the plugin for logical decoding of data from the WAL files. The available choices are wal2json, decoderbufs, and pgoutput. The first two require installing the corresponding extensions in the DBMS, while pgoutput needs no additional steps on PostgreSQL 10 and above;
database.* - options for connecting to the database, where database.server.name is the name of the PostgreSQL instance used to form the topic name in the Kafka cluster;
table.include.list - the list of tables whose changes we want to track; given in the form schema.table_name; cannot be used together with table.exclude.list.
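The options listed above can be put together roughly as follows. This is a minimal sketch in Python, not the author's actual configuration: the host, user, password, database, table names, and the connector name `pg-connector` are all hypothetical, and the `database.*` sub-keys are the ones I would expect for the Debezium 1.x PostgreSQL connector.

```python
import json


def build_postgres_connector(name, host, dbname, tables):
    """Build the JSON document that describes one connector for Kafka Connect."""
    config = {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "plugin.name": "pgoutput",          # needs no extra extension on PostgreSQL 10+
        "database.hostname": host,
        "database.port": "5432",
        "database.user": "debezium",        # hypothetical account
        "database.password": "change-me",   # hypothetical secret
        "database.dbname": dbname,
        "database.server.name": "pg-dev",   # becomes the topic name prefix
        "table.include.list": ",".join(tables),
    }
    return {"name": name, "config": config}


connector = build_postgres_connector(
    "pg-connector",                 # hypothetical connector name
    "postgres.example.local",       # hypothetical host
    "dbname",
    ["public.orders", "public.customers"],  # hypothetical tables
)
print(json.dumps(connector, indent=2))
```

Keeping the configuration in one function makes it easy to generate near-identical connectors for several databases by changing only the arguments.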
When the connector's position in the replication slot is not advanced for a long time, the WAL files get "stuck" on disk and may eventually use up all the disk space.
This is where the heartbeat.interval.ms and heartbeat.action.query options come to the rescue. Used together, they make it possible to run a query that modifies data in a separate table every time a heartbeat message is sent. As a result, the LSN at which the connector currently sits (in the replication slot) is updated regularly, which allows the DBMS to remove WAL files that are no longer needed. For more on how these options work, see the documentation.
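To make the pairing of the two heartbeat options concrete, here is a small sketch that adds them to an existing connector configuration. The table `public.debezium_heartbeat`, its columns, and the 10-second interval are assumptions chosen for illustration; only the two option names come from the text above.

```python
def with_heartbeat(config, interval_ms, action_query):
    """Return a copy of a connector config with both heartbeat options set."""
    updated = dict(config)  # leave the caller's dictionary untouched
    updated["heartbeat.interval.ms"] = str(interval_ms)
    updated["heartbeat.action.query"] = action_query
    return updated


base_config = {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
}

# Hypothetical single-row table that exists only to be touched by the heartbeat.
heartbeat_query = (
    "INSERT INTO public.debezium_heartbeat (id, ts) VALUES (1, now()) "
    "ON CONFLICT (id) DO UPDATE SET ts = EXCLUDED.ts"
)

heartbeat_config = with_heartbeat(base_config, 10_000, heartbeat_query)
```

The query writes to a table of its own so that the heartbeat generates WAL traffic even when the tracked tables are idle.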
Another option that deserves closer attention is transforms, although it is more about convenience and tidiness...
By default, Debezium creates topics using the following naming policy: serverName.schemaName.tableName. This is not always convenient. With the transforms options you can use regular expressions to define a list of tables whose events should be routed to a topic with a specific name.
In our configuration, thanks to transforms, the following happens: all CDC events from the tracked database go to a topic named data.cdc.dbname. Otherwise (without these settings), Debezium would by default create a topic for each table, of the form pg-dev.public.<table_name>.
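The routing rule described above can be sketched as follows. The text does not show which transformation class was used, so this assumes Kafka Connect's built-in `RegexRouter`; the alias `AddPrefix` and the exact regular expression are hypothetical. The `route_topic` helper only mimics the renaming locally and is not part of any library.

```python
import re

# Assumed connector options; the alias "AddPrefix" is an arbitrary label.
TRANSFORMS = {
    "transforms": "AddPrefix",
    "transforms.AddPrefix.type": "org.apache.kafka.connect.transforms.RegexRouter",
    "transforms.AddPrefix.regex": r"pg-dev\.public\.(.*)",
    "transforms.AddPrefix.replacement": "data.cdc.dbname",
}


def route_topic(topic, regex, replacement):
    """Rename a topic only when the whole name matches the pattern.

    This simplified version handles a constant replacement only and does not
    expand capture-group references.
    """
    if re.fullmatch(regex, topic):
        return replacement
    return topic


regex = TRANSFORMS["transforms.AddPrefix.regex"]
replacement = TRANSFORMS["transforms.AddPrefix.replacement"]

for original in ("pg-dev.public.orders", "pg-dev.public.customers", "some.other.topic"):
    print(original, "->", route_topic(original, regex, replacement))
```

Both table topics collapse into a single topic, while names that do not match the pattern pass through unchanged.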
Connector limitations
To round off the description of the PostgreSQL connector configuration, it is worth mentioning the following features and limitations of how it works:
The PostgreSQL connector relies on the concept of logical decoding. It therefore does not track requests that change the database structure (DDL), and such changes will not appear in the topics.
Because replication slots are used, the connector can connect only to the master DBMS instance.
If the user under which the connector connects to the database has read-only rights, then before the first start you must manually create the replication slot and the publication in the database.
Applying the configuration
So, let's load our configuration into the connector:
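As a rough illustration of this step, the sketch below prepares the HTTP request that registers a connector through the Kafka Connect REST API (`POST /connectors`). The address `http://localhost:8083` and the connector name are assumptions; the request is only built here, not sent.

```python
import json
import urllib.request


def build_create_request(connect_url, name, config):
    """Prepare (but do not send) the request that creates a connector."""
    body = json.dumps({"name": name, "config": config}).encode("utf-8")
    return urllib.request.Request(
        url=f"{connect_url.rstrip('/')}/connectors",
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


request = build_create_request(
    "http://localhost:8083",  # hypothetical Kafka Connect worker address
    "pg-connector",           # hypothetical connector name
    {"connector.class": "io.debezium.connector.postgresql.PostgresConnector"},
)
print(request.get_method(), request.full_url)

# To actually submit it against a running worker:
# with urllib.request.urlopen(request) as response:
#     print(response.status, response.read().decode("utf-8"))
```

Separating request construction from sending keeps the example runnable without a live Kafka Connect instance.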
In both cases, a record consists of the key (PK) of the changed row and the substance of the change itself: what the row looked like before and what it became after.
In the case of INSERT: the previous value (before) is null, followed by the row that was inserted.
In the case of UPDATE: payload.before shows the previous state of the row, and payload.after shows the new state with the substance of the change.
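A consumer can tell these cases apart by looking only at `payload.before` and `payload.after`. The sketch below works on hand-written sample events; the field names `id` and `name` are made up, and real events carry more metadata than shown here.

```python
def describe_change(event):
    """Classify a change event and return (kind, details)."""
    payload = event["payload"]
    before = payload.get("before")
    after = payload.get("after")
    if before is None and after is None:
        return ("unknown", {})
    if before is None:
        return ("insert", after)      # nothing existed before: a new row
    if after is None:
        return ("delete", before)     # nothing remains after: a removed row
    # Both states present: report only the columns whose values differ.
    changed = {
        column: (before.get(column), value)
        for column, value in after.items()
        if before.get(column) != value
    }
    return ("update", changed)


# Hypothetical, heavily trimmed sample events.
insert_event = {"payload": {"before": None, "after": {"id": 1, "name": "a"}}}
update_event = {"payload": {"before": {"id": 1, "name": "a"},
                            "after": {"id": 1, "name": "b"}}}

print(describe_change(insert_event))
print(describe_change(update_event))
```

Reducing an update to the changed columns is convenient for audit logs and notifications, where the unchanged fields are usually noise.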
2.2 MongoDB
This connector uses the standard MongoDB replication mechanism, reading information from the oplog of the DBMS primary node.
As with the PostgreSQL connector described earlier, an initial snapshot of the data is taken on first start, after which the connector switches to oplog reading mode.
As you can see, there are no new options compared with the previous example; only the number of options responsible for connecting to the database, and their prefixes, has been reduced.
This time the transforms settings do the following: they convert the target topic name from the scheme <server_name>.<db_name>.<collection_name> to data.cdc.mongo_<db_name>.
Fault tolerance
Fault tolerance and high availability are more pressing today than ever, especially when it comes to data and transactions, and data change tracking is no exception. Let's look at what can go wrong in principle and what happens to Debezium in each case.
Loss of connectivity with the Kafka cluster. The connector simply stops reading at the position it failed to send to Kafka and periodically retries sending until an attempt succeeds.
Data source unavailable. The connector will try to reconnect to the source according to its configuration. The default is 16 attempts using exponential backoff. After the 16th failed attempt, the task is marked as failed and has to be restarted manually through the Kafka Connect REST interface.
In the case of PostgreSQL, data will not be lost, because the use of replication slots prevents the removal of WAL files that the connector has not yet read. There is a downside, however: if network connectivity between the connector and the DBMS is lost for a long time, the disk may run out of space, which can bring down the entire DBMS.
In the case of MySQL, binlog files may be rotated by the DBMS itself before connectivity is restored. This puts the connector into the failed state, and to restore normal operation it must be restarted in initial snapshot mode before it can continue reading from the binlogs.
For MongoDB, the documentation says that the connector's behavior when log/oplog files have been removed, and the connector cannot continue reading from the position where it stopped, is the same for all DBMSs. That is, the connector enters the failed state and has to be restarted in initial snapshot mode.
However, there are exceptions. If the connector was disconnected for a long time (or could not reach the MongoDB instance) and the oplog was rotated during that time, then once the connection is restored the connector will quietly continue reading from the first available position, so some of the data will never reach Kafka.
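The retry policy mentioned above for an unavailable data source (16 attempts with exponential backoff) can be illustrated with a short calculation. The 1-second initial delay and the 120-second cap are assumptions made for this sketch; check the connector documentation for the values that apply to your version.

```python
def backoff_delays(max_attempts=16, initial_delay=1.0, max_delay=120.0):
    """Return the wait (in seconds) before each reconnection attempt."""
    delays = []
    delay = initial_delay
    for _ in range(max_attempts):
        delays.append(min(delay, max_delay))  # never wait longer than the cap
        delay *= 2                            # double the wait after every failure
    return delays


delays = backoff_delays()
print(delays)
print("total seconds before the task is marked as failed:", sum(delays))
```

Under these assumed values the connector keeps retrying for roughly twenty minutes, which gives a sense of how long an outage can last before manual intervention is needed.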
Conclusion
Debezium is my first experience with CDC systems, and overall it has been very positive. The project won me over with its support for the major DBMSs, ease of configuration, clustering support, and an active community. For those interested in hands-on practice, I recommend reading the guides for Kafka Connect and Debezium.
Compared with the JDBC connector for Kafka Connect, the main advantage of Debezium is that changes are read from the DBMS logs, which allows data to be received with minimal delay. The JDBC connector (available for Kafka Connect) polls the tracked table at a fixed interval and, for the same reason, does not generate messages when data is deleted (how can you query data that is not there?).
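The limitation of polling can be reproduced in a few lines. The sketch below uses an in-memory SQLite database and a hypothetical `orders` table with an `updated_at` column as the polling watermark; it imitates the general polling approach, not the JDBC connector's actual implementation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, updated_at INTEGER)")
conn.execute("INSERT INTO orders VALUES (1, 'new', 100), (2, 'new', 101)")


def poll(connection, last_seen):
    """Return rows changed since the watermark, plus the new watermark."""
    rows = connection.execute(
        "SELECT id, status, updated_at FROM orders "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_seen,),
    ).fetchall()
    new_last_seen = rows[-1][2] if rows else last_seen
    return rows, new_last_seen


first_batch, watermark = poll(conn, 0)

# Between two polls, one row is updated and another one is deleted.
conn.execute("UPDATE orders SET status = 'paid', updated_at = 102 WHERE id = 1")
conn.execute("DELETE FROM orders WHERE id = 2")

second_batch, watermark = poll(conn, watermark)
print(first_batch)
print(second_batch)
```

The second poll reports the update to row 1 but carries no trace of row 2 being deleted, whereas a log-based reader would have received an explicit delete event.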