Using low-code in analytical platforms

Dear readers, good day!

Sooner or later, the task of building IT platforms for collecting and analyzing data faces any company whose business rests on an intellectually demanding service model or on technically complex products. Building analytical platforms is complex and time-consuming. However, any task can be simplified. In this article I want to share my experience of using low-code tools to help build analytical solutions. This experience was gained over a number of projects in the Big Data Solutions practice of Neoflex. Since 2005, Neoflex's Big Data Solutions practice has been building data warehouses and data lakes, optimizing the speed of information processing, and working on a methodology for data quality management.

Nobody can avoid the deliberate accumulation of weakly and/or strongly structured data, perhaps not even small businesses. After all, when scaling a business, a promising entrepreneur will have to develop a loyalty program, will want to analyze how points of sale are performing, will think about targeted advertising, and will wonder about the demand for accompanying products. To a first approximation, the problem can be solved ad hoc, "on the knee". But as the business grows, arriving at an analytical platform is inevitable.

At what point, though, do data analytics tasks grow into "Rocket Science" class problems? Perhaps at the moment when we start talking about really big data.
To make Rocket Science easier, you can eat the elephant piece by piece.

The more discrete and autonomous your applications / services / microservices are, the easier it will be for you, your colleagues and the whole business to digest the elephant.

Almost all of our clients arrived at this postulate after rebuilding their landscape around the engineering practices of DevOps teams.

But even with a "separate, elephant" diet, there is a good chance of "oversaturating" the IT landscape. At that point it is worth stopping, exhaling and looking toward a low-code engineering platform.

Many developers are frightened by the prospect of a career dead end when they move from writing code directly to "dragging" arrows in the UI of low-code systems. But the arrival of machine tools did not make engineers disappear; it took their work to a new level!

Let's see why.

Data analysis in logistics, the telecom industry, media research and the financial sector always comes with the following questions:

  • Speed of automated analysis;
  • The ability to run experiments without affecting the main data production flow;
  • Reliability of the prepared data;
  • Change tracking and versioning;
  • Data provenance, data lineage, CDC;
  • Fast delivery of new features to the production environment;
  • And the notorious one: the cost of development and support.

In other words, engineers have a huge number of high-level tasks, which can be handled efficiently enough only once their attention is freed from low-level development tasks.

The prerequisites for developers moving to a new level were the evolution and digitalization of business. The value of a developer is changing too: there is a significant shortage of developers who can immerse themselves in the concepts of the business being automated.

Let's draw an analogy with low-level and high-level programming languages. The transition from low-level to high-level languages is a transition from writing "direct directives in the language of the hardware" to "directives in the language of people", that is, adding a layer of abstraction. By the same logic, the transition from high-level programming languages to low-code platforms is a transition from "directives in the language of people" to "directives in the language of business". If there are developers who are saddened by this, they have probably been sad ever since JavaScript was born, with its built-in array sorting functions. Those functions, of course, have their own implementation under the hood, written by other means of the same high-level programming.

So low-code is simply the appearance of one more level of abstraction.

Applied experience of using low-code

The topic of low-code is quite broad, but here I would like to talk about the practical application of "low-code concepts" using one of our projects as an example.

The Big Data Solutions division of Neoflex specializes mostly in the financial sector, building data warehouses and data lakes and automating various kinds of reporting. In this niche, low-code has long been a standard. Among low-code tools we can mention tools for organizing ETL processes: Informatica Power Center, IBM Datastage, Pentaho Data Integration. Or Oracle Apex, which serves as an environment for rapid development of interfaces for accessing and editing data. However, using low-code development tools does not always mean building narrowly targeted applications on a commercial technology stack with a pronounced dependence on the vendor.

With low-code platforms you can also orchestrate data flows, create data science platforms or, for example, build modules for checking data quality.

One applied example of using low-code development tools is the collaboration between Neoflex and Mediascope, one of the leaders of the Russian media research market. One of this company's business goals is to produce the data on which advertisers, Internet platforms, TV channels, radio stations, advertising agencies and brands base their decisions about buying advertising and planning their marketing communications.

Media research is a technology-heavy area of business. Recognizing video sequences, collecting data from devices that analyze viewing, measuring activity on web resources - all of this implies that the company has a large IT staff and enormous experience in building analytical solutions. But the exponential growth in the amount of information and in the number and variety of its sources forces the data IT industry to progress constantly. The simplest way to scale the already functioning Mediascope analytical platform would be to increase the IT staff. A much more effective way is to speed up the development process. One of the steps in that direction can be the use of low-code platforms.

When the project started, the company already had a working product solution. However, its implementation on MSSQL could not fully meet expectations for scaling functionality while keeping the cost of development acceptable.

The task before us was truly ambitious: Neoflex and Mediascope had to create an industrial solution in less than a year, with an MVP released within the first quarter after the start date.

The Hadoop technology stack was chosen as the foundation for the new data platform based on low-code computation. HDFS became the storage standard, with data kept in Parquet files. Hive was used to access the data in the platform; all available data marts are exposed in it as external tables. Loading data into the storage was implemented with Kafka and Apache NiFi.

In this concept, the low-code tool was used to optimize the most labor-intensive task in building an analytical platform: data calculation.

The low-code tool Datagram was chosen as the main mechanism for data mapping. Neoflex Datagram is a tool for developing transformations and data flows.
With this tool you can do without writing Scala code by hand. The Scala code is generated automatically using the Model Driven Architecture approach.

The obvious advantage of this approach is a faster development process. Besides speed, however, there are also the following advantages:

  • Viewing the contents and structure of sources/receivers;
  • Tracing the origin of data flow objects down to individual fields (lineage);
  • Partial execution of transformations with a view of intermediate results;
  • Reviewing the source code and adjusting it before execution;
  • Automatic validation of transformations;
  • Automatic one-to-one data loading.

The barrier to entry for low-code solutions that generate transformations is quite low: the developer needs to know SQL and have experience with ETL tools. It is worth mentioning that code-generating transformation tools are not ETL tools in the broad sense of the word. Low-code tools may not have their own code execution environment. That is, the generated code is executed in the environment that already existed on the cluster before the low-code solution was installed. And this is perhaps another plus for low-code karma, because a "classic" team can work in parallel with the low-code team, implementing functionality in, say, pure Scala code. Bringing the work of both teams into production is then simple and seamless.

It is worth noting that, besides low-code, there are also no-code solutions, and at their core these are different things. Low-code lets the developer intervene more in the generated code. In the case of Datagram, the generated Scala code can be viewed and edited; no-code may not offer that. The difference matters a great deal, not only for the flexibility of the solution but also for the comfort and motivation of data engineers.

Solution architecture

Let's try to work out exactly how a low-code tool helps speed up the development of data calculation functionality. First, let's look at the functional architecture of the system. The example here is the data production model for media research.

The data sources in our case are very heterogeneous and diverse:

  • People meters (TV meters) are hardware and software devices that record the behavior of television panel respondents: who watched which TV channel, and when, in a household taking part in the study. The supplied information is a stream of broadcast viewing intervals linked to a media package and a media product. At the stage of loading into the Data Lake, the data can be enriched with demographic attributes, geostratification, time zone and other information needed to analyze television viewing of a particular media product. The measurements can be used to analyze or plan advertising campaigns, to assess audience activity and preferences, and to compile the broadcast grid;
  • Data can come from systems that monitor streaming television broadcasts and measure the viewing of video content on the Internet;
  • Measurement tools in the web environment, including both site-centric and user-centric meters. The data provider for the Data Lake can be a research-bar browser add-on and a mobile application with a built-in VPN;
  • Data can also come from sites that consolidate the results of online questionnaires and of telephone interviews conducted in company surveys;
  • The data lake can be further enriched by loading information from the logs of partner companies.

As-is loading from the source systems into the primary staging area for raw data can be organized in various ways. If low-code is used for this, loading scripts can be generated automatically from metadata. In that case there is no need to go down to the level of developing source-to-target mappings. To set up automatic loading, we need to establish a connection to the source and then define, in the loading interface, the list of entities to be loaded. The directory structure in HDFS is created automatically and corresponds to the data storage structure on the source system.
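To make the idea of metadata-driven loading more concrete, here is a minimal sketch in Python. It is not Datagram's actual generator: the `SourceEntity` fields, the `/data/raw` root, the `load_date=` partition folder and the entity names are all assumptions chosen for illustration. It only shows how a list of entities can be turned into target directories that mirror the source layout.

```python
from dataclasses import dataclass
from posixpath import join


@dataclass(frozen=True)
class SourceEntity:
    """One object to load as-is. All names used below are hypothetical."""
    system: str  # source system name
    schema: str  # schema or namespace inside the source
    name: str    # table or object name


def staging_path(root: str, entity: SourceEntity, load_date: str) -> str:
    """Mirror the source layout under the raw-data root, one folder per load date."""
    return join(root, entity.system, entity.schema, entity.name,
                f"load_date={load_date}")


def build_load_plan(root, entities, load_date):
    """Return one (entity, target directory) pair per entity in the list."""
    return [(entity, staging_path(root, entity, load_date)) for entity in entities]


entities = [
    SourceEntity("tv_panel", "dbo", "viewing_intervals"),
    SourceEntity("tv_panel", "dbo", "respondents"),
]
plan = build_load_plan("/data/raw", entities, "2020-01-31")
for entity, path in plan:
    print(entity.name, "->", path)
```

The point of the design is that adding a new entity to the list is the only change needed; nothing per-entity is written by hand.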

In this project, however, we decided not to use this feature of the low-code platform, because Mediascope had already begun building a similar service on its own using the NiFi + Kafka combination.

It is worth pointing out right away that these tools are not interchangeable but complementary. NiFi and Kafka can work together both in the direct direction (NiFi -> Kafka) and in the reverse one (Kafka -> NiFi). For the media research platform, the first variant was used.

In our case, NiFi had to process various types of data from the source systems and send them to the Kafka broker. Messages were sent to a specific Kafka topic using NiFi's PublishKafka processors. These pipelines are orchestrated and maintained in a visual interface. NiFi, and the NiFi + Kafka combination, can also be called a low-code approach to development: it has a low barrier to entry into Big Data technologies and speeds up application development.

The next stage of the project was bringing the detailed data into the format of a single semantic layer. If an entity has historical attributes, the calculation is carried out for the partition in question. If the entity is not historical, it is optionally possible either to recalculate the entire contents of the object or to skip recalculating it altogether (because nothing has changed). At this stage, keys are generated for all entities. The keys are stored in HBase directories corresponding to the master objects; these directories hold the mapping between keys in the analytical platform and keys from the source systems. Consolidation of atomic entities is accompanied by enrichment with the results of preliminary analytical calculations. Spark was the framework for data calculation. The functionality described here for bringing data to a single semantics was also implemented with mappings from the low-code Datagram tool.
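The key directory described above can be pictured with a small sketch. This is only an illustration: a Python dictionary stands in for the HBase directory, and the class name, method name, integer key sequence and sample source keys are assumptions, not part of the project.

```python
import itertools


class KeyRegistry:
    """In-memory stand-in for a key directory (the article keeps these in HBase)."""

    def __init__(self):
        self._by_source = {}                # (system, source_key) -> platform key
        self._next_id = itertools.count(1)  # simple surrogate key sequence

    def platform_key(self, system: str, source_key: str) -> int:
        """Return the existing platform key for a source key, or issue a new one."""
        lookup = (system, source_key)
        if lookup not in self._by_source:
            self._by_source[lookup] = next(self._next_id)
        return self._by_source[lookup]


registry = KeyRegistry()
first = registry.platform_key("tv_panel", "HH-0017")
again = registry.platform_key("tv_panel", "HH-0017")   # same source key, same result
other = registry.platform_key("web_panel", "HH-0017")  # other system, new key
print(first, again, other)
```

Keying the lookup on the pair (source system, source key) keeps identical identifiers from different systems apart, which is what a mapping between platform keys and source keys needs.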

The target architecture required SQL access to the data for business users. Hive was used for this. Objects are registered in Hive automatically when the "Registr Hive Table" option is enabled in the low-code tool.
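For readers who have not registered such objects by hand, the sketch below builds the kind of HiveQL statement that exposes Parquet files in HDFS as an external table. It is an assumption about what registration amounts to, not the statement Datagram actually emits; the database, table, columns and location are hypothetical.

```python
def external_table_ddl(database, table, columns, location):
    """Build a HiveQL statement for an external table over Parquet files."""
    column_list = ",\n  ".join(f"`{name}` {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS `{database}`.`{table}` (\n"
        f"  {column_list}\n"
        f")\n"
        f"STORED AS PARQUET\n"
        f"LOCATION '{location}'"
    )


ddl = external_table_ddl(
    database="marts",                    # hypothetical database
    table="viewing_by_region",           # hypothetical data mart
    columns=[("region_id", "INT"), ("network_id", "INT"), ("viewing_minutes", "DOUBLE")],
    location="/data/marts/viewing_by_region",
)
print(ddl)
```

Because the table is external, dropping it in Hive removes only the metadata and leaves the Parquet files in place.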

Calculation flow control

Datagram has an interface for designing workflows. Mappings can be launched using the Oozie scheduler. In the workflow designer it is possible to build schemes of parallel, sequential or execution-dependent data transformations. Shell scripts and Java programs are supported. The Apache Livy server can also be used; Apache Livy runs applications directly from the development environment.

If the company already has its own process orchestrator, the REST API can be used to embed mappings into an existing flow. For example, we had quite successful experience embedding Scala mappings into orchestrators written in PLSQL and Kotlin. The REST API of the low-code tool includes operations such as generating an executable jar from a mapping design, calling a mapping, calling a sequence of mappings and, of course, passing parameters in the URL to run a mapping.

Along with Oozie, the calculation flow can be organized with Airflow. I will not dwell on a comparison of Oozie and Airflow; I will just say that for the media research project the choice fell on Airflow. The main arguments this time were a more active community developing the product and a more developed interface + API.

Airflow is also good because it uses everyone's beloved Python to describe calculation processes. And in general, there are not that many open source workflow management platforms. Launching and monitoring process execution (including a Gantt chart) only adds points to Airflow's karma.

The configuration format for launching the low-code solution's mappings became spark-submit. There were two reasons for this. First, spark-submit lets you run a jar file directly from the console. Second, it can carry all the information needed to configure the workflow (which makes it easier to write the scripts that generate the DAG).
The most common element of the Airflow workflow in our case was SparkSubmitOperator.

SparkSubmitOperator lets you run jars - packaged Datagram mappings with pre-generated input parameters.
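The sketch below shows how one configuration entry could be turned into a spark-submit command line. The flags `--master`, `--deploy-mode`, `--class` and `--conf` are standard spark-submit options; everything else is an assumption made for illustration: the layout of the `config` dictionary, the class and jar names, and the `name=value` form of the mapping parameters (how a packaged Datagram mapping really receives its parameters is not stated in the article).

```python
import shlex


def spark_submit_command(config):
    """Turn one configuration entry into a spark-submit argument list."""
    command = [
        "spark-submit",
        "--master", config["master"],
        "--deploy-mode", config["deploy_mode"],
        "--class", config["main_class"],
    ]
    # Spark properties are passed as repeated --conf key=value options.
    for key, value in sorted(config.get("conf", {}).items()):
        command += ["--conf", f"{key}={value}"]
    # The application jar comes after the options, followed by its own arguments.
    command.append(config["jar"])
    command += [f"{name}={value}" for name, value in config.get("params", {}).items()]
    return command


config = {
    "master": "yarn",
    "deploy_mode": "cluster",
    "main_class": "com.example.mapping.ViewingIntervalsJob",  # hypothetical class
    "jar": "/opt/mappings/viewing_intervals.jar",             # hypothetical jar
    "conf": {"spark.executor.memory": "4g"},
    "params": {"calc_date": "2020-01-31"},                    # hypothetical parameter
}
command = spark_submit_command(config)
print(" ".join(shlex.quote(part) for part in command))
```

The same fields map naturally onto SparkSubmitOperator arguments such as `application`, `java_class`, `conf` and `application_args`, which is one way a script could generate DAG tasks from such entries.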

It is worth mentioning that each Airflow task runs in isolation, as a separate process, and knows nothing about other tasks. Interaction between tasks is therefore handled with control operators such as DummyOperator or BranchPythonOperator.
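As an illustration of branching, the function below is the kind of callable that could be handed to a BranchPythonOperator, whose callable returns the task_id of the branch to follow. The rule it encodes is the recalculation rule described earlier for historical and non-historical entities; the task ids are hypothetical, and the sketch deliberately avoids importing Airflow so that it runs on its own.

```python
def choose_recalculation_task(is_historical: bool, has_changes: bool) -> str:
    """Return the id of the downstream task that should run next."""
    if is_historical:
        # Historical entities are recalculated for the partition in question.
        return "recalculate_partition"
    if has_changes:
        # Non-historical entities can be recalculated in full.
        return "recalculate_full_object"
    # Nothing changed, so the recalculation can be skipped entirely.
    return "skip_recalculation"


for flags in [(True, True), (False, True), (False, False)]:
    print(flags, "->", choose_recalculation_task(*flags))
```

In a real DAG, the three task ids would have to match tasks declared downstream of the branching operator.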

Taken together, using the Datagram low-code solution along with universal configuration files (from which the DAG is formed) significantly sped up and simplified the development of data loading flows.

Data mart calculation

Perhaps the most intellectually demanding stage in producing analytical data is building the data marts. In one of the research company's data calculation flows, the data at this stage is reduced to a reference broadcast, corrected for time zones, and linked to the broadcast grid. It can also be adjusted for the local broadcast network (local news and advertising). Among other things, this step splits the intervals of continuous viewing of media products based on an analysis of the viewing intervals. At the same point, the viewing values are "weighted" according to information about their significance (calculation of a correction factor).

A separate step in preparing the data marts is data validation. The validation algorithm relies on a number of mathematical models. With a low-code platform, however, a complex algorithm can be broken into a number of separate, visually readable mappings, each of which performs a narrow task. As a result, intermediate debugging, logging and visualization of the data preparation stages become possible.

It was decided to split the validation algorithm into the following sub-stages:

  • Building regressions of a TV network's viewing in a region on the viewing of all networks in that region over 60 days.
  • Calculating studentized residuals (deviations of the actual values from those predicted by the regression model) for all regression points and for the day being calculated.
  • Selecting anomalous region-network pairs, where the studentized residual for the day being calculated exceeds the norm (set in the operation's settings).
  • Recalculating the corrected studentized residual for anomalous region-network pairs for each respondent who watched the network in the region, and determining that respondent's contribution (the change in the studentized residual) when the respondent's viewing is excluded from the sample.
  • Searching for candidates whose exclusion brings the studentized residual for the day being calculated back to normal.
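The first three sub-stages can be sketched for a single region-network pair as follows. This is a simplified illustration, not the project's implementation: it fits an ordinary least-squares line, uses the standard internally studentized residual (the residual divided by sigma times the square root of one minus the leverage), works on ten invented days instead of 60, and takes the threshold of 2.0 as an assumption.

```python
import math


def studentized_residuals(xs, ys):
    """Internally studentized residuals of a simple linear regression y ~ x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    # Residual standard error with n - 2 degrees of freedom (two fitted parameters).
    sigma = math.sqrt(sum(e * e for e in residuals) / (n - 2))

    result = []
    for x, e in zip(xs, residuals):
        leverage = 1 / n + (x - mean_x) ** 2 / sxx
        result.append(e / (sigma * math.sqrt(1 - leverage)))
    return result


def anomalous_days(xs, ys, threshold):
    """Indices of days whose absolute studentized residual exceeds the threshold."""
    return [i for i, t in enumerate(studentized_residuals(xs, ys)) if abs(t) > threshold]


# Invented data: total regional viewing (x) and one network's viewing (y) per day.
# The last day is the day being calculated.
total_viewing = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
network_viewing = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9, 14.2, 15.8, 18.1, 30.0]
flagged = anomalous_days(total_viewing, network_viewing, threshold=2.0)
print(flagged)
```

The later sub-stages would repeat the same calculation with one respondent's viewing removed from the totals at a time and compare the resulting residuals.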

The example above supports the hypothesis that a data engineer already has far too much to keep in mind ... must retreat.

What else can low-code do?

The scope of a low-code tool for batch and stream data processing without hand-written Scala code does not end there.

Using low-code in data lake development has already become a standard for us. It is probably fair to say that solutions on the Hadoop stack are following the development path of classic RDBMS-based DWHs. Low-code tools on the Hadoop stack can handle both data processing tasks and the building of final BI interfaces. It should also be noted that BI can mean not only presenting data but also letting business users edit it. We often use this functionality when building analytical platforms for the financial sector.

Among other things, low-code and Datagram in particular make it possible to track the origin of data flow objects down to individual fields (lineage). For this, the low-code tool implements an interface with Apache Atlas and Cloudera Navigator. In essence, the developer registers a set of objects in the Atlas dictionaries and references the registered objects when building mappings. A mechanism for tracing data origin or analyzing object dependencies saves a great deal of time when calculation algorithms need to be changed. For example, when preparing financial statements, it makes periods of legislative change much easier to get through. After all, the better we understand the dependencies between reporting forms in terms of the objects of the detailed layer, the fewer "sudden" defects we run into and the less rework there is.

Data Quality & Low-code

Another task implemented with the low-code tool on the Mediascope project was a Data Quality class task. A distinctive feature of the data verification pipeline in the research company's project was that it had no impact on the performance and speed of the main data calculation flow. The already familiar Apache Airflow was used to orchestrate the independent data verification flows. As each step of data production became ready, a separate part of the DQ pipeline was launched in parallel.

It is considered good practice to monitor data quality from the moment the data first appears in the analytical platform. With metadata available, we can check compliance with basic conditions from the moment information enters the primary layer: not null, constraints, foreign keys. This functionality is implemented with automatically generated mappings of the data quality family in Datagram. Code generation in this case is also based on model metadata. On the Mediascope project, the integration was with the metadata of the Enterprise Architect product.

By pairing the low-code tool with Enterprise Architect, the following checks were generated automatically:

  • Checking for "null" values in fields with the "not null" modifier;
  • Checking for duplicates of the primary key;
  • Checking the foreign key of an entity;
  • Checking the uniqueness of a row over a set of fields.
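The four checks listed above can be generated from a small metadata description, as the sketch below does. It is an illustration under stated assumptions: the metadata is passed as plain Python arguments rather than read from Enterprise Architect, the table and column names are invented, and SQLite (from the standard library) stands in for Spark SQL so that the generated queries can actually be run. Each query returns the number of violations.

```python
import sqlite3


def generate_checks(table, not_null, primary_key, foreign_keys, unique_sets):
    """Build one violation-counting SQL query per metadata rule."""
    checks = {}
    for col in not_null:
        checks[f"not_null:{col}"] = f"SELECT COUNT(*) FROM {table} WHERE {col} IS NULL"
    pk = ", ".join(primary_key)
    checks["pk_duplicates"] = (
        f"SELECT COUNT(*) FROM (SELECT {pk} FROM {table} "
        f"GROUP BY {pk} HAVING COUNT(*) > 1) AS d"
    )
    for col, (ref_table, ref_col) in foreign_keys.items():
        checks[f"foreign_key:{col}"] = (
            f"SELECT COUNT(*) FROM {table} AS c "
            f"LEFT JOIN {ref_table} AS p ON c.{col} = p.{ref_col} "
            f"WHERE c.{col} IS NOT NULL AND p.{ref_col} IS NULL"
        )
    for cols in unique_sets:
        col_list = ", ".join(cols)
        checks[f"unique:{'+'.join(cols)}"] = (
            f"SELECT COUNT(*) FROM (SELECT {col_list} FROM {table} "
            f"GROUP BY {col_list} HAVING COUNT(*) > 1) AS d"
        )
    return checks


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE channel (channel_id INTEGER, name TEXT);
CREATE TABLE viewing (viewing_id INTEGER, respondent_id INTEGER,
                      channel_id INTEGER, started_at TEXT);
INSERT INTO channel VALUES (1, 'First'), (2, 'Second');
INSERT INTO viewing VALUES
  (1, 100, 1, '2020-01-01 10:00'),
  (2, NULL, 2, '2020-01-01 11:00'),
  (2, 101, 9, '2020-01-01 12:00'),
  (3, 102, 1, '2020-01-01 13:00');
""")

checks = generate_checks(
    table="viewing",
    not_null=["respondent_id"],
    primary_key=["viewing_id"],
    foreign_keys={"channel_id": ("channel", "channel_id")},
    unique_sets=[["respondent_id", "started_at"]],
)
violations = {name: conn.execute(sql).fetchone()[0] for name, sql in checks.items()}
print(violations)
```

When the model changes, regenerating the checks is just a matter of calling the generator again with the new metadata.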

For more complex checks of data availability and reliability, a mapping with a Scala Expression was created; it takes as input external Spark SQL check code prepared by analysts in Zeppelin.

Of course, automatic generation of checks has to be reached gradually. In the project described, it was preceded by the following steps:

  • DQ implemented in Zeppelin notebooks;
  • DQ built into mappings;
  • DQ in the form of separate, massive mappings containing a whole set of checks for a single entity;
  • Universal parameterized DQ mappings that take metadata and business checks as input.

Perhaps the main advantage of a parameterized check service is the shorter time needed to deliver functionality to the production environment. New quality checks can bypass the classic pattern of delivering code through development and test environments:

  • All metadata checks are generated automatically when the model is modified in EA;
  • Data availability checks (determining whether any data is present at a given point in time) can be generated from a directory that stores the expected time of arrival of the next portion of data for each object;
  • Business validation checks are created by analysts in Zeppelin notebooks, from where they go directly to the configuration tables of the DQ module in the production environment.

Shipping scripts directly to production carries no risk. Even with a syntax error, the worst that can happen is that one check fails to run, because the data calculation flow and the quality check flow are separated from each other.
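The isolation argument can be shown with a small runner. This is a sketch under assumptions, not the project's DQ module: SQLite stands in for Spark SQL, the checks live in a dictionary instead of configuration tables, and the table and check names are invented. One of the checks contains a deliberate syntax error.

```python
import sqlite3


def run_checks(conn, checks):
    """Run every check independently; a broken check never stops the others."""
    results = {}
    for name, sql in checks.items():
        try:
            violations = conn.execute(sql).fetchone()[0]
            results[name] = ("ok" if violations == 0 else "failed", violations)
        except sqlite3.Error as exc:
            # A syntax error or missing table affects only this one check.
            results[name] = ("error", str(exc))
    return results


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE viewing (viewing_id INTEGER, duration_sec INTEGER)")
conn.executemany("INSERT INTO viewing VALUES (?, ?)", [(1, 120), (2, -5), (3, 300)])

checks = {
    "negative_duration": "SELECT COUNT(*) FROM viewing WHERE duration_sec < 0",
    "broken_script": "SELEC COUNT(*) FROM viewing",  # deliberate syntax error
    "missing_id": "SELECT COUNT(*) FROM viewing WHERE viewing_id IS NULL",
}
results = run_checks(conn, checks)
for name, outcome in results.items():
    print(name, outcome)
```

The broken script is reported as an error while the checks before and after it still run, and the data calculation itself is never touched.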

In essence, the DQ service runs permanently in the production environment and is ready to start work the moment the next portion of data appears.

Instead of a conclusion

The advantage of using low-code is obvious. Developers do not need to build the application from scratch, and a programmer freed from additional tasks produces results faster. Speed, in turn, frees up time for optimization work. So in this case you can count on a better and faster solution.

Of course, low-code is not a panacea, and magic will not happen by itself:

  • The low-code industry is still at the "growing stronger" stage, and there are no uniform industry standards yet;
  • Many low-code solutions are not free, and buying one should be a deliberate step, taken with full confidence in the financial benefit of using it;
  • Many low-code solutions do not always work well with GIT/SVN, or are inconvenient to use if the generated code is hidden;
  • When the architecture is extended, the low-code solution may need to be refined, which in turn produces an effect of "attachment and dependence" on the supplier of the low-code solution;
  • An adequate level of security is possible, but it is very labor-intensive and difficult to implement in low-code engines. Low-code platforms should not be chosen solely for the benefits of using them. When choosing, it is worth asking about the availability of access control functionality and of delegation/escalation of identity data to the level of the organization's entire IT landscape.

However, if you know all the shortcomings of the chosen system and the benefits of using it still clearly outweigh them, then move to low-code without fear. Moreover, the transition is inevitable, just as any evolution is inevitable.

If one developer on a low-code platform does the job faster than two developers without low-code, the company gets a head start in every respect. The entry threshold for low-code solutions is lower than for "traditional" technologies, which helps with the shortage of personnel. Low-code tools make it possible to speed up interaction between functional teams and to decide faster whether the chosen path of data science research is correct. Low-code platforms can drive an organization's digital transformation, because the solutions they produce can be understood by non-technical specialists (business users in particular).

If you have tight deadlines, heavy business logic and a lack of technological expertise, and you need to shorten your time to market, then low-code is one way to meet your needs.

Nobody denies the importance of traditional development tools, but in many cases low-code solutions are the best way to make the work at hand more efficient.

Source: www.habr.com
