MLOps - Cook book, chapter 1

Hello everyone! I am a CV developer at CROC. We have been delivering computer vision projects for three years now. In that time we have done a lot of things, for example: we monitored drivers so that while driving they did not drink, did not smoke, did not talk on the phone, and looked at the road rather than daydreaming or staring at the clouds; we recorded people who drive in dedicated lanes or take up several parking spaces; we made sure that workers wore helmets, gloves, and so on; we identified an employee who wants to enter the premises; we counted everything we could.

Why am I telling you all this?

While implementing projects we hit bumps, a lot of bumps. Some of these problems you already know, and others you will get to know in the future.

Let's simulate the situation

Let's imagine that we got a job at a young company "N" whose business is related to ML. We work on an ML (DL, CV) project, then for some reason switch to another task, generally take a break, and later return to our own or someone else's neural network.

  1. The moment of truth arrives: you need to somehow remember where you stopped, which hyperparameters you tried and, most importantly, what results they led to. There are many options for where the information about all the launches was kept: in someone's head, in configs, in a notepad, in a working environment in the cloud. I once saw a case where hyperparameters were stored as commented-out lines in the code; in short, a flight of fancy. Now imagine that you have returned not to your own project but to the project of a person who left the company, and you have inherited the code and a model called model_1.pb. To complete the picture and convey all the pain, let's also imagine that you are a novice specialist.
  2. Moving on. To run the code, we and everyone who will work with it need to create an environment. It often happens that, for some reason, the environment was not left to us as part of the inheritance. Recreating it can also turn into a non-trivial task. You don't want to waste time on this step, do you?
  3. We train a model (for example, a car detector). We reach the point where it becomes very good, so it is time to save the result. Let's call it car_detection_v1.pb. Then we train another one, car_detection_v2.pb. Some time later, our colleagues or we ourselves train more and more models, using different architectures. As a result, a pile of artifacts builds up, and information about them has to be painstakingly collected (but we will do this later, because for now we have higher-priority matters).
  4. OK, that's it! We have a model! Can we start training the next model, develop an architecture to solve a new problem, or go and have some tea? And who is going to deploy?

Identifying problems

Working on a project or product is the work of many people. Over time, people leave and arrive, there are more projects, and the projects themselves become more complex. One way or another, situations from the cycle described above (and not only those) will occur in various combinations from iteration to iteration. All this results in wasted time, confusion, frayed nerves, possibly customer dissatisfaction and, ultimately, lost money. Although we all tend to step on the same old rake, I believe nobody wants to relive these moments again and again.

So, we have gone through one development cycle and we see that there are problems that need to be solved. To do this you need to:

  • store the results of the work conveniently;
  • make the process of onboarding new employees simple;
  • simplify the process of deploying a development environment;
  • set up a model versioning process;
  • have a convenient way to validate models;
  • find a tool for managing model state;
  • find a way to deliver models to production.

Apparently we need to come up with a workflow that makes it easy and convenient to manage this life cycle. This practice is called MLOps.

MLOps, or DevOps for machine learning, allows data science and IT teams to collaborate and increase the pace of model development and deployment through monitoring, validation, and governance of machine learning models.

You can read what the folks at Google think about all this. From that article it is clear that MLOps is a rather large topic.

In the rest of this article I will describe only part of the process. For the implementation I will use MLflow, because it is an open-source project, it takes only a small amount of code to connect, and it integrates with popular ML frameworks. You can search the Internet for other tools, such as Kubeflow, SageMaker, Trains, and so on, and perhaps find one that suits your needs better.

"Building" MLOps using the MLflow tool as an example

MLflow is an open-source platform for managing the lifecycle of ML models (https://mlflow.org/).

MLflow includes four components:

  • MLflow Tracking - covers recording results and the parameters that led to them;
  • MLflow Projects - lets you package code and reproduce it on any platform;
  • MLflow Models - responsible for deploying models to production;
  • MLflow Registry - lets you store models and manage their state in a centralized repository.

MLflow operates on two entities:

  • a run is one full training cycle, together with the parameters and metrics we want to record;
  • an experiment is a "topic" that groups related runs.

All steps of the example were carried out on Ubuntu 18.04.

1. Deploy the server

To manage our project easily and have all the necessary information at hand, we will deploy a server. The MLflow tracking server has two main components:

  • backend store - responsible for storing information about registered models (supports 4 DBMSs: mysql, mssql, sqlite, and postgresql);
  • artifact store - responsible for storing artifacts (supports 7 storage options: Amazon S3, Azure Blob Storage, Google Cloud Storage, FTP server, SFTP Server, NFS, HDFS).

As the artifact store, for simplicity, let's take an sftp server.

  • create a group
    $ sudo groupadd sftpg
  • add a user and set a password for it
    $ sudo useradd -g sftpg mlflowsftp
    $ sudo passwd mlflowsftp
  • adjust a couple of access settings
    $ sudo mkdir -p /data/mlflowsftp/upload
    $ sudo chown -R root:sftpg /data/mlflowsftp
    $ sudo chown -R mlflowsftp:sftpg /data/mlflowsftp/upload
  • add a few lines to /etc/ssh/sshd_config
    Match Group sftpg
     ChrootDirectory /data/%u
     ForceCommand internal-sftp
  • restart the service
    $ sudo systemctl restart sshd

As the backend store, let's take postgresql.

$ sudo apt update
$ sudo apt-get install -y postgresql postgresql-contrib postgresql-server-dev-all
$ sudo apt install gcc
$ pip install psycopg2
$ sudo -u postgres -i
# Create new user: mlflow_user
[postgres@user_name~]$ createuser --interactive -P
Enter name of role to add: mlflow_user
Enter password for new role: mlflow
Enter it again: mlflow
Shall the new role be a superuser? (y/n) n
Shall the new role be allowed to create databases? (y/n) n
Shall the new role be allowed to create more new roles? (y/n) n
# Create database mlflow_db owned by mlflow_user
$ createdb -O mlflow_user mlflow_db

To start the server, you need to install the following python packages (I recommend creating a separate virtual environment):

pip install mlflow
pip install pysftp

Let's start our server

$ mlflow server \
    --backend-store-uri postgresql://mlflow_user:mlflow@localhost/mlflow_db \
    --default-artifact-root sftp://mlflowsftp:mlflow@sftp_host/upload \
    --host server_host \
    --port server_port
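
The two URIs passed to the server have a fixed shape, so they can also be assembled programmatically. The sketch below is only an illustration: the helper name build_server_command is hypothetical, and the host 0.0.0.0 and port 5000 are assumed values standing in for server_host and server_port. It percent-encodes the database credentials, which SQLAlchemy-style database URLs expect when a password contains characters such as '@'; the sftp URI is left exactly as written in the command above.

```python
import shlex
from urllib.parse import quote


def build_server_command(db_user, db_password, db_name,
                         sftp_user, sftp_password, sftp_host,
                         host, port, db_host="localhost"):
    """Return the `mlflow server` invocation as a list of arguments."""
    # Percent-encode the database credentials so '@', ':' or '/' in a
    # password cannot be mistaken for URI delimiters.
    backend_uri = (
        f"postgresql://{quote(db_user, safe='')}:{quote(db_password, safe='')}"
        f"@{db_host}/{db_name}"
    )
    # The artifact root mirrors the article's command; no encoding is applied.
    artifact_root = f"sftp://{sftp_user}:{sftp_password}@{sftp_host}/upload"
    return [
        "mlflow", "server",
        "--backend-store-uri", backend_uri,
        "--default-artifact-root", artifact_root,
        "--host", host,
        "--port", str(port),
    ]


# Values taken from the article, except host and port, which are assumptions.
command = build_server_command(
    db_user="mlflow_user", db_password="mlflow", db_name="mlflow_db",
    sftp_user="mlflowsftp", sftp_password="mlflow", sftp_host="sftp_host",
    host="0.0.0.0", port=5000,
)
print(shlex.join(command))
```

Returning a list rather than a string means the result can be handed to subprocess.run without going through a shell.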

2. Add tracking

So that the results of our training are not lost, so that future generations of developers understand what was going on, and so that senior colleagues and you yourself can calmly analyze the training process, we need to add tracking. Tracking means saving parameters, metrics, artifacts and any additional information about a training launch; in our case, on the server.

As an example, I created a small project on github, written in Keras, that segments everything found in the COCO dataset. To add tracking, I created the file mlflow_training.py.

Here are the lines where the most interesting things happen:

def run(self, epochs, lr, experiment_name):
        # getting the id of the experiment, creating an experiment in its absence
        remote_experiment_id = self.remote_server.get_experiment_id(name=experiment_name)
        # creating a "run" and getting its id
        remote_run_id = self.remote_server.get_run_id(remote_experiment_id)

        # indicate that we want to save the results on a remote server
        mlflow.set_tracking_uri(self.tracking_uri)
        mlflow.set_experiment(experiment_name)

        with mlflow.start_run(run_id=remote_run_id, nested=False):
            mlflow.keras.autolog()
            self.train_pipeline.train(lr=lr, epochs=epochs)

        try:
            self.log_tags_and_params(remote_run_id)
        except mlflow.exceptions.RestException as e:
            print(e)

Here self.remote_server is a small wrapper over the mlflow.tracking.MlflowClient methods (I made it for convenience), which I use to create an experiment and a run on the server. Next, I indicate where the launch results should be sent (mlflow.set_tracking_uri(self.tracking_uri)). I enable automatic logging with mlflow.keras.autolog(). Currently MLflow Tracking supports automatic logging for TensorFlow, Keras, Gluon, XGBoost, LightGBM and Spark. If your framework or library is not on the list, you can always log explicitly. Then we start training. Finally, we record the tags and input parameters on the remote server.
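
The wrapper itself is not shown in the article, so the following is only a guess at what such helpers could look like. It assumes a client object that offers the same get_experiment_by_name, create_experiment and create_run methods as mlflow.tracking.MlflowClient. The function names and the in-memory InMemoryClient are hypothetical; the stand-in exists only so that the sketch runs without a server.

```python
from types import SimpleNamespace


def get_or_create_experiment_id(client, name):
    """Return the id of the experiment called `name`, creating it if absent."""
    experiment = client.get_experiment_by_name(name)
    if experiment is not None:
        return experiment.experiment_id
    return client.create_experiment(name)


def create_run_id(client, experiment_id):
    """Create an empty run inside the experiment and return its id."""
    run = client.create_run(experiment_id)
    return run.info.run_id


class InMemoryClient:
    """Hypothetical stand-in that mimics the three client methods used above."""

    def __init__(self):
        self._experiments = {}
        self._run_count = 0

    def get_experiment_by_name(self, name):
        experiment_id = self._experiments.get(name)
        if experiment_id is None:
            return None
        return SimpleNamespace(experiment_id=experiment_id)

    def create_experiment(self, name):
        self._experiments[name] = str(len(self._experiments) + 1)
        return self._experiments[name]

    def create_run(self, experiment_id):
        self._run_count += 1
        return SimpleNamespace(info=SimpleNamespace(run_id=f"run-{self._run_count}"))


client = InMemoryClient()
first_id = get_or_create_experiment_id(client, "My_experiment")
second_id = get_or_create_experiment_id(client, "My_experiment")  # reused, not recreated
run_id = create_run_id(client, first_id)
```

Against a real server, the stand-in would be replaced with MlflowClient(tracking_uri="http://server_host:server_port").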

A couple of lines, and you, like everyone else, have access to information about all the launches. Cool?
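
For a framework without autologging, explicit logging might look roughly like the sketch below. It relies on mlflow.log_params, mlflow.log_metric and mlflow.set_tag; everything else (the function names, the nested config layout, the "loss" metric name and the "framework" tag) is an assumption made for illustration. The mlflow import is deferred so that the parameter-flattening helper can be tried without the package installed.

```python
def flatten_params(config, prefix=""):
    """Flatten a nested config dict into dotted keys, e.g. {"model.name": "Unet"}."""
    flat = {}
    for key, value in config.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten_params(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def log_run_explicitly(tracking_uri, experiment_name, config, losses):
    """Record parameters and a per-epoch loss curve; needs a reachable server."""
    import mlflow  # imported lazily on purpose

    mlflow.set_tracking_uri(tracking_uri)
    mlflow.set_experiment(experiment_name)
    with mlflow.start_run():
        mlflow.log_params(flatten_params(config))
        for epoch, loss in enumerate(losses):
            # `step` lets the UI draw the metric history per epoch.
            mlflow.log_metric("loss", loss, step=epoch)
        mlflow.set_tag("framework", "custom")  # hypothetical tag


# Hypothetical config, shaped after the parameters in the MLproject file.
example_config = {"lr": 0.001, "model": {"name": "Unet", "backbone": "resnet18"}}
flat = flatten_params(example_config)
```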

3. Package the project

Now let's make the project easy to launch. To do this, add the MLproject and conda.yaml files to the project root.
MLproject

name: flow_segmentation
conda_env: conda.yaml

entry_points:
  main:
    parameters:
        categories: {help: 'list of categories from coco dataset'}
        epochs: {type: int, help: 'number of epochs in training'}

        lr: {type: float, default: 0.001, help: 'learning rate'}
        batch_size: {type: int, default: 8}
        model_name: {type: str, default: 'Unet', help: 'Unet, PSPNet, Linknet, FPN'}
        backbone_name: {type: str, default: 'resnet18', help: 'example: resnet18, resnet50, mobilenetv2 ...'}

        tracking_uri: {type: str, help: 'the server address'}
        experiment_name: {type: str, default: 'My_experiment', help: 'remote and local experiment name'}
    command: "python mlflow_training.py 
            --epochs={epochs}
            --categories={categories}
            --lr={lr}
            --tracking_uri={tracking_uri}
            --model_name={model_name}
            --backbone_name={backbone_name}
            --batch_size={batch_size}
            --experiment_name={experiment_name}"

An MLflow Project has several properties:

  • Name - the name of your project;
  • Environment - in my case, conda_env indicates that Anaconda is used to run the project, and the dependencies are described in the conda.yaml file;
  • Entry Points - indicate which files we can run and with which parameters (all parameters are logged automatically when training starts).
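
To make the role of an entry point concrete, here is a simplified model of how declared parameters, defaults and the command template fit together. This is not MLflow's actual implementation, only a sketch of the idea; resolve_parameters and render_command are hypothetical names, and the template is shortened to three of the parameters from the MLproject file.

```python
def resolve_parameters(declared, supplied):
    """Merge user-supplied values with declared defaults.

    A parameter declared without a default is required.
    """
    resolved = {}
    for name, spec in declared.items():
        if name in supplied:
            resolved[name] = supplied[name]
        elif "default" in spec:
            resolved[name] = spec["default"]
        else:
            raise ValueError(f"no value given for required parameter '{name}'")
    return resolved


def render_command(template, declared, supplied):
    """Substitute the resolved values into the {placeholder} slots."""
    return template.format(**resolve_parameters(declared, supplied))


# A shortened copy of the declarations from the MLproject file.
declared = {
    "epochs": {"type": "int"},
    "lr": {"type": "float", "default": 0.001},
    "batch_size": {"type": "int", "default": 8},
}
template = "python mlflow_training.py --epochs={epochs} --lr={lr} --batch_size={batch_size}"

command = render_command(template, declared, {"epochs": 10})
print(command)
```

Only epochs is supplied here, so lr and batch_size fall back to their declared defaults, which is why a short command line is enough to start training.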

conda.yaml

name: flow_segmentation
channels:
  - defaults
  - anaconda
dependencies:
  - python==3.7
  - pip:
    - mlflow==1.8.0
    - pysftp==0.2.9
    - Cython==0.29.19
    - numpy==1.18.4
    - pycocotools==2.0.0
    - requests==2.23.0
    - matplotlib==3.2.1
    - segmentation-models==1.0.1
    - Keras==2.3.1
    - imgaug==0.4.0
    - tqdm==4.46.0
    - tensorflow-gpu==1.14.0

You can also use docker as your runtime environment; for details, please refer to the documentation.

4. Start training

Clone the project and go to the project directory:

git clone https://github.com/simbakot/mlflow_example.git
cd mlflow_example/

To run it, you need to install the libraries

pip install mlflow
pip install pysftp

Because the example uses conda_env, Anaconda must be installed on your computer (but you can get around this by installing all the required packages yourself and adjusting the launch parameters).

All the preparatory steps are done and we can start training. From the project root:

$ mlflow run -P epochs=10 -P categories=cat,dog -P tracking_uri=http://server_host:server_port .

After you enter the command, a conda environment is created automatically and training starts.
In the example above, I passed the number of training epochs, the categories we want to segment (the full list is defined by the COCO dataset) and the address of our remote server.
The complete list of available parameters can be found in the MLproject file.
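
The same launch can be started from Python through mlflow.projects.run, which takes the project location and a dictionary of parameters. The sketch below is a minimal illustration under that assumption; parse_overrides and launch_training are hypothetical helper names, and the mlflow import is deferred so that the parsing step can be tried on its own.

```python
def parse_overrides(pairs):
    """Turn ["epochs=10", "categories=cat,dog"] into a parameters dict."""
    parameters = {}
    for pair in pairs:
        # Split on the first '=' only, so values such as URLs stay intact.
        key, separator, value = pair.partition("=")
        if not separator or not key:
            raise ValueError(f"expected key=value, got {pair!r}")
        parameters[key] = value
    return parameters


def launch_training(project_dir, pairs):
    """Start the project's main entry point; needs mlflow and conda installed."""
    import mlflow.projects  # imported lazily on purpose

    return mlflow.projects.run(uri=project_dir, parameters=parse_overrides(pairs))


parameters = parse_overrides([
    "epochs=10",
    "categories=cat,dog",
    "tracking_uri=http://server_host:server_port",
])
```

Calling launch_training(".", [...]) from the project root would then correspond to the command-line invocation shown above.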

5. Evaluate the training results

Once training has finished, we can open our server's address in the browser: http://server_host:server_port

Here we see the list of all experiments (top left) and information about the runs (middle). For each run we can view more detailed information (parameters, metrics, artifacts and some additional information).

For each metric we can see the history of its changes.

That is, for now we can analyze the results in "manual" mode, and you can also set up automatic validation using the MLflow API.
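
One possible shape for such automatic validation is a simple threshold check on a run's final metrics. The sketch below assumes the run is fetched with MlflowClient.get_run and that its latest metric values are available as the dictionary run.data.metrics. The metric names val_iou_score and val_loss and the threshold values are hypothetical; use whatever your training actually logs.

```python
def passes_validation(metrics, minimums=None, maximums=None):
    """Check final metric values against lower and upper bounds.

    A metric that is required by a bound but missing counts as a failure.
    """
    for name, bound in (minimums or {}).items():
        if name not in metrics or metrics[name] < bound:
            return False
    for name, bound in (maximums or {}).items():
        if name not in metrics or metrics[name] > bound:
            return False
    return True


def validate_run(tracking_uri, run_id, minimums=None, maximums=None):
    """Fetch a run from the tracking server and validate its metrics."""
    from mlflow.tracking import MlflowClient  # imported lazily on purpose

    client = MlflowClient(tracking_uri=tracking_uri)
    run = client.get_run(run_id)
    return passes_validation(run.data.metrics, minimums, maximums)


# Hypothetical metric values and thresholds, for illustration only.
example_metrics = {"val_iou_score": 0.81, "val_loss": 0.2}
accepted = passes_validation(
    example_metrics,
    minimums={"val_iou_score": 0.75},
    maximums={"val_loss": 0.3},
)
```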

6. Register the model

Once we have analyzed our model and decided that it is ready for battle, we proceed to register it. To do this, we select the run we need (as shown in the previous section) and scroll down.

After we give our model a name, it gets a version. If you save another model under the same name, the version number is incremented automatically.

For each model, we can add a description and select one of three states (Staging, Production, Archived); later, using the API, we can query these states, which, together with versioning, gives additional flexibility.

We also have convenient access to all models and their versions.

As in the previous section, all of these operations can also be performed through the API.
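
As a rough sketch of the API route, registration and a stage change might look like the code below. It assumes mlflow.register_model and MlflowClient.transition_model_version_stage as provided by the MLflow 1.x line pinned in conda.yaml; newer MLflow releases may handle stages differently, so check the documentation for your version. The helper names are hypothetical, and the artifact path "model" is an assumption based on the folder the article downloads later.

```python
VALID_STAGES = ("None", "Staging", "Production", "Archived")


def model_uri_for_run(run_id, artifact_path="model"):
    """Build the runs:/ URI that points at a model logged inside a run."""
    return f"runs:/{run_id}/{artifact_path}"


def register_and_promote(tracking_uri, run_id, model_name, stage="Staging"):
    """Register the run's model under `model_name` and move it to `stage`."""
    if stage not in VALID_STAGES:
        raise ValueError(f"unknown stage {stage!r}, expected one of {VALID_STAGES}")

    import mlflow  # imported lazily on purpose
    from mlflow.tracking import MlflowClient

    mlflow.set_tracking_uri(tracking_uri)
    # Registering under an existing name creates the next version.
    model_version = mlflow.register_model(model_uri_for_run(run_id), model_name)

    client = MlflowClient(tracking_uri=tracking_uri)
    client.transition_model_version_stage(
        name=model_name, version=model_version.version, stage=stage
    )
    return model_version.version


example_uri = model_uri_for_run("abc123")  # hypothetical run id
```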

7. Deploy the model

At this stage, we already have a trained (keras) model. Here is an example of how you can use it:

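import cv2
import mlflow.keras
import numpy as np

# RemoteRegistry is the author's small wrapper over mlflow.tracking.MlflowClient;
# it is defined in the example repository and must be imported from there.


class SegmentationModel: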
    def __init__(self, tracking_uri, model_name):

        self.registry = RemoteRegistry(tracking_uri=tracking_uri)
        self.model_name = model_name
        self.model = self.build_model(model_name)

    def get_latest_model(self, model_name):
        registered_models = self.registry.get_registered_model(model_name)
        last_model = self.registry.get_last_model(registered_models)
        local_path = self.registry.download_artifact(last_model.run_id, 'model', './')
        return local_path

    def build_model(self, model_name):
        local_path = self.get_latest_model(model_name)

        return mlflow.keras.load_model(local_path)

    def predict(self, image):
        image = self.preprocess(image)
        result = self.model.predict(image)
        return self.postprocess(result)

    def preprocess(self, image):
        image = cv2.resize(image, (256, 256))
        image = image / 255.
        image = np.expand_dims(image, 0)
        return image

    def postprocess(self, result):
        return result

Here self.registry is again a small wrapper over mlflow.tracking.MlflowClient, added for convenience. The idea is that I connect to the remote server and look there for the model with the given name, taking its latest version. Next, I download the artifact locally into the ./model folder and build the model from that directory with mlflow.keras.load_model(local_path). Now we can use our model. CV (ML) developers can easily improve the model and publish new versions.
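
The "latest version" lookup is the part of the wrapper most worth getting right, so here is a sketch of how it could be done. It assumes MlflowClient.search_model_versions and the version, current_stage and run_id attributes of the returned entries; pick_latest and fetch_latest are hypothetical names, and the sample records are made up so the selection logic can be run without a server.

```python
from types import SimpleNamespace


def pick_latest(versions, stage=None):
    """Return the entry with the highest version number, optionally within a stage."""
    candidates = [v for v in versions if stage is None or v.current_stage == stage]
    if not candidates:
        raise LookupError("no model version matches the request")
    # Versions may arrive as strings, so compare them as integers ("10" > "9").
    return max(candidates, key=lambda v: int(v.version))


def fetch_latest(tracking_uri, model_name, stage=None):
    """Query a real registry; needs the mlflow package and a reachable server."""
    from mlflow.tracking import MlflowClient  # imported lazily on purpose

    client = MlflowClient(tracking_uri=tracking_uri)
    versions = client.search_model_versions(f"name='{model_name}'")
    return pick_latest(versions, stage)


# Made-up records with the same attribute names as a registry entry.
versions = [
    SimpleNamespace(version="2", current_stage="Production", run_id="run-b"),
    SimpleNamespace(version="9", current_stage="Staging", run_id="run-c"),
    SimpleNamespace(version="10", current_stage="Archived", run_id="run-d"),
]
latest = pick_latest(versions)
latest_production = pick_latest(versions, stage="Production")
```

Filtering by stage before taking the maximum is what keeps a newer but archived version from being served by accident.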

Conclusion

I have presented a system that allows you to:

  • centrally store information about ML models, training progress and results;
  • quickly deploy a development environment;
  • monitor and analyze the progress of work on models;
  • conveniently version models and manage their state;
  • easily deploy the resulting models.

This example is a toy and serves as a starting point for building your own system, which might include automating the evaluation of results and the registration of models (points 5 and 6, respectively), or dataset versioning, or maybe something else. The point I was trying to make is that you need MLOps as a whole; MLflow is just a means to an end.

Which problems have you run into that I did not cover?
What would you add to the system so that it meets your needs?
Which tools and approaches do you use to solve all or some of these problems?

P.S. I'll leave a couple of links:
github project - https://github.com/simbakot/mlflow_example
MLflow - https://mlflow.org/
My work email for questions - [email protected]

Our company periodically hosts various events for IT specialists. For example, on July 8 at 19:00 Moscow time there will be an online CV meetup; if you are interested, you can register and take part.

Source: www.habr.com
