MLOps: Cookbook, chapter 1

Hello everyone! I'm a CV developer at CROC. We have been doing projects in the field of CV for three years now. In that time we have done all sorts of things, for example: monitored drivers to make sure that while driving they did not drink, smoke, or talk on the phone, and that they looked at the road rather than dreaming or staring at the clouds; recorded people who like to drive in dedicated lanes and take up several parking spaces; made sure that workers wore helmets, gloves, and so on; identified an employee who wants to enter a facility; counted everything we could.

Why am I telling you all this?

In the course of these projects we hit bumps, a lot of bumps; some of these problems you either already know or will get to know in the future.

Let's model the situation

Imagine that we got a job at a young company "N" whose work is related to ML. We work on an ML (DL, CV) project, then for some reason switch to another task or simply take a break, and later return to our own or someone else's neural network.

  1. The moment of truth comes: you need to remember where you stopped, which hyperparameters you tried and, most importantly, what results they led to. There are many places where the information about all the runs might have been kept: in your head, in configs, in a notepad, in a working environment in the cloud. I once saw hyperparameters stored as commented-out lines in the code; in short, a flight of fancy. Now imagine that you have returned not to your own project but to the project of a person who left the company, and you have inherited the code and a model called model_1.pb. To complete the picture and convey all the pain, let's also imagine that you are a beginner.
  2. Moving on. To run the code, we and everyone who will work with it need to create an environment. It often happens that, for some reason, it was not left to us as part of the inheritance. Recreating it can turn out to be a non-trivial task. You don't want to waste time on this step, do you?
  3. We train a model (for example, a car detector). We reach the point where it becomes very good, so it is time to save the result. Let's call it car_detection_v1.pb. Then we train another one, car_detection_v2.pb. Some time later, our colleagues or we ourselves train more and more models, using different architectures. As a result, a pile of artifacts builds up, and information about them has to be painstakingly collected (but we will do that later, because right now we have higher-priority things to do).
  4. OK, that's it! We have a model! Can we start training the next model, design an architecture for a new problem, or go and have some tea? And who is going to deploy it?

Recognizing the problems

Working on a project or product is the work of many people. Over time people leave and arrive, there are more projects, and the projects themselves become more complex. One way or another, the situations from the cycle described above (and not only those) will recur in various combinations from iteration to iteration. All this leads to wasted time, confusion, frayed nerves, possibly customer dissatisfaction and, ultimately, lost money. Although we all tend to step on the same old rake, I believe nobody wants to relive these moments over and over again.

So, we have gone through one development cycle and see that there are problems that need to be solved. To do this, you need to:

  • store the results of work conveniently;
  • make onboarding new employees easy;
  • simplify deploying the development environment;
  • set up the model versioning process;
  • have a convenient way to validate models;
  • find a tool for managing model state;
  • find a way to deliver models to production.

Apparently, we need to come up with a workflow that lets us manage this life cycle easily and conveniently. This practice is called MLOps.

MLOps, or DevOps for machine learning, allows data science and IT teams to collaborate and increase the pace of model development and deployment through monitoring, validation, and governance of machine learning models.

You can read what the folks at Google think about all this. From their article it is clear that MLOps is a rather sizable subject.

In the rest of this article I will describe only part of the process. For the implementation I will use the MLflow tool because it is an open-source project, only a small amount of code is needed to integrate it, and it has integrations with popular ML frameworks. You can search the Internet for other tools, such as Kubeflow, SageMaker, Trains, etc., and perhaps find one that suits your needs better.

"Building" MLOps using the MLflow tool as an example

MLflow is an open-source platform for managing the life cycle of ML models (https://mlflow.org/).

MLflow includes four components:

  • MLflow Tracking - covers recording results and the parameters that led to those results;
  • MLflow Projects - lets you package code and reproduce it on any platform;
  • MLflow Models - is responsible for deploying models to production;
  • MLflow Model Registry - lets you store models and manage their state in a centralized repository.

MLflow operates on two entities:

  • a run is a full training cycle, together with the parameters and metrics we want to record;
  • an experiment is a "topic" that groups runs together.
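
To make the two entities concrete, here is a minimal sketch of logging one run inside an experiment with the MLflow fluent API. The helper names `flatten_params` and `log_training_run` and the shape of their inputs are assumptions made for illustration; they are not part of the example project. MLflow is imported inside the function so that the pure helper can be used without it installed.

```python
def flatten_params(config, prefix=""):
    """Flatten a nested config dict into dotted keys suitable for logging."""
    flat = {}
    for key, value in config.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten_params(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def log_training_run(experiment_name, config, metrics_per_epoch, tracking_uri=None):
    """Record one run (parameters plus per-epoch metrics) under an experiment."""
    import mlflow  # imported lazily; requires `pip install mlflow`

    if tracking_uri is not None:
        mlflow.set_tracking_uri(tracking_uri)
    mlflow.set_experiment(experiment_name)  # the "topic" that groups runs

    with mlflow.start_run():  # one run = one full training cycle
        mlflow.log_params(flatten_params(config))
        for epoch, metrics in enumerate(metrics_per_epoch):
            mlflow.log_metrics(metrics, step=epoch)
```

For instance, `log_training_run("My_experiment", {"lr": 0.001, "model": {"name": "Unet"}}, [{"loss": 0.9}, {"loss": 0.7}])` would record two parameters (`lr` and `model.name`) and a two-point history for `loss`.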

All steps of the example are carried out on the Ubuntu 18.04 operating system.

1. Deploy the server

To manage our project easily and get all the necessary information, we will deploy a server. The MLflow tracking server has two main components:

  • backend store - responsible for storing information about registered models (supports 4 DBMSs: mysql, mssql, sqlite, and postgresql);
  • artifact store - responsible for storing artifacts (supports 7 storage options: Amazon S3, Azure Blob Storage, Google Cloud Storage, FTP server, SFTP server, NFS, HDFS).

For the artifact store, let's take an SFTP server for simplicity.

  • create a group
    $ sudo groupadd sftpg
  • add a user and set a password for it
    $ sudo useradd -g sftpg mlflowsftp
    $ sudo passwd mlflowsftp
  • adjust a few access settings
    $ sudo mkdir -p /data/mlflowsftp/upload
    $ sudo chown -R root:sftpg /data/mlflowsftp
    $ sudo chown -R mlflowsftp:sftpg /data/mlflowsftp/upload
  • add a few lines to /etc/ssh/sshd_config
    Match Group sftpg
     ChrootDirectory /data/%u
     ForceCommand internal-sftp
  • restart the service
    $ sudo systemctl restart sshd

For the backend store, let's take PostgreSQL.

$ sudo apt update
$ sudo apt-get install -y postgresql postgresql-contrib postgresql-server-dev-all
$ sudo apt install gcc
$ pip install psycopg2
$ sudo -u postgres -i
# Create new user: mlflow_user
[postgres@user_name~]$ createuser --interactive -P
Enter name of role to add: mlflow_user
Enter password for new role: mlflow
Enter it again: mlflow
Shall the new role be a superuser? (y/n) n
Shall the new role be allowed to create databases? (y/n) n
Shall the new role be allowed to create more new roles? (y/n) n
# Create database mlflow_db owned by mlflow_user
$ createdb -O mlflow_user mlflow_db

To start the server, you need to install the following Python packages (I recommend creating a separate virtual environment):

pip install mlflow
pip install pysftp

Let's start our server

$ mlflow server \
    --backend-store-uri postgresql://mlflow_user:mlflow@localhost/mlflow_db \
    --default-artifact-root sftp://mlflowsftp:mlflow@sftp_host/upload \
    --host server_host \
    --port server_port
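
The two store URIs in the command above follow a `scheme://user:password@host/path` pattern. As a rough sketch, they can be assembled in Python so that special characters in credentials are percent-encoded. The function names here are hypothetical helpers, not part of MLflow; only the `mlflow server` flags are taken from the command above.

```python
from urllib.parse import quote


def backend_store_uri(user, password, host, database):
    """Build a PostgreSQL URI for --backend-store-uri, encoding the credentials."""
    return f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}@{host}/{database}"


def artifact_root_uri(user, password, host, path):
    """Build an SFTP URI for --default-artifact-root."""
    return f"sftp://{quote(user, safe='')}:{quote(password, safe='')}@{host}/{path.lstrip('/')}"


def server_command(backend_uri, artifact_uri, host, port):
    """Return the argument list for `mlflow server`, ready for subprocess.run."""
    return [
        "mlflow", "server",
        "--backend-store-uri", backend_uri,
        "--default-artifact-root", artifact_uri,
        "--host", host,
        "--port", str(port),
    ]
```

Passing the result of `server_command(...)` to `subprocess.run` avoids shell quoting problems; in a real deployment the passwords would normally come from the environment rather than from source code.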

2. Add tracking

So that the results of our training are not lost, future generations of developers understand what was going on, and you and your senior colleagues can calmly analyze the training process, we need to add tracking. Tracking means saving the parameters, metrics, artifacts, and any additional information about a training run, in our case on the server.

As an example, I created a small Keras project on GitHub that segments everything found in the COCO dataset. To add tracking, I created the file mlflow_training.py.

Here are the lines where the most interesting things happen:

def run(self, epochs, lr, experiment_name):
        # getting the id of the experiment, creating an experiment in its absence
        remote_experiment_id = self.remote_server.get_experiment_id(name=experiment_name)
        # creating a "run" and getting its id
        remote_run_id = self.remote_server.get_run_id(remote_experiment_id)

        # indicate that we want to save the results on a remote server
        mlflow.set_tracking_uri(self.tracking_uri)
        mlflow.set_experiment(experiment_name)

        with mlflow.start_run(run_id=remote_run_id, nested=False):
            mlflow.keras.autolog()
            self.train_pipeline.train(lr=lr, epochs=epochs)

        try:
            self.log_tags_and_params(remote_run_id)
        except mlflow.exceptions.RestException as e:
            print(e)

Here self.remote_server is a small wrapper over the mlflow.tracking.MlflowClient methods (I made it for convenience), with which I create an experiment and a run on the server. Next, I specify where the run results should be sent (mlflow.set_tracking_uri(self.tracking_uri)). I enable automatic logging with mlflow.keras.autolog(). Currently MLflow Tracking supports automatic logging for TensorFlow, Keras, Gluon, XGBoost, LightGBM, and Spark. If you have not found your framework or library, you can always log explicitly. Then we start training. Finally, we register the tags and input parameters on the remote server.
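
The wrapper itself is not shown in the article, so the following is only a guess at what such a helper might look like; the author's actual implementation lives in the example repository and may differ. The class name `RemoteServer` and the function `connect` are hypothetical. The client methods used (`get_experiment_by_name`, `create_experiment`, `create_run`) are `MlflowClient` methods.

```python
class RemoteServer:
    """Hypothetical convenience wrapper around an MlflowClient-like object."""

    def __init__(self, client):
        self.client = client

    def get_experiment_id(self, name):
        """Return the id of the named experiment, creating it if it is absent."""
        experiment = self.client.get_experiment_by_name(name)
        if experiment is None:
            return self.client.create_experiment(name)
        return experiment.experiment_id

    def get_run_id(self, experiment_id):
        """Create a new run in the experiment and return its id."""
        run = self.client.create_run(experiment_id)
        return run.info.run_id


def connect(tracking_uri):
    """Build the wrapper on top of a real MlflowClient (requires MLflow)."""
    from mlflow.tracking import MlflowClient  # imported lazily

    return RemoteServer(MlflowClient(tracking_uri=tracking_uri))
```

Taking the client as a constructor argument keeps the wrapper easy to exercise with a stub, without a running tracking server.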

A few lines of code, and you, like everyone else, have access to information about all the runs. Cool?

3. Set up the project

Now let's make the project easy to launch. To do this, add the MLproject and conda.yaml files to the project root.
MLproject

name: flow_segmentation
conda_env: conda.yaml

entry_points:
  main:
    parameters:
        categories: {help: 'list of categories from coco dataset'}
        epochs: {type: int, help: 'number of epochs in training'}

        lr: {type: float, default: 0.001, help: 'learning rate'}
        batch_size: {type: int, default: 8}
        model_name: {type: str, default: 'Unet', help: 'Unet, PSPNet, Linknet, FPN'}
        backbone_name: {type: str, default: 'resnet18', help: 'example resnet18, resnet50, mobilenetv2 ...'}

        tracking_uri: {type: str, help: 'the server address'}
        experiment_name: {type: str, default: 'My_experiment', help: 'remote and local experiment name'}
    command: "python mlflow_training.py 
            --epochs={epochs}
            --categories={categories}
            --lr={lr}
            --tracking_uri={tracking_uri}
            --model_name={model_name}
            --backbone_name={backbone_name}
            --batch_size={batch_size}
            --experiment_name={experiment_name}"

An MLflow Project has several properties:

  • Name - the name of your project;
  • Environment - in my case, conda_env indicates that Anaconda is used to run the project, and the dependency description is in the conda.yaml file;
  • Entry Points - specifies which files can be run and with which parameters (all parameters are logged automatically when training starts).

conda.yaml

name: flow_segmentation
channels:
  - defaults
  - anaconda
dependencies:
  - python==3.7
  - pip:
    - mlflow==1.8.0
    - pysftp==0.2.9
    - Cython==0.29.19
    - numpy==1.18.4
    - pycocotools==2.0.0
    - requests==2.23.0
    - matplotlib==3.2.1
    - segmentation-models==1.0.1
    - Keras==2.3.1
    - imgaug==0.4.0
    - tqdm==4.46.0
    - tensorflow-gpu==1.14.0

You can also use Docker as your runtime environment; for more details, please refer to the documentation.

4. Start training

We clone the project and go to the project directory:

git clone https://github.com/simbakot/mlflow_example.git
cd mlflow_example/

To run it, you need to install the libraries

pip install mlflow
pip install pysftp

Because I use conda_env in the example, Anaconda must be installed on your computer (but you can get around this by installing all the required packages yourself and playing with the launch parameters).

All the preparatory steps are done, and we can start training. From the project root:

$ mlflow run -P epochs=10 -P categories=cat,dog -P tracking_uri=http://server_host:server_port .

After the command is entered, a conda environment is created automatically and training starts.
In the example above I passed the number of training epochs, the categories we want to segment (the original post links to the full list), and the address of our remote server.
The full list of available parameters can be found in the MLproject file.
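
The same launch can be scripted, for example to sweep over several parameter sets. This is a small sketch under stated assumptions: `run_command` and `run_project` are hypothetical helper names, while `mlflow.projects.run` is MLflow's Python entry point for running a project.

```python
def run_command(project_uri, parameters):
    """Build the `mlflow run` argument list: one `-P name=value` pair per parameter."""
    command = ["mlflow", "run"]
    for name, value in parameters.items():
        command += ["-P", f"{name}={value}"]
    command.append(project_uri)
    return command


def run_project(project_uri, parameters):
    """Launch the project's `main` entry point from Python instead of the shell."""
    import mlflow.projects  # imported lazily; requires MLflow

    return mlflow.projects.run(uri=project_uri, entry_point="main", parameters=parameters)
```

Parameters that are omitted fall back to the defaults declared in the MLproject file, so only `epochs`, `categories`, and `tracking_uri` (which have no defaults there) have to be supplied.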

5. Evaluate the training results

After training completes, we can open our server's address in a browser: http://server_host:server_port

Here we see a list of all experiments (top left), as well as information about the runs (center). For each run we can view more detailed information (parameters, metrics, artifacts, and some additional information).

For each metric we can view the history of its changes.

That is, for now we can analyze the results in "manual" mode; you can also set up automatic validation using the MLflow API.
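
One possible shape for such automatic validation is sketched below: fetch the runs of an experiment, pick the best one by a metric, and check it against a threshold. The helper names and the metric name `val_loss` are assumptions; which metrics are actually available depends on what was logged during training.

```python
def best_run(runs, metric, higher_is_better=True):
    """Return the run with the best value of `metric`, skipping runs without it."""
    scored = [run for run in runs if metric in run.data.metrics]
    if not scored:
        return None
    pick = max if higher_is_better else min
    return pick(scored, key=lambda run: run.data.metrics[metric])


def passes_validation(run, metric, threshold, higher_is_better=True):
    """Check whether the run's metric is at least as good as the threshold."""
    value = run.data.metrics.get(metric)
    if value is None:
        return False
    return value >= threshold if higher_is_better else value <= threshold


def fetch_runs(tracking_uri, experiment_name):
    """Load the runs of an experiment from the tracking server (requires MLflow)."""
    from mlflow.tracking import MlflowClient  # imported lazily

    client = MlflowClient(tracking_uri=tracking_uri)
    experiment = client.get_experiment_by_name(experiment_name)
    if experiment is None:
        return []
    return client.search_runs(experiment_ids=[experiment.experiment_id])
```

A validation job could then call `best_run(fetch_runs(uri, "My_experiment"), "val_loss", higher_is_better=False)` and only proceed to registration if `passes_validation` returns true.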

6. Register the model

After we have analyzed our model and decided that it is ready for battle, we proceed to register it: we select the run we need (as shown in the previous section) and scroll down.

After we give our model a name, it gets a version. If you save another model under the same name, the version is incremented automatically.

For each model we can add a description and select one of three states (Staging, Production, Archived); later, using the API, we can access these states, which, along with versioning, gives additional flexibility.

We also have convenient access to all models and their versions.

As in the previous section, all operations can be performed using the API.
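
As an illustration of doing the registration through the API, the sketch below registers the model logged by a run and moves the new version to a stage. It assumes an MLflow release with stage-based registry calls, such as the 1.8.0 pinned in conda.yaml; newer releases may handle stages differently. The helper names are hypothetical, and the artifact path `model` is the one used by the download call in the next section.

```python
VALID_STAGES = ("Staging", "Production", "Archived")


def model_uri(run_id, artifact_path="model"):
    """URI of a model artifact logged by a run, in MLflow's `runs:/` scheme."""
    return f"runs:/{run_id}/{artifact_path}"


def promote(client, model_name, version, stage):
    """Move a registered model version to one of the three stages."""
    if stage not in VALID_STAGES:
        raise ValueError(f"unknown stage {stage!r}, expected one of {VALID_STAGES}")
    return client.transition_model_version_stage(name=model_name, version=version, stage=stage)


def register_and_promote(tracking_uri, run_id, model_name, stage="Staging"):
    """Register the run's model under `model_name` and set its stage (requires MLflow)."""
    import mlflow  # imported lazily
    from mlflow.tracking import MlflowClient

    mlflow.set_tracking_uri(tracking_uri)
    model_version = mlflow.register_model(model_uri(run_id), model_name)
    client = MlflowClient(tracking_uri=tracking_uri)
    return promote(client, model_name, model_version.version, stage)
```

Registering a second run under the same name produces the next version number, matching the behavior described for the UI.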

7. Deploy the model

At this stage we already have a trained (Keras) model. An example of how you can use it:

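import cv2
import mlflow.keras
import numpy as np

# RemoteRegistry is the author's small wrapper around mlflow.tracking.MlflowClient;
# it is assumed to be defined elsewhere in the project.


class SegmentationModel: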
    def __init__(self, tracking_uri, model_name):

        self.registry = RemoteRegistry(tracking_uri=tracking_uri)
        self.model_name = model_name
        self.model = self.build_model(model_name)

    def get_latest_model(self, model_name):
        registered_models = self.registry.get_registered_model(model_name)
        last_model = self.registry.get_last_model(registered_models)
        local_path = self.registry.download_artifact(last_model.run_id, 'model', './')
        return local_path

    def build_model(self, model_name):
        local_path = self.get_latest_model(model_name)

        return mlflow.keras.load_model(local_path)

    def predict(self, image):
        image = self.preprocess(image)
        result = self.model.predict(image)
        return self.postprocess(result)

    def preprocess(self, image):
        image = cv2.resize(image, (256, 256))
        image = image / 255.
        image = np.expand_dims(image, 0)
        return image

    def postprocess(self, result):
        return result

Here self.registry is again a small wrapper over mlflow.tracking.MlflowClient, for convenience. The point is that I access the remote server and look there for a model with the specified name, specifically its latest production version. Next, I download the artifact locally to the ./model folder and build the model from this directory with mlflow.keras.load_model(local_path). Now we can use our model. CV (ML) developers can easily improve the model and publish new versions.
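
The registry wrapper is not shown either, so here is a rough sketch of the lookup it is described as doing: find the latest production version of a named model and download its artifact. The function names are hypothetical and the author's real helper may be organized differently; `search_model_versions` and `download_artifacts` are `MlflowClient` methods in the MLflow release pinned in conda.yaml.

```python
def latest_version(versions, stage="Production"):
    """Pick the highest-numbered model version in the given stage, or None."""
    candidates = [v for v in versions if v.current_stage == stage]
    if not candidates:
        return None
    # Version numbers may arrive as strings, so compare them numerically.
    return max(candidates, key=lambda v: int(v.version))


def download_latest_model(tracking_uri, model_name, dst_path="./", stage="Production"):
    """Download the `model` artifact of the latest version in `stage` (requires MLflow)."""
    from mlflow.tracking import MlflowClient  # imported lazily

    client = MlflowClient(tracking_uri=tracking_uri)
    versions = client.search_model_versions(f"name='{model_name}'")
    chosen = latest_version(versions, stage=stage)
    if chosen is None:
        raise LookupError(f"no {stage} version found for model {model_name!r}")
    return client.download_artifacts(chosen.run_id, "model", dst_path)
```

The returned local path can be handed to `mlflow.keras.load_model`, as in `build_model` above.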

Conclusion

I have presented a system that lets you:

  • centrally store information about ML models, training progress, and results;
  • quickly deploy a development environment;
  • monitor and analyze the progress of work on models;
  • conveniently version models and manage their state;
  • easily deploy the resulting models.

This example is a toy and serves as a starting point for building your own system, which may include automating the evaluation of results and the registration of models (points 5 and 6, respectively); or you might add dataset versioning, or maybe something else. The point I was trying to make is that you need MLOps as a whole; MLflow is just a means to an end.

Tell me, what problems have you run into that I did not cover?
What would you add to the system so that it meets your needs?
What tools and approaches do you use to solve all or some of these problems?

P.S. I'll leave a few links:
GitHub project - https://github.com/simbakot/mlflow_example
MLflow - https://mlflow.org/
my work email, for questions - [email protected]

Our company periodically holds various events for IT specialists. For example, on July 8 at 19:00 Moscow time there will be an online CV meetup; if you are interested, you can take part (registration is linked from the original post).

Source: www.habr.com
