MLOps - Cook book, chapter 1

Hi everyone! I am a CV developer at CROC. We have been implementing computer vision projects for 3 years now. In that time we have done a lot: we monitored drivers to make sure that while driving they did not drink, smoke, or talk on the phone, and that they watched the road rather than daydreaming or gazing at the clouds; we recorded people who drive in dedicated lanes or occupy several parking spaces; we made sure that workers wore helmets, gloves, and so on; we identified employees who wanted to enter a facility; we counted everything we could.

What am I doing all this for?

While implementing projects we hit bumps, lots of bumps. Some of these problems you already know, and others you will meet in the future.

Let's simulate the situation

Suppose we got a job at a young company "N" whose business is related to ML. We work on an ML (DL, CV) project, then for some reason switch to other work or simply take a break, and later return to our own or someone else's neural network.

  1. The moment of truth arrives: you need to somehow remember where you stopped, which hyperparameters you tried and, most importantly, what results they led to. There are many options for where the information about all the runs was stored: in someone's head, in configs, in a notebook, in a working environment in the cloud. I have even seen hyperparameters stored as commented-out lines in the code; quite a flight of fancy. Now imagine that you have returned not to your own project but to the project of someone who has left the company, and you have inherited the code and a model called model_1.pb. To complete the picture and convey all the pain, let's also assume that you are a novice specialist.
  2. Moving on. To run the code, we and everyone who will work with it need to create an environment. It often happens that, for some reason, the environment was not left to us as part of the inheritance. Recreating it can turn out to be a non-trivial task. You don't want to waste time on this step, do you?
  3. We train a model (for example, a car detector). We reach the point where it becomes very good, and it is time to save the result. Let's call it car_detection_v1.pb. Then we train another one, car_detection_v2.pb. Some time later, our colleagues or we ourselves train more and more models using different architectures. As a result, a pile of artifacts accumulates, and information about them has to be painstakingly collected (but we will do that later, because for now we have more important things to do).
  4. OK, that's it! We have a model! Can we start training the next model, develop an architecture to solve a new problem, or go and have some tea? And who is going to deploy it?

Identifying problems

Working on a project or product is the work of many people. Over time people leave and arrive, there are more projects, and the projects themselves become more complex. One way or another, situations from the cycle described above (and not only those) will occur in various combinations from iteration to iteration. All this results in wasted time, confusion, frayed nerves, possibly dissatisfied customers and, ultimately, lost money. Although we all tend to step on the same old rake, I believe nobody wants to relive these moments again and again.

So, we have gone through one development cycle and see that there are problems that need to be solved. To do this you need to:

  • store the results of work conveniently;
  • make the process of onboarding new employees simple;
  • simplify the process of deploying a development environment;
  • set up a model versioning process;
  • have a convenient way to validate models;
  • find a tool for managing model state;
  • find a way to deliver models to production.

Apparently, we need to come up with a workflow that lets us manage this life cycle easily and conveniently. This practice is called MLOps.

MLOps, or DevOps for machine learning, allows data science and IT teams to collaborate and to increase the pace of model development and deployment through monitoring, validation, and governance of machine learning models.

You can read what the folks at Google think about all this. From that article it is clear that MLOps is quite a bulky thing.

In the rest of this article I will describe only part of the process. For the implementation I will use MLflow, because it is an open-source project, only a small amount of code is needed to connect it, and it integrates with popular ML frameworks. You can search the Internet for other tools, such as Kubeflow, SageMaker, Trains, etc., and perhaps find one that better suits your needs.

"Building" MLOps using MLflow as an example

MLflow is an open-source platform for managing the lifecycle of ML models (https://mlflow.org/).

MLflow includes four components:

  • MLflow Tracking - covers recording results and the parameters that led to them;
  • MLflow Projects - lets you package code and reproduce it on any platform;
  • MLflow Models - responsible for deploying models to production;
  • MLflow Registry - lets you store models and manage their state in a centralized repository.

MLflow operates on two entities:

  • a run is a full training cycle, together with the parameters and metrics we want to record for it;
  • an experiment is a "topic" that groups runs together.
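
To make the two entities concrete, here is a minimal sketch of logging one run into an experiment with MLflow's fluent API. The function name, the experiment name, the parameter names and the loss values are made up for illustration; the `mlflow` module is passed in as `tracker` only so that the function can be tried out without a server.

```python
def log_demo_run(tracker, experiment_name, params, losses):
    """Log one run (params + per-epoch loss) into the named experiment.

    `tracker` is expected to be the `mlflow` module itself.
    """
    # An experiment is the "topic"; set_experiment creates it if it is missing.
    tracker.set_experiment(experiment_name)
    # A run is one training cycle inside that experiment.
    with tracker.start_run() as run:
        tracker.log_params(params)
        for epoch, loss in enumerate(losses):
            tracker.log_metric("loss", loss, step=epoch)
    return run.info.run_id


# Usage (assumes `pip install mlflow`; with no tracking URI configured the
# results are written to a local ./mlruns directory):
#   import mlflow
#   log_demo_run(mlflow, "demo_experiment", {"lr": 0.001}, [0.9, 0.5, 0.3])
```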

All steps of the example are carried out on the Ubuntu 18.04 operating system.

1. Deploy the server

To make it easy to manage our project and get all the necessary information, we will deploy a server. The MLflow tracking server has two main components:

  • backend store - responsible for storing information about registered models (supports 4 DBMSs: mysql, mssql, sqlite, and postgresql);
  • artifact store - responsible for storing artifacts (supports 7 storage options: Amazon S3, Azure Blob Storage, Google Cloud Storage, FTP server, SFTP server, NFS, HDFS).

As the artifact store, for simplicity, let's take an sftp server.

  • create a group
    $ sudo groupadd sftpg
  • add a user and set a password for it
    $ sudo useradd -g sftpg mlflowsftp
    $ sudo passwd mlflowsftp
  • adjust a couple of access settings
    $ sudo mkdir -p /data/mlflowsftp/upload
    $ sudo chown -R root:sftpg /data/mlflowsftp
    $ sudo chown -R mlflowsftp:sftpg /data/mlflowsftp/upload
  • add a few lines to /etc/ssh/sshd_config
    Match Group sftpg
     ChrootDirectory /data/%u
     ForceCommand internal-sftp
  • restart the service
    $ sudo systemctl restart sshd

As the backend store, let's take postgresql.

$ sudo apt update
$ sudo apt-get install -y postgresql postgresql-contrib postgresql-server-dev-all
$ sudo apt install gcc
$ pip install psycopg2
$ sudo -u postgres -i
# Create new user: mlflow_user
[postgres@user_name~]$ createuser --interactive -P
Enter name of role to add: mlflow_user
Enter password for new role: mlflow
Enter it again: mlflow
Shall the new role be a superuser? (y/n) n
Shall the new role be allowed to create databases? (y/n) n
Shall the new role be allowed to create more new roles? (y/n) n
# Create database mlflow_db owned by mlflow_user
$ createdb -O mlflow_user mlflow_db

To start the server, you need to install the following Python packages (I recommend creating a separate virtual environment):

pip install mlflow
pip install pysftp

Let's start our server

$ mlflow server \
      --backend-store-uri postgresql://mlflow_user:mlflow@localhost/mlflow_db \
      --default-artifact-root sftp://mlflowsftp:mlflow@sftp_host/upload \
      --host server_host \
      --port server_port
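
Both URIs embed credentials, so a password containing characters such as `@` or `/` has to be percent-encoded, otherwise the URI is split in the wrong place. The sketch below shows one way to assemble the two values; the helper names are hypothetical, and the sample values simply repeat the ones used in the command above.

```python
from urllib.parse import quote


def backend_store_uri(user, password, host, database):
    """Build a PostgreSQL URI suitable for --backend-store-uri."""
    return "postgresql://{}:{}@{}/{}".format(
        quote(user, safe=""), quote(password, safe=""), host, database)


def artifact_root_uri(user, password, host, path):
    """Build an SFTP URI suitable for --default-artifact-root."""
    return "sftp://{}:{}@{}/{}".format(
        quote(user, safe=""), quote(password, safe=""), host, path.lstrip("/"))


print(backend_store_uri("mlflow_user", "mlflow", "localhost", "mlflow_db"))
print(artifact_root_uri("mlflowsftp", "mlflow", "sftp_host", "/upload"))
```

With the simple password from the example nothing changes; the encoding only matters once the password contains reserved characters.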

2. Add tracking

So that the results of our training are not lost, future generations of developers understand what was going on, and older colleagues and you yourself can calmly analyze the training process, we need to add tracking. Tracking means saving parameters, metrics, artifacts and any additional information about a training run, in our case on the server.

As an example, I created a small project on github in Keras for segmenting everything in the COCO dataset. To add tracking, I created the file mlflow_training.py.

Here are the lines where the most interesting things happen:

import mlflow
import mlflow.keras


# Excerpt: a method of the training class in mlflow_training.py.
def run(self, epochs, lr, experiment_name):
    # get the id of the experiment, creating the experiment if it does not exist
    remote_experiment_id = self.remote_server.get_experiment_id(name=experiment_name)
    # create a "run" on the server and get its id
    remote_run_id = self.remote_server.get_run_id(remote_experiment_id)

    # indicate that we want to save the results on a remote server
    mlflow.set_tracking_uri(self.tracking_uri)
    mlflow.set_experiment(experiment_name)

    # resume the run created above and let Keras autologging record the training
    with mlflow.start_run(run_id=remote_run_id, nested=False):
        mlflow.keras.autolog()
        self.train_pipeline.train(lr=lr, epochs=epochs)

    # register tags and input parameters; a REST error is printed, not raised
    try:
        self.log_tags_and_params(remote_run_id)
    except mlflow.exceptions.RestException as e:
        print(e)

Here self.remote_server is a small wrapper over the mlflow.tracking.MlflowClient methods (I made it for convenience), with which I create an experiment and a run on the server. Next, I indicate where the results of the run should be sent (mlflow.set_tracking_uri(self.tracking_uri)). I enable automatic logging with mlflow.keras.autolog(). Currently MLflow Tracking supports automatic logging for TensorFlow, Keras, Gluon, XGBoost, LightGBM, and Spark. If you have not found your framework or library, you can always log explicitly. We start training. Finally, we register the tags and input parameters on the remote server.
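
The wrapper itself is not shown in the article, so the following is only a sketch of what such a class might look like. The class name, method bodies and the `log_explicitly` helper are assumptions; the client calls (`get_experiment_by_name`, `create_experiment`, `create_run`, `log_param`, `log_metric`, `set_tag`) are regular `MlflowClient` methods. The last method also illustrates explicit logging for frameworks without autologging.

```python
class RemoteTracking:
    """Hypothetical thin wrapper around an MlflowClient-like object."""

    def __init__(self, client):
        self.client = client

    def get_experiment_id(self, name):
        # Reuse the experiment if it exists, otherwise create it.
        experiment = self.client.get_experiment_by_name(name)
        if experiment is None:
            return self.client.create_experiment(name)
        return experiment.experiment_id

    def get_run_id(self, experiment_id):
        # Create an empty run on the server and return its id.
        return self.client.create_run(experiment_id).info.run_id

    def log_explicitly(self, run_id, params, metrics, tags):
        # Explicit logging, for frameworks without autologging support.
        for key, value in params.items():
            self.client.log_param(run_id, key, value)
        for key, value in metrics.items():
            self.client.log_metric(run_id, key, value)
        for key, value in tags.items():
            self.client.set_tag(run_id, key, value)


def make_remote_tracking(tracking_uri):
    # Imported lazily so the class above can be tried out with a stub client.
    from mlflow.tracking import MlflowClient
    return RemoteTracking(MlflowClient(tracking_uri=tracking_uri))
```

Passing the client in, rather than constructing it inside the class, keeps the wrapper easy to exercise without a running server.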

A couple of lines and you, like everyone else, have access to information about all the runs. Cool?

3. Set up the project

Now let's make the project easy to launch. To do this, add the MLproject and conda.yaml files to the project root.
MLproject

name: flow_segmentation
conda_env: conda.yaml

entry_points:
  main:
    parameters:
        categories: {help: 'list of categories from coco dataset'}
        epochs: {type: int, help: 'number of epochs in training'}

        lr: {type: float, default: 0.001, help: 'learning rate'}
        batch_size: {type: int, default: 8}
        model_name: {type: str, default: 'Unet', help: 'Unet, PSPNet, Linknet, FPN'}
        backbone_name: {type: str, default: 'resnet18', help: 'example resnet18, resnet50, mobilenetv2 ...'}

        tracking_uri: {type: str, help: 'the server address'}
        experiment_name: {type: str, default: 'My_experiment', help: 'remote and local experiment name'}
    command: "python mlflow_training.py 
            --epochs={epochs}
            --categories={categories}
            --lr={lr}
            --tracking_uri={tracking_uri}
            --model_name={model_name}
            --backbone_name={backbone_name}
            --batch_size={batch_size}
            --experiment_name={experiment_name}"

An MLflow Project has several properties:

  • Name - the name of your project;
  • Environment - in my case, conda_env indicates that Anaconda is used to run the project and that the dependencies are described in the conda.yaml file;
  • Entry Points - indicate which files we can run and with which parameters (all parameters are logged automatically when training starts)
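
On the Python side, the entry point command passes each parameter as a `--name=value` argument, so the script needs a matching command-line parser. The actual parser in mlflow_training.py is not shown in the article; the sketch below is an assumed version that mirrors the names and defaults from the MLproject file and splits the comma-separated categories into a list.

```python
import argparse


def parse_categories(value):
    # "cat,dog" -> ["cat", "dog"]; empty items are dropped.
    return [item.strip() for item in value.split(",") if item.strip()]


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Training entry point")
    parser.add_argument("--epochs", type=int, required=True)
    parser.add_argument("--categories", type=parse_categories, required=True)
    parser.add_argument("--lr", type=float, default=0.001)
    parser.add_argument("--batch_size", type=int, default=8)
    parser.add_argument("--model_name", default="Unet")
    parser.add_argument("--backbone_name", default="resnet18")
    parser.add_argument("--tracking_uri", required=True)
    parser.add_argument("--experiment_name", default="My_experiment")
    return parser.parse_args(argv)


args = parse_args(["--epochs=10", "--categories=cat,dog",
                   "--tracking_uri=http://server_host:server_port"])
print(args.epochs, args.categories, args.lr)
```

Parameters without a default in MLproject (epochs, categories, tracking_uri) are marked as required here, so a missing value fails fast instead of surfacing later during training.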

conda.yaml

name: flow_segmentation
channels:
  - defaults
  - anaconda
dependencies:
  - python==3.7
  - pip:
    - mlflow==1.8.0
    - pysftp==0.2.9
    - Cython==0.29.19
    - numpy==1.18.4
    - pycocotools==2.0.0
    - requests==2.23.0
    - matplotlib==3.2.1
    - segmentation-models==1.0.1
    - Keras==2.3.1
    - imgaug==0.4.0
    - tqdm==4.46.0
    - tensorflow-gpu==1.14.0

You can also use docker as the runtime environment; for more details, see the documentation.

4. Start training

Clone the project and go to the project directory:

git clone https://github.com/simbakot/mlflow_example.git
cd mlflow_example/

To run it, you need to install the libraries

pip install mlflow
pip install pysftp

Since the example uses conda_env, Anaconda must be installed on your computer (but you can get around this by installing all the required packages yourself and adjusting the launch parameters).

All the preparatory steps are done and we can launch training. From the project root:

$ mlflow run -P epochs=10 -P categories=cat,dog -P tracking_uri=http://server_host:server_port .

After you enter the command, a conda environment is created automatically and training starts.
In the example above, I passed the number of training epochs, the categories we want to segment (the full list is the set of COCO dataset categories), and the address of our remote server.
The full list of possible parameters can be found in the MLproject file.
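
If you want to launch several trainings in a row, for example with different learning rates, the same command line can be assembled from a script. This is only a sketch: the helper names are hypothetical, and the loop below prints the commands instead of executing them; call `launch` to actually start a run.

```python
import subprocess


def mlflow_run_command(project_dir, **params):
    """Return the argv list for `mlflow run` with -P key=value pairs."""
    command = ["mlflow", "run"]
    for key, value in params.items():
        command += ["-P", "{}={}".format(key, value)]
    command.append(project_dir)
    return command


def launch(project_dir, **params):
    # Blocks until training finishes; raises CalledProcessError on failure.
    subprocess.run(mlflow_run_command(project_dir, **params), check=True)


for lr in (0.001, 0.0001):
    print(" ".join(mlflow_run_command(
        ".", epochs=10, categories="cat,dog", lr=lr,
        tracking_uri="http://server_host:server_port")))
```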

5. Evaluate the training results

After training is complete, we can open the address of our server in the browser: http://server_host:server_port

Here we see a list of all experiments (top left), as well as information about the runs (center). For each run we can view more detailed information (parameters, metrics, artifacts and some additional information).

For each metric we can view the history of changes

That is, for now we can analyze the results in "manual" mode, and you can also set up automatic validation using the MLflow API.
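
As a rough idea of what automatic validation could look like, the sketch below asks the tracking server for the run with the lowest value of a metric and optionally rejects it if it is above a threshold. The function name, the metric name `val_loss` and the threshold are assumptions for illustration; `search_runs` with an `order_by` clause is a regular `MlflowClient` method.

```python
def best_run(client, experiment_id, metric="val_loss", threshold=None):
    """Return (run_id, value) of the run with the lowest `metric`, or None."""
    runs = client.search_runs(
        experiment_ids=[experiment_id],
        order_by=["metrics.{} ASC".format(metric)],
        max_results=1,
    )
    if not runs:
        return None
    value = runs[0].data.metrics.get(metric)
    # Reject runs that never logged the metric or that miss the threshold.
    if value is None or (threshold is not None and value > threshold):
        return None
    return runs[0].info.run_id, value


# Usage (assumes a reachable tracking server and an existing experiment id):
#   from mlflow.tracking import MlflowClient
#   client = MlflowClient(tracking_uri="http://server_host:server_port")
#   print(best_run(client, experiment_id="1", threshold=0.5))
```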

6. Register the model

Once we have analyzed our model and decided that it is ready for battle, we proceed to register it. To do this, we select the run we need (as shown in the previous section) and scroll down.

After we give our model a name, it gets a version. If you save another model under the same name, the version is incremented automatically.

For each model we can add a description and select one of three stages (Staging, Production, Archived). Later, using the API, we can query these stages, which, together with versioning, gives additional flexibility.

We also have easy access to all models and their versions.

As in the previous section, all of these operations can also be done using the API.
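
For instance, moving a registered version between stages and listing the newest version in each stage might look like the sketch below. The function names are hypothetical; `transition_model_version_stage` and `get_latest_versions` are `MlflowClient` methods in the MLflow 1.x line used in this article (newer MLflow releases are moving away from stages, so check the documentation for your version).

```python
def latest_by_stage(client, model_name):
    """Map each stage to the newest version number currently in it."""
    latest = client.get_latest_versions(model_name)
    return {mv.current_stage: mv.version for mv in latest}


def promote_version(client, model_name, version, stage="Production"):
    """Move one model version to `stage` and return the stage map afterwards."""
    client.transition_model_version_stage(
        name=model_name, version=version, stage=stage)
    return latest_by_stage(client, model_name)


# Usage (assumes a reachable server and an already registered model name):
#   from mlflow.tracking import MlflowClient
#   client = MlflowClient(tracking_uri="http://server_host:server_port")
#   print(promote_version(client, "my_model", version=2))
```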

7. Deploy the model

At this stage we already have a trained (keras) model. Here is an example of how you can use it:

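import cv2
import numpy as np
import mlflow.keras

# RemoteRegistry is the author's small wrapper over mlflow.tracking.MlflowClient
# (defined elsewhere in the project); it is not part of MLflow itself.


class SegmentationModel: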
    def __init__(self, tracking_uri, model_name):

        self.registry = RemoteRegistry(tracking_uri=tracking_uri)
        self.model_name = model_name
        self.model = self.build_model(model_name)

    def get_latest_model(self, model_name):
        registered_models = self.registry.get_registered_model(model_name)
        last_model = self.registry.get_last_model(registered_models)
        local_path = self.registry.download_artifact(last_model.run_id, 'model', './')
        return local_path

    def build_model(self, model_name):
        local_path = self.get_latest_model(model_name)

        return mlflow.keras.load_model(local_path)

    def predict(self, image):
        image = self.preprocess(image)
        result = self.model.predict(image)
        return self.postprocess(result)

    def preprocess(self, image):
        image = cv2.resize(image, (256, 256))
        image = image / 255.
        image = np.expand_dims(image, 0)
        return image

    def postprocess(self, result):
        return result

Here self.registry is again a small wrapper over mlflow.tracking.MlflowClient, for convenience. The point is that I access the remote server and look there for the model with the specified name, taking its latest production version. Next, I download the artifact locally to the ./model folder and build the model from that directory with mlflow.keras.load_model(local_path). Now we can use our model. CV (ML) developers can easily improve the model and publish new versions.
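
The registry wrapper is not shown in the article, so here is a sketch of the lookup and download step written directly against a client object. The function name and the error handling are assumptions; `get_latest_versions` and `download_artifacts` are `MlflowClient` methods as available in the MLflow 1.x line pinned in conda.yaml.

```python
def download_production_model(client, model_name, dst_dir="./", artifact_path="model"):
    """Download the newest Production version's artifacts; return the local path."""
    versions = client.get_latest_versions(model_name, stages=["Production"])
    if not versions:
        raise LookupError(
            "no Production version registered for {!r}".format(model_name))
    # Normally a single entry is returned; max() guards against several.
    latest = max(versions, key=lambda mv: int(mv.version))
    return client.download_artifacts(latest.run_id, artifact_path, dst_dir)


# Usage (assumes a reachable server and a model promoted to Production):
#   from mlflow.tracking import MlflowClient
#   import mlflow.keras
#   client = MlflowClient(tracking_uri="http://server_host:server_port")
#   model = mlflow.keras.load_model(download_production_model(client, "my_model"))
```

Raising an error when no Production version exists makes a misconfigured deployment fail at startup rather than at the first prediction.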

In conclusion

I presented a system that lets you:

  • store information about ML models, training progress and results in one place;
  • quickly deploy a development environment;
  • monitor and analyze the progress of work on models;
  • conveniently version models and manage their state;
  • easily deploy the resulting models.

This example is a toy and serves as a starting point for building your own system, which may include automating the evaluation of results and the registration of models (points 5 and 6, respectively), or you may add dataset versioning, or perhaps something else. The point I was trying to make is that you need MLOps as a whole; MLflow is just a means to an end.

Write about the problems you have encountered that I did not cover.
What would you add to the system to make it meet your needs?
What tools and approaches do you use to solve all or some of these problems?

P.S. I will leave a couple of links:
github project - https://github.com/simbakot/mlflow_example
MLflow - https://mlflow.org/
My work email for questions - [email protected]

Our company periodically hosts various events for IT specialists. For example, on July 8 at 19:00 Moscow time there will be an online CV meetup; if you are interested, you can register and take part.

Source: www.habr.com
