How to build your own autoscaler for a cluster

Hello! We train people to work with big data. It is impossible to imagine a big data curriculum without its own cluster, where all participants work together. For this reason, our program always has one 🙂 We handle its configuration, tuning and administration, and the students launch MapReduce jobs directly on it and use Spark.

In this post we will tell you how we solved the problem of uneven cluster load by writing our own autoscaler on top of the Mail.ru Cloud Solutions cloud.

The problem

Our cluster is not used in a typical way. The load is highly uneven. For example, there are practical classes, when all 30 people and the instructor log in to the cluster and start using it. There are also the days before a deadline, when the load rises sharply. The rest of the time the cluster runs underloaded.

Solution #1 is to keep a cluster that can withstand peak loads but sits idle the rest of the time.

Solution #2 is to keep a small cluster and manually add nodes before classes and during peak loads.

Solution #3 is to keep a small cluster and write an autoscaler that monitors the cluster's current load and, using various APIs, adds nodes to the cluster and removes them from it.

In this post we will talk about solution #3. Such an autoscaler depends heavily on external factors rather than internal ones, and providers often do not offer one. We use the Mail.ru Cloud Solutions cloud infrastructure and wrote the autoscaler using the MCS API. And since we teach how to work with data, we decided to show how you can write a similar autoscaler for your own purposes and use it with your own cloud.

Prerequisites

First, you must have a Hadoop cluster. We, for example, use the HDP distribution.

For your nodes to be added and removed quickly, you need a certain distribution of roles among the nodes.

  1. Master node. Nothing special needs explaining here: it is the main node of the cluster, where, for example, the Spark driver is launched if you use interactive mode.
  2. Data node. This is a node on which you store data in HDFS and where computations take place.
  3. Compute node. This is a node on which you store nothing in HDFS, but where computations take place.

An important point. Autoscaling is done with nodes of the third type. If you start removing and adding nodes of the second type, the response will be very slow: decommissioning and recommissioning will take hours on your cluster. That, of course, is not what you expect from autoscaling. So we do not touch nodes of the first and second types. They form the minimum viable cluster that exists for the whole duration of the program.

So, our autoscaler is written in Python 3, uses the Ambari API to manage cluster services, and uses the Mail.ru Cloud Solutions (MCS) API to start and stop machines.

Solution architecture

  1. Module autoscaler.py. It contains three classes: 1) functions for working with Ambari, 2) functions for working with MCS, 3) functions directly related to the autoscaler logic.
  2. Script observer.py. In essence it consists of various rules: when and at which moments to call the autoscaler functions.
  3. Configuration file config.py. It contains, for example, the list of nodes allowed for autoscaling and other parameters that affect, for instance, how long to wait after a new node has been added. It also holds the timestamps of class start times, so that the maximum allowed cluster configuration is launched before a class.
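To make the role of config.py more concrete, here is a minimal sketch of what such a file might contain. The names scaling_hosts, cooldown_period and scale_up_thresholds appear in the code fragments in this post; class_start_times, the shape of the thresholds and every value are assumptions made for illustration only.

```python
# config.py (illustrative sketch; all values are placeholders)
from datetime import datetime

# Compute nodes the autoscaler is allowed to add or remove (hypothetical hostnames).
scaling_hosts = [
    "compute-1.example.internal",
    "compute-2.example.internal",
]

# Minutes to wait after a scaling action before acting again.
cooldown_period = 10

# Assumed shape: fraction of YARN memory and CPU in use above which we scale up.
scale_up_thresholds = {"ram": 0.8, "cpu": 0.8}

# Class start times (hypothetical name); the cluster is brought to its
# maximum allowed size shortly before each of them.
class_start_times = [
    datetime(2020, 3, 2, 19, 0),
    datetime(2020, 3, 4, 19, 0),
]
```

Keeping these values in a plain Python module means the observer script can simply import config and read them as attributes.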

Let's now look at pieces of code from the first two files.

1. The autoscaler.py module

The Ambari class

Here is what a piece of code containing the Ambari class looks like:

import json

import requests


class Ambari:
    def __init__(self, ambari_url, cluster_name, headers, auth):
        self.ambari_url = ambari_url
        self.cluster_name = cluster_name
        self.headers = headers
        self.auth = auth

    def stop_all_services(self, hostname):
        url = self.ambari_url + self.cluster_name + '/hosts/' + hostname + '/host_components/'
        url2 = self.ambari_url + self.cluster_name + '/hosts/' + hostname
        # Step 1: ask Ambari which components are installed on this host.
        req0 = requests.get(url2, headers=self.headers, auth=self.auth)
        services = req0.json()['host_components']
        services_list = [x['HostRoles']['component_name'] for x in services]
        # Step 2: ask Ambari to move all of them to INSTALLED, i.e. stopped.
        data = {
            "RequestInfo": {
                "context":"Stop All Host Components",
                "operation_level": {
                    "level":"HOST",
                    "cluster_name": self.cluster_name,
                    "host_names": hostname
                },
                "query":"HostRoles/component_name.in({0})".format(",".join(services_list))
            },
            "Body": {
                "HostRoles": {
                    "state":"INSTALLED"
                }
            }
        }
        req = requests.put(url, data=json.dumps(data), headers=self.headers, auth=self.auth)
        # Return a fixed message on success, otherwise the raw HTTP status code.
        if req.status_code in [200, 201, 202]:
            message = 'Request accepted'
        else:
            message = req.status_code
        return message

Above, as an example, you can see the implementation of the stop_all_services function, which stops all services on the desired cluster node.

To the constructor of the Ambari class you pass:

  • ambari_url, for example 'http://localhost:8080/api/v1/clusters/',
  • cluster_name, the name of your cluster in Ambari,
  • headers = {'X-Requested-By': 'ambari'}
  • and auth, which holds your username and password for Ambari: auth = ('login', 'password').

The function itself is nothing more than a couple of calls to Ambari through the REST API. Logically, we first get the list of services running on the node, and then ask, for the given cluster and the given node, to move the services from that list into the INSTALLED state. The functions for starting all services, for putting nodes into the Maintenance state and so on look similar: they are just a few requests through the API.
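As a rough illustration of how similar those other functions are, the sketch below only builds the request bodies and sends nothing. The helper names are hypothetical and not taken from the original module. The first body mirrors the one used in stop_all_services; the maintenance body follows the shape commonly shown for Ambari's REST API and should be checked against your Ambari version.

```python
import json


def host_components_state_payload(cluster_name, hostname, components, state, context):
    """Build the PUT body that asks Ambari to move components on one host to `state`."""
    return {
        "RequestInfo": {
            "context": context,
            "operation_level": {
                "level": "HOST",
                "cluster_name": cluster_name,
                "host_names": hostname,
            },
            "query": "HostRoles/component_name.in({0})".format(",".join(components)),
        },
        "Body": {"HostRoles": {"state": state}},
    }


def maintenance_payload(turn_on):
    """Build the PUT body that switches a host's maintenance mode on or off (assumed shape)."""
    state = "ON" if turn_on else "OFF"
    return {
        "RequestInfo": {"context": "Turn {0} Maintenance Mode".format(state)},
        "Body": {"Hosts": {"maintenance_state": state}},
    }


# Starting services differs from stopping them only in the target state.
start_body = host_components_state_payload(
    "my_cluster", "compute-1.example.internal", ["NODEMANAGER"], "STARTED",
    "Start All Host Components",
)
print(json.dumps(start_body, indent=2))
```

Separating payload construction from the HTTP call also makes these functions easy to check without a running Ambari server.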

The Mcs class

Here is what a piece of code containing the Mcs class looks like:

class Mcs:
    def __init__(self, id1, id2, password):
        self.id1 = id1
        self.id2 = id2
        self.password = password
        self.mcs_host = 'https://infra.mail.ru:8774/v2.1'

    def vm_turn_on(self, hostname):
        # Get a fresh token, then resolve the cluster hostname to a cloud VM id.
        self.token = self.get_mcs_token()
        host = self.hostname_to_vmname(hostname)
        vm_id = self.get_vm_id(host)
        mcs_url1 = self.mcs_host + '/servers/' + vm_id + '/action'
        headers = {
            'X-Auth-Token': '{0}'.format(self.token),
            'Content-Type': 'application/json'
        }
        # The start action takes no arguments; None is serialized as JSON null.
        data = {'os-start': None}
        mcs = requests.post(mcs_url1, data=json.dumps(data), headers=headers)
        return mcs.status_code

To the constructor of the Mcs class we pass the project id in the cloud and the user id, as well as the user's password. Judging by the token request, id1 is used as the user id and id2 as the project id. In the vm_turn_on function we want to turn on one of the machines. The logic here is a bit more complex. At the start of the code three other functions are called: 1) we need to get a token, 2) we need to convert the hostname into the machine's name in MCS, 3) we need to get the id of that machine. After that we simply make a POST request and start the machine.
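The helpers hostname_to_vmname and get_vm_id are not shown in this post. The sketch below is only a guess at what they might do, written without any network calls. It assumes that the VM name is the short hostname, and that the compute API returns a server list under a "servers" key with "id" and "name" fields; find_vm_id and the sample data are hypothetical.

```python
def hostname_to_vmname(hostname):
    """Hypothetical mapping: the VM name is the short hostname without the domain."""
    return hostname.split(".", 1)[0]


def find_vm_id(servers_response, vm_name):
    """Return the id of the server called `vm_name`, or None if it is not listed."""
    for server in servers_response.get("servers", []):
        if server.get("name") == vm_name:
            return server.get("id")
    return None


# A hand-written response in the assumed shape of a "list servers" call.
sample = {
    "servers": [
        {"id": "11111111-aaaa", "name": "compute-1"},
        {"id": "22222222-bbbb", "name": "compute-2"},
    ]
}
vm_name = hostname_to_vmname("compute-2.example.internal")
print(find_vm_id(sample, vm_name))
```

Returning None for an unknown machine lets the caller decide whether to skip the node or raise an error, instead of building a broken URL.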

Here is what the function for obtaining a token looks like:

    def get_mcs_token(self):
        url = 'https://infra.mail.ru:35357/v3/auth/tokens?nocatalog'
        headers = {'Content-Type': 'application/json'}
        data = {
            'auth': {
                'identity': {
                    'methods': ['password'],
                    'password': {
                        'user': {
                            'id': self.id1,
                            'password': self.password
                        }
                    }
                },
                'scope': {
                    'project': {
                        'id': self.id2
                    }
                }
            }
        }
        # The URL already carries ?nocatalog, so no extra query parameters are needed.
        req = requests.post(url, data=json.dumps(data), headers=headers)
        self.token = req.headers['X-Subject-Token']
        return self.token

The Autoscaler class

This class contains the functions related to the operating logic itself.

Here is what a piece of code of this class looks like:

import logging
import time
from collections import deque


class Autoscaler:
    def __init__(self, ambari, mcs, scaling_hosts, yarn_ram_per_node, yarn_cpu_per_node):
        self.scaling_hosts = scaling_hosts
        self.ambari = ambari
        self.mcs = mcs
        self.q_ram = deque()
        self.q_cpu = deque()
        self.num = 0
        self.yarn_ram_per_node = yarn_ram_per_node
        self.yarn_cpu_per_node = yarn_cpu_per_node

    def scale_down(self, hostname):
        flag1 = flag2 = flag3 = flag4 = flag5 = False
        # Default result, so the function never returns an unassigned name.
        message = 'Failed'
        if hostname in self.scaling_hosts:
            # 1. Ask Ambari to decommission the YARN NodeManager on the host.
            while True:
                time.sleep(5)
                status1 = self.ambari.decommission_nodemanager(hostname)
                if status1 == 'Request accepted' or status1 == 500:
                    flag1 = True
                    logging.info('Decommission request accepted: {0}'.format(flag1))
                    break
            # 2. Wait until the NodeManager is reported as INSTALLED (stopped).
            while True:
                time.sleep(5)
                status3 = self.ambari.check_service(hostname, 'NODEMANAGER')
                if status3 == 'INSTALLED':
                    flag3 = True
                    logging.info('NodeManager decommissioned: {0}'.format(flag3))
                    break
            # 3. Put the host into Maintenance mode.
            while True:
                time.sleep(5)
                status2 = self.ambari.maintenance_on(hostname)
                if status2 == 'Request accepted' or status2 == 500:
                    flag2 = True
                    logging.info('Maintenance request accepted: {0}'.format(flag2))
                    break
            # 4. Once Maintenance mode is confirmed, stop all services on the host.
            while True:
                time.sleep(5)
                status4 = self.ambari.check_maintenance(hostname, 'NODEMANAGER')
                if status4 == 'ON' or status4 == 'IMPLIED_FROM_HOST':
                    flag4 = True
                    self.ambari.stop_all_services(hostname)
                    logging.info('Maintenance is on: {0}'.format(flag4))
                    logging.info('Stopping services')
                    break
            # 5. Give the services time to stop, then power off the VM in the cloud.
            time.sleep(90)
            status5 = self.mcs.vm_turn_off(hostname)
            while True:
                time.sleep(5)
                status5 = self.mcs.get_vm_info(hostname)['server']['status']
                if status5 == 'SHUTOFF':
                    flag5 = True
                    logging.info('VM is turned off: {0}'.format(flag5))
                    break
            if flag1 and flag2 and flag3 and flag4 and flag5:
                message = 'Success'
                logging.info('Scale-down finished')
                logging.info('Cooldown period has started. Wait for several minutes')
        return message

As input we accept the Ambari and Mcs classes, the list of nodes allowed for scaling, and the node configuration parameters: the memory and CPU allocated to a node in YARN. There are also 2 internal parameters, q_ram and q_cpu, which are queues. We use them to store the values of the current cluster load. If we see that the load has been consistently elevated over the last 5 minutes, we decide that we need to add +1 node to the cluster. The same holds for cluster underutilization.
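The queue-based decision described above can be sketched as follows. This is not the original implementation: the LoadWindow class, the one-sample-per-minute cadence and the thresholds are all assumptions chosen for illustration.

```python
from collections import deque


class LoadWindow:
    """Keeps the most recent load samples and reports sustained high or low load."""

    def __init__(self, size):
        self.samples = deque(maxlen=size)  # old samples fall off automatically

    def add(self, value):
        self.samples.append(value)

    def is_full(self):
        return len(self.samples) == self.samples.maxlen

    def sustained_above(self, threshold):
        # Only decide once the whole window is filled with samples.
        return self.is_full() and all(v > threshold for v in self.samples)

    def sustained_below(self, threshold):
        return self.is_full() and all(v < threshold for v in self.samples)


# Assumption: one sample per minute, so five samples cover about five minutes.
ram = LoadWindow(size=5)
for used_fraction in [0.85, 0.90, 0.92, 0.88, 0.95]:
    ram.add(used_fraction)

print(ram.sustained_above(0.8))  # candidate for scale-up
print(ram.sustained_below(0.3))  # candidate for scale-down
```

Requiring every sample in the window to cross the threshold avoids reacting to a short spike, at the cost of a few minutes of delay before a node is added.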

The scale_down function shown earlier is an example of a function that removes a machine from the cluster and stops it in the cloud. First the YARN NodeManager is decommissioned, then Maintenance mode is turned on, then we stop all services on the machine and turn off the virtual machine in the cloud.
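The repeated polling loops in scale_down could be factored into one helper with a timeout, so that a node stuck in some state cannot block the autoscaler forever. The sketch below is not from the original code; wait_until and its parameters are hypothetical names.

```python
import time


def wait_until(check, is_done, interval=5, timeout=600,
               sleep=time.sleep, clock=time.monotonic):
    """Call check() every `interval` seconds until is_done(result) is true.

    Returns True on success and False if `timeout` seconds pass first.
    """
    deadline = clock() + timeout
    while True:
        if is_done(check()):
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Stand-in for a status call such as checking the state of a component.
states = iter(["STARTED", "STARTED", "INSTALLED"])
ok = wait_until(lambda: next(states), lambda s: s == "INSTALLED", interval=0)
print(ok)
```

With such a helper, each step of scale_down becomes a single call, and the function can report a failure instead of looping indefinitely.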

2. The observer.py script

Sample code from there:

if scaler.assert_up(config.scale_up_thresholds):
    # Pick a reserve machine from the list of hosts allowed for scaling.
    hostname = cloud.get_vm_to_up(config.scaling_hosts)
    if hostname is not None:
        status1 = scaler.scale_up(hostname)
        if status1 == 'Success':
            # Notify the team in Slack through a webhook.
            post = {"text": "{0} has been successfully scaled-up".format(hostname)}
            json_data = json.dumps(post)
            req = requests.post(webhook, data=json_data.encode('ascii'), headers={'Content-Type': 'application/json'})
            # cooldown_period is in minutes; nothing is scaled during this time.
            time.sleep(config.cooldown_period*60)

In it, we check whether the conditions for increasing cluster capacity have been met and whether there are any machines in reserve, get the hostname of one of them, add it to the cluster and post a message about it to our team's Slack. After that the cooldown_period starts, during which we neither add nor remove anything from the cluster but only monitor the load. If it has stabilized and stays within the corridor of optimal load values, we simply continue monitoring. If one node was not enough, we add another.

For cases when we have a class coming up, we already know for sure that one node will not be enough, so we immediately start all the free nodes and keep them active until the end of the class. This is done using the list of class timestamps.
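A check against the list of class timestamps might look like the sketch below. The function name, the 30-minute lead time and the 3-hour class length are assumptions for illustration; the post does not show how the original script does this.

```python
from datetime import datetime, timedelta


def class_in_progress_or_upcoming(now, class_starts, lead, duration):
    """True if `now` falls between (start - lead) and (start + duration) for any class."""
    return any(start - lead <= now <= start + duration for start in class_starts)


class_starts = [datetime(2020, 3, 2, 19, 0)]
lead = timedelta(minutes=30)   # assumed warm-up time before a class
duration = timedelta(hours=3)  # assumed class length

shortly_before = datetime(2020, 3, 2, 18, 45)
midday = datetime(2020, 3, 2, 12, 0)
print(class_in_progress_or_upcoming(shortly_before, class_starts, lead, duration))
print(class_in_progress_or_upcoming(midday, class_starts, lead, duration))
```

While this check is true, the observer would start every free node and skip scale-down decisions; outside the window it falls back to the load-based rules.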

Conclusion

An autoscaler is a good and convenient solution for cases when your cluster load is uneven. You get the desired cluster configuration for peak loads and, at the same time, do not keep that cluster running during periods of low load, which saves money. On top of that, it all happens automatically, without your involvement. The autoscaler itself is nothing more than a set of requests to the cluster manager API and the cloud provider API, written according to a certain logic. What you definitely need to remember is the division of nodes into 3 types, as we wrote earlier. And you will be happy.

Source: www.habr.com
