How to build your own autoscaler for a cluster

Hello! We train people to work with big data. It is impossible to imagine a big data training program without its own cluster on which all participants work together. For this reason, our program always has one 🙂 We handle its setup, tuning and administration, and the participants launch MapReduce jobs directly on it and use Spark.

In this post we explain how we solved the problem of uneven cluster load by writing our own autoscaler on top of the Mail.ru Cloud Solutions cloud.

Problem

Our cluster is not used in a typical way. Its utilization is highly uneven. For example, there are practical classes, when all 30 people and the teacher log in to the cluster and start using it. There are also the days before a deadline, when the load rises sharply. The rest of the time the cluster runs underloaded.

Solution #1 is to keep a cluster that can withstand peak load but sits idle the rest of the time.

Solution #2 is to keep a small cluster and manually add nodes before classes and during peak load.

Solution #3 is to keep a small cluster and write an autoscaler that monitors the current cluster load and, using various APIs, adds nodes to the cluster and removes them from it.

In this post we discuss solution #3. Such an autoscaler depends heavily on external factors rather than internal ones, and providers often do not offer one. We use the Mail.ru Cloud Solutions cloud infrastructure and wrote the autoscaler using the MCS API. And since we teach how to work with data, we decided to show how you can write a similar autoscaler for your own purposes and use it with your own cloud.

Prerequisites

First, you need a Hadoop cluster. We, for example, use the HDP distribution.

For nodes to be added and removed quickly, roles must be distributed among the nodes in a particular way.

  1. Master node. Not much needs explaining here: this is the main node of the cluster, on which, for example, the Spark driver is launched if you use interactive mode.
  2. Data node. This is a node on which you store data in HDFS and on which computation takes place.
  3. Compute node. This is a node on which you store nothing in HDFS but on which computation takes place.

An important point. Autoscaling is done with nodes of the third type. If you start removing and adding nodes of the second type, the response will be very slow: decommissioning and recommissioning will take hours on your cluster. That, of course, is not what you expect from autoscaling. In other words, we do not touch nodes of the first and second types. They form a minimum viable cluster that exists for the entire duration of the program.

So, our autoscaler is written in Python 3. It uses the Ambari API to manage cluster services and the Mail.ru Cloud Solutions (MCS) API to start and stop machines.

Solution architecture

  1. The autoscaler.py module. It contains three classes: 1) functions for working with Ambari, 2) functions for working with MCS, 3) functions directly related to the autoscaler logic.
  2. The observer.py script. In essence it consists of rules: when and at what moments to call the autoscaler functions.
  3. The config.py configuration file. It contains, for example, the list of nodes allowed for autoscaling and other parameters that affect, for example, how long to wait after a new node has been added. It also holds timestamps for the start of classes, so that the maximum allowed cluster configuration is brought up before a class begins.

Let us now look at code fragments from the first two files.
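To make the role of config.py more concrete, here is a minimal sketch of what such a file might contain. The names scaling_hosts, cooldown_period and scale_up_thresholds appear in the observer.py fragment later in the post; class_windows, the host names, the structure of the thresholds and every value are assumptions made purely for illustration.

```python
# config.py (hypothetical sketch; all values are illustrative)
from datetime import datetime

# Compute nodes that the autoscaler is allowed to start and stop.
scaling_hosts = [
    "compute-1.example.internal",
    "compute-2.example.internal",
]

# Minutes to wait after a scaling action before acting again.
cooldown_period = 10

# Assumed shape: fraction of YARN capacity above which we scale up.
scale_up_thresholds = {"ram": 0.8, "cpu": 0.8}

# Assumed shape: (start, end) pairs for scheduled classes.
class_windows = [
    (datetime(2020, 3, 2, 18, 0), datetime(2020, 3, 2, 22, 0)),
]
```

Keeping these values in a plain Python module means observer.py can simply import config and read them as attributes.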

1. The autoscaler.py module

Ambari class

This is what a code fragment containing the Ambari class looks like:

# Imports used by the autoscaler.py fragments shown in this post.
import json
import logging
import time
from collections import deque

import requests


class Ambari:
    def __init__(self, ambari_url, cluster_name, headers, auth):
        self.ambari_url = ambari_url
        self.cluster_name = cluster_name
        self.headers = headers
        self.auth = auth

    def stop_all_services(self, hostname):
        url = self.ambari_url + self.cluster_name + '/hosts/' + hostname + '/host_components/'
        url2 = self.ambari_url + self.cluster_name + '/hosts/' + hostname
        # First ask Ambari which components live on this host.
        req0 = requests.get(url2, headers=self.headers, auth=self.auth)
        services = req0.json()['host_components']
        services_list = [x['HostRoles']['component_name'] for x in services]
        # Then request the INSTALLED (stopped) state for all of them at once.
        data = {
            "RequestInfo": {
                "context":"Stop All Host Components",
                "operation_level": {
                    "level":"HOST",
                    "cluster_name": self.cluster_name,
                    "host_names": hostname
                },
                "query":"HostRoles/component_name.in({0})".format(",".join(services_list))
            },
            "Body": {
                "HostRoles": {
                    "state":"INSTALLED"
                }
            }
        }
        req = requests.put(url, data=json.dumps(data), headers=self.headers, auth=self.auth)
        if req.status_code in [200, 201, 202]:
            message = 'Request accepted'
        else:
            message = req.status_code
        return message

Above, as an example, you can see the implementation of the stop_all_services function, which stops all services on the required cluster node.

The Ambari class takes the following as input:

  • ambari_url, for example 'http://localhost:8080/api/v1/clusters/',
  • cluster_name, the name of your cluster in Ambari,
  • headers = {'X-Requested-By': 'ambari'},
  • and auth, which holds your Ambari login and password: auth = ('login', 'password').

The function itself is nothing more than a couple of calls to Ambari through its REST API. Logically, we first get the list of services running on the node, and then ask Ambari, for the given cluster and the given node, to move the services from that list to the INSTALLED state. The functions for starting all services, switching nodes to the Maintenance state and so on look similar: they are just a few requests through the API.
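Since the post says the "start all services" function looks much like stop_all_services, here is a rough sketch of how the request body could be built for both cases. It mirrors the payload shown above; the helper names and the context string for starting are our own, and STARTED is the Ambari state for a running component. The post does not show the author's actual implementation.

```python
def build_state_request(cluster_name, hostname, components, state, context):
    """Build the body of a PUT to .../hosts/<hostname>/host_components/."""
    return {
        "RequestInfo": {
            "context": context,
            "operation_level": {
                "level": "HOST",
                "cluster_name": cluster_name,
                "host_names": hostname,
            },
            # Restrict the request to the listed components.
            "query": "HostRoles/component_name.in({0})".format(",".join(components)),
        },
        "Body": {"HostRoles": {"state": state}},
    }


def build_stop_all_request(cluster_name, hostname, components):
    # INSTALLED means "installed but not running", as in stop_all_services.
    return build_state_request(
        cluster_name, hostname, components, "INSTALLED", "Stop All Host Components")


def build_start_all_request(cluster_name, hostname, components):
    # Hypothetical counterpart: ask for the STARTED state instead.
    return build_state_request(
        cluster_name, hostname, components, "STARTED", "Start All Host Components")
```

The resulting dictionary would be sent exactly as in stop_all_services: serialized with json.dumps and passed to requests.put together with the headers and auth.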

Mcs class

This is what a code fragment containing the Mcs class looks like:

class Mcs:
    def __init__(self, id1, id2, password):
        self.id1 = id1
        self.id2 = id2
        self.password = password
        self.mcs_host = 'https://infra.mail.ru:8774/v2.1'

    def vm_turn_on(self, hostname):
        self.token = self.get_mcs_token()
        host = self.hostname_to_vmname(hostname)
        vm_id = self.get_vm_id(host)
        # Use the local vm_id obtained above; self.vm_id is never set in this fragment.
        mcs_url1 = self.mcs_host + '/servers/' + vm_id + '/action'
        headers = {
            'X-Auth-Token': '{0}'.format(self.token),
            'Content-Type': 'application/json'
        }
        data = {'os-start' : 'null'}
        mcs = requests.post(mcs_url1, data=json.dumps(data), headers=headers)
        return mcs.status_code

The Mcs class takes as input the project id within the cloud and the user id, as well as the user's password. Note that in the token function shown later, id1 is sent as the user id and id2 as the project id. In the vm_turn_on function we want to turn on one of the machines. The logic here is slightly more involved. At the start of the code, three other functions are called: 1) we need to get a token, 2) we need to convert the hostname into the machine name used in MCS, 3) we need to get the id of that machine. After that, we simply make a POST request and start the machine.

This is what the function for obtaining a token looks like:
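The post does not show hostname_to_vmname or get_vm_id, so the following is only a sketch of what steps 2 and 3 might do. It assumes that the VM name equals the short host name and that the server list comes back in the OpenStack-style shape {"servers": [{"id": ..., "name": ...}]}; both are assumptions, and find_vm_id is a hypothetical helper name.

```python
def hostname_to_vmname(hostname):
    """Assumption: the VM is named after the part before the first dot."""
    return hostname.split(".", 1)[0]


def find_vm_id(servers_response, vm_name):
    """Return the id of the server called vm_name, or None if it is absent.

    servers_response is the already decoded JSON of a server list request.
    """
    for server in servers_response.get("servers", []):
        if server.get("name") == vm_name:
            return server.get("id")
    return None


# Example with a hand-written response in the assumed shape.
sample = {"servers": [{"id": "abc-123", "name": "compute-1"}]}
vm_id = find_vm_id(sample, hostname_to_vmname("compute-1.example.internal"))
```

Returning None for an unknown name lets the caller decide whether to skip the host or raise an error, instead of failing while building the URL.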

    def get_mcs_token(self):
        url = 'https://infra.mail.ru:35357/v3/auth/tokens?nocatalog'
        headers = {'Content-Type': 'application/json'}
        data = {
            'auth': {
                'identity': {
                    'methods': ['password'],
                    'password': {
                        'user': {
                            'id': self.id1,
                            'password': self.password
                        }
                    }
                },
                'scope': {
                    'project': {
                        'id': self.id2
                    }
                }
            }
        }
        params = (('nocatalog', ''),)
        req = requests.post(url, data=json.dumps(data), headers=headers, params=params)
        self.token = req.headers['X-Subject-Token']
        return self.token

Autoscaler class

This class contains the functions related to the operating logic itself.

This is what a fragment of this class looks like:

class Autoscaler:
    def __init__(self, ambari, mcs, scaling_hosts, yarn_ram_per_node, yarn_cpu_per_node):
        self.scaling_hosts = scaling_hosts
        self.ambari = ambari
        self.mcs = mcs
        self.q_ram = deque()
        self.q_cpu = deque()
        self.num = 0
        self.yarn_ram_per_node = yarn_ram_per_node
        self.yarn_cpu_per_node = yarn_cpu_per_node

    def scale_down(self, hostname):
        flag1 = flag2 = flag3 = flag4 = flag5 = False
        # Default result, so the final return never hits an unassigned name.
        message = 'Failed'
        if hostname in self.scaling_hosts:
            while True:
                time.sleep(5)
                status1 = self.ambari.decommission_nodemanager(hostname)
                if status1 == 'Request accepted' or status1 == 500:
                    flag1 = True
                    logging.info('Decommission request accepted: {0}'.format(flag1))
                    break
            while True:
                time.sleep(5)
                status3 = self.ambari.check_service(hostname, 'NODEMANAGER')
                if status3 == 'INSTALLED':
                    flag3 = True
                    logging.info('NodeManager decommissioned: {0}'.format(flag3))
                    break
            while True:
                time.sleep(5)
                status2 = self.ambari.maintenance_on(hostname)
                if status2 == 'Request accepted' or status2 == 500:
                    flag2 = True
                    logging.info('Maintenance request accepted: {0}'.format(flag2))
                    break
            while True:
                time.sleep(5)
                status4 = self.ambari.check_maintenance(hostname, 'NODEMANAGER')
                if status4 == 'ON' or status4 == 'IMPLIED_FROM_HOST':
                    flag4 = True
                    self.ambari.stop_all_services(hostname)
                    logging.info('Maintenance is on: {0}'.format(flag4))
                    logging.info('Stopping services')
                    break
            time.sleep(90)
            status5 = self.mcs.vm_turn_off(hostname)
            while True:
                time.sleep(5)
                status5 = self.mcs.get_vm_info(hostname)['server']['status']
                if status5 == 'SHUTOFF':
                    flag5 = True
                    logging.info('VM is turned off: {0}'.format(flag5))
                    break
            if flag1 and flag2 and flag3 and flag4 and flag5:
                message = 'Success'
                logging.info('Scale-down finished')
                logging.info('Cooldown period has started. Wait for several minutes')
        return message

As input we accept the Ambari and Mcs classes, the list of nodes allowed for scaling, and the node configuration parameters: the memory and CPU allocated to a node in YARN. There are also two internal parameters, q_ram and q_cpu, which are queues. We use them to store the values of the current cluster load. If we see that over the last 2 minutes the load has been consistently high, we decide that we need to add +5 nodes to the cluster. The same applies to the state in which the cluster is underutilized.

The scale_down code above is an example of a function that removes a machine from the cluster and stops it in the cloud. First the YARN NodeManager is decommissioned, then Maintenance mode is turned on, then we stop all services on the machine and shut down the virtual machine in the cloud.
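Each step of scale_down repeats the same pattern: sleep for 5 seconds, check a status, and loop forever until it matches. As a sketch of one possible refactoring (not the author's code), the pattern can be pulled into a small helper that also gives up after a timeout. The name wait_until and the default values are assumptions.

```python
import time


def wait_until(check, interval=5, timeout=600,
               sleep=time.sleep, clock=time.monotonic):
    """Call check() every `interval` seconds until it returns True.

    Returns True on success and False if `timeout` seconds pass first.
    sleep and clock are injectable so the helper can be tested quickly.
    """
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

With such a helper, the wait for the NodeManager could be written as wait_until(lambda: ambari.check_service(hostname, 'NODEMANAGER') == 'INSTALLED'), and a False result could be logged and reported instead of leaving the loop spinning if a node never reaches the expected state.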

2. The observer.py script

Example code from it:

if scaler.assert_up(config.scale_up_thresholds):
    hostname = cloud.get_vm_to_up(config.scaling_hosts)
    if hostname is not None:
        status1 = scaler.scale_up(hostname)
        if status1 == 'Success':
            text = {"text": "{0} has been successfully scaled-up".format(hostname)}
            post = {"text": "{0}".format(text)}
            json_data = json.dumps(post)
            req = requests.post(webhook, data=json_data.encode('ascii'), headers={'Content-Type': 'application/json'})
            time.sleep(config.cooldown_period*60)

In it, we check whether the conditions for increasing cluster capacity have been met and whether any machines are in reserve, get the hostname of one of them, add it to the cluster and post a message about it in our team's Slack. After that the cooldown_period starts, during which we neither add nor remove anything from the cluster and simply monitor the load. If the load has stabilized and stays within the corridor of optimal values, we just keep monitoring. If one node was not enough, we add another.

When a class is coming up, we already know for certain that one node will not be enough, so we immediately start all the free nodes and keep them running until the end of the class. This is done using a list of activity timestamps.
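The check behind scaler.assert_up is not shown in the post. One way it could work, based on the q_ram and q_cpu queues described for the Autoscaler class, is to keep a fixed-size window of recent load samples and trigger only when every sample in a full window is above the threshold. The class name, the threshold format and the "either RAM or CPU" rule below are assumptions, not the author's logic.

```python
from collections import deque


class LoadWindow:
    """Fixed-size window of recent utilization samples (0.0 to 1.0)."""

    def __init__(self, size):
        self.samples = deque(maxlen=size)  # old samples drop off automatically

    def add(self, value):
        self.samples.append(value)

    def is_full(self):
        return len(self.samples) == self.samples.maxlen

    def all_above(self, threshold):
        return self.is_full() and all(v > threshold for v in self.samples)

    def all_below(self, threshold):
        return self.is_full() and all(v < threshold for v in self.samples)


def should_scale_up(ram_window, cpu_window, thresholds):
    """Assumed rule: scale up if RAM or CPU stayed high for the whole window."""
    return (ram_window.all_above(thresholds["ram"])
            or cpu_window.all_above(thresholds["cpu"]))
```

Requiring a full window avoids reacting to a single spike right after start-up, and all_below gives the symmetric check for the underutilized state.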
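As an illustration of how a list of class timestamps could drive this behavior, here is a small sketch. It assumes the windows are stored as (start, end) datetime pairs and that nodes should come up some lead time before a class starts; the function name, the window format and the 15-minute lead are all hypothetical.

```python
from datetime import datetime, timedelta


def in_class_window(now, class_windows, lead=timedelta(minutes=15)):
    """True if `now` is inside any window, counting `lead` before its start."""
    return any(start - lead <= now <= end for start, end in class_windows)


# Example with one evening class.
windows = [(datetime(2020, 3, 2, 18, 0), datetime(2020, 3, 2, 22, 0))]
warm_up = in_class_window(datetime(2020, 3, 2, 17, 50), windows)
```

While this returns True, the observer could start every host in the scaling list and skip the scale-down checks until the window ends.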

Conclusion

An autoscaler is a good and convenient solution for cases where cluster load is uneven. You get the cluster configuration you need for peak load and at the same time do not keep that cluster around while it is underloaded, which saves money. On top of that, it all happens automatically without your involvement. The autoscaler itself is nothing more than a set of requests to the cluster manager API and the cloud provider API, written according to a certain logic. What you definitely need to remember is the division of nodes into three types, as described earlier. And you will be happy.

Source: www.habr.com
