Hi! We teach people to work with big data. It is hard to imagine a big data training program without its own cluster that all participants work on together, so our program always has one :) We handle its configuration, tuning, and administration, while the students run MapReduce jobs on it directly and use Spark.
In this post we describe how we solved the problem of uneven cluster load by writing our own autoscaler using the cloud.
The problem
Our cluster is not used in a typical way. Utilization is highly uneven. For example, there are practical classes when all 30 students and the instructor log in to the cluster and start using it. There are also the days before a deadline, when the load grows sharply. The rest of the time the cluster is underloaded.
Solution 1 is to keep a cluster that can withstand peak loads but sits idle the rest of the time.
Solution 2 is to keep a small cluster and add nodes to it manually before classes and during peak loads.
Solution 3 is to keep a small cluster and write an autoscaler that monitors the current cluster load and, using various APIs, adds and removes nodes on its own.
This post is about solution 3. Such an autoscaler depends heavily on external factors rather than internal ones, and providers often do not offer one. We use the Mail.ru Cloud Solutions cloud infrastructure and wrote our autoscaler against the MCS API. Since we teach people to work with data, we decided to show how you can write a similar autoscaler for your own purposes and use it with your cloud.
Prerequisites
First, you need a Hadoop cluster. We, for example, use the HDP distribution.
For nodes to be added and removed quickly, roles must be distributed among the nodes in a particular way.
- Master node. Not much to explain here: this is the main node of the cluster, where, for example, the Spark driver runs if you use interactive mode.
- Data node. This is a node where you store data in HDFS and where computation also takes place.
- Compute node. This is a node where you store nothing in HDFS but where computation takes place.
An important point. Autoscaling is done with nodes of the third type. If you start removing and adding nodes of the second type, the response will be very slow: decommissioning and recommissioning takes hours on a cluster. That is, of course, not what you expect from autoscaling. So we do not touch nodes of the first and second types. They form the minimum viable cluster that exists for the whole duration of the program.
So, our autoscaler is written in Python 3, uses the Ambari API to manage cluster services, and uses the MCS API to start and stop the virtual machines.
Solution architecture
- The autoscaler.py module. It defines three classes: 1) functions for working with Ambari, 2) functions for working with MCS, 3) functions related directly to the autoscaler's logic.
- The observer.py script. In essence it consists of rules: when and at which moments to call the autoscaler functions.
- The config.py configuration file. It contains, for example, the list of nodes allowed for autoscaling and other parameters that affect, for example, how long to wait after a new node has been added. It also holds the class start timestamps, so that the maximum allowed cluster configuration is launched before a class.
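To make the description of config.py more concrete, here is a minimal sketch of what such a file might contain. The names scaling_hosts, scale_up_thresholds, and cooldown_period are taken from the observer.py fragment in this post; the host names, the threshold format and values, scale_down_thresholds, and class_timestamps are hypothetical placeholders, not our actual settings.

```python
# config.py: a hypothetical sketch, not the actual file.
from datetime import datetime

# Compute nodes (no HDFS data) that the autoscaler may start and stop.
scaling_hosts = [
    "compute-1.example.local",
    "compute-2.example.local",
]

# Load fractions above which a node is added (assumed format).
scale_up_thresholds = {"ram": 0.8, "cpu": 0.8}
# Load fractions below which a node is removed (assumed format).
scale_down_thresholds = {"ram": 0.3, "cpu": 0.3}

# Minutes to wait after a scaling action before deciding again.
cooldown_period = 5

# Start times of classes; the full cluster is brought up before each one.
class_timestamps = [
    datetime(2020, 3, 2, 19, 0),
    datetime(2020, 3, 4, 19, 0),
]
```

Keeping these values in a plain Python module means observer.py can simply import config and read them as attributes.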
Let us now look at pieces of code from the first two files.
1. The autoscaler.py module
The Ambari class
This is a code fragment containing the Ambari class:
    import json

    import requests


    class Ambari:
        def __init__(self, ambari_url, cluster_name, headers, auth):
            self.ambari_url = ambari_url
            self.cluster_name = cluster_name
            self.headers = headers
            self.auth = auth

        def stop_all_services(self, hostname):
            url = self.ambari_url + self.cluster_name + '/hosts/' + hostname + '/host_components/'
            url2 = self.ambari_url + self.cluster_name + '/hosts/' + hostname
            # First request: list the components present on the host.
            req0 = requests.get(url2, headers=self.headers, auth=self.auth)
            services = req0.json()['host_components']
            services_list = list(map(lambda x: x['HostRoles']['component_name'], services))
            # Second request: move all of those components to INSTALLED (stopped).
            data = {
                "RequestInfo": {
                    "context": "Stop All Host Components",
                    "operation_level": {
                        "level": "HOST",
                        "cluster_name": self.cluster_name,
                        "host_names": hostname
                    },
                    "query": "HostRoles/component_name.in({0})".format(",".join(services_list))
                },
                "Body": {
                    "HostRoles": {
                        "state": "INSTALLED"
                    }
                }
            }
            req = requests.put(url, data=json.dumps(data), headers=self.headers, auth=self.auth)
            if req.status_code in [200, 201, 202]:
                message = 'Request accepted'
            else:
                message = req.status_code
            return message
Above, for example, you can see the implementation of the stop_all_services function, which stops all services on the given cluster node.
The Ambari class takes the following inputs:
- ambari_url, for example 'http://localhost:8080/api/v1/clusters/';
- cluster_name, the name of your cluster in Ambari;
- headers = {'X-Requested-By': 'ambari'};
- auth, which holds your Ambari login and password: auth = ('login', 'password').
The function itself is nothing more than a couple of calls to Ambari through the REST API. Logically, we first get the list of running services on the node, and then ask for the services in that list, on the given cluster and the given node, to be moved to the INSTALLED state. The functions for starting all services, for putting nodes into Maintenance mode, and so on look similar: they are just a few requests through the API.
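As an illustration of how similar the other Ambari helpers are, the sketch below builds the request that would put a host into Maintenance mode. The function name build_maintenance_request is hypothetical, and the payload assumes the Hosts/maintenance_state field of the Ambari REST API; check it against your Ambari version before relying on it.

```python
import json


def build_maintenance_request(ambari_url, cluster_name, hostname, state="ON"):
    """Return (url, body) for a PUT that switches a host's maintenance state.

    Hypothetical helper: it only builds the request, so the result can be
    inspected or sent with requests.put(url, data=body, headers=..., auth=...).
    """
    url = ambari_url + cluster_name + "/hosts/" + hostname
    data = {
        "RequestInfo": {
            "context": "Turn {0} Maintenance Mode".format(state.capitalize())
        },
        # Assumed field name; "ON" enables and "OFF" disables maintenance.
        "Body": {"Hosts": {"maintenance_state": state}},
    }
    return url, json.dumps(data)


url, body = build_maintenance_request(
    "http://localhost:8080/api/v1/clusters/", "my_cluster", "compute-1"
)
print(url)  # http://localhost:8080/api/v1/clusters/my_cluster/hosts/compute-1
```

Separating the payload construction from the HTTP call makes each helper easy to check without a running Ambari server.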
The Mcs class
This is a code fragment containing the Mcs class:
    import json

    import requests


    class Mcs:
        def __init__(self, id1, id2, password):
            self.id1 = id1  # user ID
            self.id2 = id2  # project ID
            self.password = password
            self.mcs_host = 'https://infra.mail.ru:8774/v2.1'

        def vm_turn_on(self, hostname):
            # Get a fresh token, then resolve the hostname to the VM's ID in MCS.
            self.token = self.get_mcs_token()
            host = self.hostname_to_vmname(hostname)
            vm_id = self.get_vm_id(host)
            mcs_url1 = self.mcs_host + '/servers/' + vm_id + '/action'
            headers = {
                'X-Auth-Token': '{0}'.format(self.token),
                'Content-Type': 'application/json'
            }
            # The start action carries a JSON null as its value.
            data = {'os-start': None}
            mcs = requests.post(mcs_url1, data=json.dumps(data), headers=headers)
            return mcs.status_code
The Mcs class takes the project ID inside the cloud and the user ID, as well as the user's password (in the code, id1 is the user ID and id2 is the project ID). In the vm_turn_on function we want to switch on one of the machines. The logic here is slightly more complex. At the start, three other functions are called: 1) we need to get a token, 2) we need to convert the hostname into the machine's name in MCS, 3) we need to get that machine's ID. After that we simply make a POST request and start the machine.
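The two lookup steps can be sketched as pure functions. Both are assumptions: this post does not show hostname_to_vmname or get_vm_id, so the naming rule (drop the domain suffix) and the helper find_vm_id are hypothetical, and the response shape is that of an OpenStack-style servers list, which the MCS endpoint appears to follow.

```python
def hostname_to_vmname(hostname):
    """Map a cluster hostname to a cloud VM name (assumed rule: strip the domain)."""
    return hostname.split(".")[0]


def find_vm_id(servers_response, vm_name):
    """Pick a VM id out of a parsed servers-list response, or None if absent."""
    for server in servers_response.get("servers", []):
        if server.get("name") == vm_name:
            return server.get("id")
    return None


# Example with a hand-written response, not real MCS output.
sample = {
    "servers": [
        {"id": "abc-123", "name": "compute-1"},
        {"id": "def-456", "name": "compute-2"},
    ]
}
vm_id = find_vm_id(sample, hostname_to_vmname("compute-2.example.local"))
print(vm_id)  # def-456
```

Returning None for an unknown name lets the caller decide whether a missing machine is an error or simply means there is nothing to start.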
This is the function for obtaining the token:
        # Method of the Mcs class.
        def get_mcs_token(self):
            # nocatalog keeps the service catalog out of the response body.
            url = 'https://infra.mail.ru:35357/v3/auth/tokens?nocatalog'
            headers = {'Content-Type': 'application/json'}
            # Password authentication for the user, scoped to the project.
            data = {
                'auth': {
                    'identity': {
                        'methods': ['password'],
                        'password': {
                            'user': {
                                'id': self.id1,
                                'password': self.password
                            }
                        }
                    },
                    'scope': {
                        'project': {
                            'id': self.id2
                        }
                    }
                }
            }
            req = requests.post(url, data=json.dumps(data), headers=headers)
            # The token is returned in a response header, not in the body.
            self.token = req.headers['X-Subject-Token']
            return self.token
The Autoscaler class
This class contains the functions related to the scaling logic itself.
This is a code fragment from this class:
    import logging
    import time
    from collections import deque


    class Autoscaler:
        def __init__(self, ambari, mcs, scaling_hosts, yarn_ram_per_node, yarn_cpu_per_node):
            self.scaling_hosts = scaling_hosts
            self.ambari = ambari
            self.mcs = mcs
            # Queues holding recent RAM and CPU load values.
            self.q_ram = deque()
            self.q_cpu = deque()
            self.num = 0
            self.yarn_ram_per_node = yarn_ram_per_node
            self.yarn_cpu_per_node = yarn_cpu_per_node

        def scale_down(self, hostname):
            flag1 = flag2 = flag3 = flag4 = flag5 = False
            message = None  # stays None unless every step below succeeds
            if hostname in self.scaling_hosts:
                # Step 1: ask Ambari to decommission the YARN NodeManager.
                while True:
                    time.sleep(5)
                    status1 = self.ambari.decommission_nodemanager(hostname)
                    if status1 == 'Request accepted' or status1 == 500:
                        flag1 = True
                        logging.info('Decommission request accepted: {0}'.format(flag1))
                        break
                # Step 2: wait until the NodeManager is actually stopped.
                while True:
                    time.sleep(5)
                    status3 = self.ambari.check_service(hostname, 'NODEMANAGER')
                    if status3 == 'INSTALLED':
                        flag3 = True
                        logging.info('NodeManager decommissioned: {0}'.format(flag3))
                        break
                # Step 3: switch the host to Maintenance mode.
                while True:
                    time.sleep(5)
                    status2 = self.ambari.maintenance_on(hostname)
                    if status2 == 'Request accepted' or status2 == 500:
                        flag2 = True
                        logging.info('Maintenance request accepted: {0}'.format(flag2))
                        break
                # Step 4: once Maintenance is confirmed, stop all services on the host.
                while True:
                    time.sleep(5)
                    status4 = self.ambari.check_maintenance(hostname, 'NODEMANAGER')
                    if status4 == 'ON' or status4 == 'IMPLIED_FROM_HOST':
                        flag4 = True
                        self.ambari.stop_all_services(hostname)
                        logging.info('Maintenance is on: {0}'.format(flag4))
                        logging.info('Stopping services')
                        break
                # Step 5: give the services time to stop, then shut the VM down.
                time.sleep(90)
                status5 = self.mcs.vm_turn_off(hostname)
                while True:
                    time.sleep(5)
                    status5 = self.mcs.get_vm_info(hostname)['server']['status']
                    if status5 == 'SHUTOFF':
                        flag5 = True
                        logging.info('VM is turned off: {0}'.format(flag5))
                        break
                if flag1 and flag2 and flag3 and flag4 and flag5:
                    message = 'Success'
                    logging.info('Scale-down finished')
                    logging.info('Cooldown period has started. Wait for several minutes')
            return message
As input we take instances of the Ambari and Mcs classes, the list of nodes allowed for scaling, and the node configuration parameters: the memory and CPU allocated to a node in YARN. There are also two internal parameters, q_ram and q_cpu, which are queues. We use them to store the values of the current cluster load. If we see that the load has been consistently high over the last 5 minutes, we decide to add one more node to the cluster. The same applies to an underloaded cluster.
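The queue-based decision can be sketched as follows. This is a simplified illustration, not the actual assert_up implementation: it assumes one load sample per minute, a window of five samples, and load expressed as a fraction of capacity; the helper names record_load and sustained are hypothetical.

```python
from collections import deque


def record_load(queue, value, window=5):
    """Append a load sample and keep only the last `window` samples."""
    queue.append(value)
    while len(queue) > window:
        queue.popleft()


def sustained(queue, threshold, above=True, window=5):
    """True when the window is full and every sample is past the threshold."""
    if len(queue) < window:
        return False
    if above:
        return all(v > threshold for v in queue)
    return all(v < threshold for v in queue)


q_ram = deque()
for sample in [0.95, 0.70, 0.91, 0.93, 0.90, 0.94, 0.92]:
    record_load(q_ram, sample)

print(sustained(q_ram, 0.85))               # True: the last five samples exceed 0.85
print(sustained(q_ram, 0.30, above=False))  # False: the cluster is not idle
```

Requiring every sample in the window to cross the threshold means a single spike, like the 0.70 dip or an isolated peak, does not trigger a scaling action.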
The scale_down function above is an example of a function that removes a machine from the cluster and stops it in the cloud. First the YARN NodeManager is decommissioned, then Maintenance mode is turned on, then we stop all services on the machine and shut down the virtual machine in the cloud.
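Each step of scale_down polls in a while True loop with no upper bound, so a stuck step would block forever. One possible refinement, shown here only as a sketch, is a small polling helper with a timeout. The name wait_until and its parameters are hypothetical and are not part of the autoscaler described in this post.

```python
import time


def wait_until(check, expected, interval=5, timeout=600, sleep=time.sleep):
    """Poll check() until it returns one of `expected`; return False on timeout."""
    waited = 0
    while waited < timeout:
        sleep(interval)
        waited += interval
        if check() in expected:
            return True
    return False


# Usage with a stand-in for ambari.check_service(hostname, 'NODEMANAGER').
states = iter(["STARTED", "STARTED", "INSTALLED"])
ok = wait_until(lambda: next(states), {"INSTALLED"}, sleep=lambda s: None)
print(ok)  # True
```

With such a helper, each step can report failure instead of hanging, and the sleep argument makes the loop testable without real delays.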
2. The observer.py script
Sample code from it:
    # Fragment of the observer loop; scaler, cloud, webhook, and config are set up earlier in the script.
    if scaler.assert_up(config.scale_up_thresholds):
        hostname = cloud.get_vm_to_up(config.scaling_hosts)
        if hostname is not None:
            status1 = scaler.scale_up(hostname)
            if status1 == 'Success':
                # Notify the team's Slack channel through the webhook.
                post = {"text": "{0} has been successfully scaled-up".format(hostname)}
                json_data = json.dumps(post)
                req = requests.post(webhook, data=json_data.encode('ascii'), headers={'Content-Type': 'application/json'})
                # Make no further scaling decisions during the cooldown (minutes).
                time.sleep(config.cooldown_period*60)
Here we check whether the conditions for increasing cluster capacity have been met and whether there are any machines in reserve at all, get the hostname of one of them, add it to the cluster, and post a message about it in our team's Slack. After that the cooldown_period starts, during which we neither add nor remove anything from the cluster and simply monitor the load. If the load has stabilized and stays within the corridor of optimal values, we just continue monitoring. If one node was not enough, we add another.
For cases when a class is coming up, we already know for sure that one node will not be enough, so we start all the free nodes at once and keep them active until the end of the class. This is driven by the list of class timestamps.
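The timestamp check might look like the sketch below. It is an assumption about how such a rule could be written: the function name class_starts_soon, the 15-minute lead time, and the example dates are hypothetical.

```python
from datetime import datetime, timedelta


def class_starts_soon(class_timestamps, now, lead_minutes=15):
    """True if any class begins within the next `lead_minutes` minutes."""
    lead = timedelta(minutes=lead_minutes)
    return any(timedelta(0) <= start - now <= lead for start in class_timestamps)


timestamps = [datetime(2020, 3, 2, 19, 0)]
print(class_starts_soon(timestamps, datetime(2020, 3, 2, 18, 50)))  # True
print(class_starts_soon(timestamps, datetime(2020, 3, 2, 17, 0)))   # False
```

When the check returns True, the observer can start every free node from the scaling list rather than adding them one at a time.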
Conclusion
An autoscaler is a good and convenient solution for cases where cluster load is uneven. You get the cluster configuration you need for peak loads and, at the same time, do not keep that cluster running while it is underloaded, which saves money. On top of that, it all happens automatically, without your involvement. The autoscaler itself is nothing more than a set of requests to the cluster manager API and the cloud provider API, arranged according to a certain logic. What you definitely need to remember is the division of nodes into 3 types, as described earlier. Do that, and you will be happy.
Source: www.habr.com