Hello! We teach people to work with big data. It is hard to imagine an educational program on big data without its own cluster that all participants work on together, so our program always has one :) We take care of setting it up, tuning it and administering it, while the students run MapReduce jobs on it directly and use Spark.
In this post we will describe how we solved the problem of uneven cluster load by writing our own autoscaler that works with the cloud.
The problem
Our cluster is not used in a typical way. Utilization is very uneven. For example, there are practical classes when all 30 people and the teacher log in to the cluster and start using it. There are also the days before a deadline, when the load grows sharply. The rest of the time the cluster runs underloaded.
Solution #1 is to keep a cluster that can withstand peak loads but sits idle the rest of the time.
Solution #2 is to keep a small cluster and add nodes to it manually before classes and during peak loads.
Solution #3 is to keep a small cluster and write an autoscaler that monitors the current cluster load and, using various APIs, adds nodes to the cluster and removes them from it.
In this post we will talk about solution #3. Such an autoscaler depends heavily on external factors rather than internal ones, and providers often do not offer it. We use the Mail.ru Cloud Solutions cloud infrastructure and wrote an autoscaler using the MCS API. And since we teach how to work with data, we decided to show how you can write a similar autoscaler for your own purposes and use it with your cloud.
Prerequisites
First, you need a Hadoop cluster. For example, we use the HDP distribution.
For nodes to be added and removed quickly, roles must be distributed among the nodes in a particular way.
- Master node. Not much to explain here: this is the main node of the cluster, where, for example, the Spark driver runs if you use interactive mode.
- Data node. This is a node where you store data on HDFS and where computations also take place.
- Compute node. This is a node where you store nothing on HDFS, but where computations take place.
An important point. Autoscaling will happen through nodes of the third type. If you start removing and adding nodes of the second type, the response will be very slow: decommissioning and recommissioning will take hours on your cluster. That, of course, is not what you expect from autoscaling. So we do not touch nodes of the first and second types. They make up the minimum viable cluster that exists for the whole duration of the program.
So, our autoscaler is written in Python 3 and uses the Ambari API to manage cluster services.
Solution architecture
- Module autoscaler.py. It contains three classes: 1) functions for working with Ambari, 2) functions for working with MCS, 3) functions related directly to the autoscaler logic.
- Script observer.py. Essentially it consists of rules: when and at what moments to call the autoscaler functions.
- Configuration file config.py. It contains, for example, the list of nodes allowed for autoscaling and other parameters that affect, for example, how long to wait after a new node has been added. It also holds the timestamps of class start times, so that the maximum allowed cluster configuration is launched before a class.
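To make the description of config.py more concrete, here is a minimal sketch of what such a file might contain. The names scaling_hosts, cooldown_period and scale_up_thresholds are the ones referenced by the observer.py snippet later in the article. The name class_start_times, the shape of the thresholds dictionary and all values are assumptions made purely for illustration.

```python
# config.py (illustrative sketch; every value here is a placeholder)
from datetime import datetime

# Hosts the autoscaler is allowed to add or remove (compute-only nodes).
scaling_hosts = ["compute-1.example.local", "compute-2.example.local"]

# Minutes to wait after a node has been added or removed before acting again.
cooldown_period = 10

# Load fractions above which the cluster counts as overloaded.
# The real structure of this setting is not shown in the article (assumption).
scale_up_thresholds = {"ram": 0.8, "cpu": 0.8}

# Hypothetical name: start times of classes, so that the full allowed
# configuration can be started before a class begins.
class_start_times = [datetime(2020, 1, 15, 19, 0)]
```

Keeping these values in one module means the observer script can be tuned without touching the scaling logic.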
Now let's look at pieces of code inside the first two files.
1. Module autoscaler.py
Ambari class
This is what the piece of code containing the Ambari class looks like:
import json
import requests


class Ambari:
    def __init__(self, ambari_url, cluster_name, headers, auth):
        self.ambari_url = ambari_url
        self.cluster_name = cluster_name
        self.headers = headers
        self.auth = auth

    def stop_all_services(self, hostname):
        url = self.ambari_url + self.cluster_name + '/hosts/' + hostname + '/host_components/'
        url2 = self.ambari_url + self.cluster_name + '/hosts/' + hostname
        # Get the list of components present on the host
        req0 = requests.get(url2, headers=self.headers, auth=self.auth)
        services = req0.json()['host_components']
        services_list = list(map(lambda x: x['HostRoles']['component_name'], services))
        data = {
            "RequestInfo": {
                "context": "Stop All Host Components",
                "operation_level": {
                    "level": "HOST",
                    "cluster_name": self.cluster_name,
                    "host_names": hostname
                },
                "query": "HostRoles/component_name.in({0})".format(",".join(services_list))
            },
            "Body": {
                "HostRoles": {
                    "state": "INSTALLED"
                }
            }
        }
        # Ask Ambari to move the listed components to the INSTALLED (stopped) state
        req = requests.put(url, data=json.dumps(data), headers=self.headers, auth=self.auth)
        if req.status_code in [200, 201, 202]:
            message = 'Request accepted'
        else:
            message = req.status_code
        return message
Above, as an example, you can see the implementation of the stop_all_services function, which stops all services on the required cluster node.
The Ambari class takes the following as input:
- ambari_url, for example 'http://localhost:8080/api/v1/clusters/',
- cluster_name, the name of your cluster in Ambari,
- headers = {'X-Requested-By': 'ambari'},
- and auth, which holds your Ambari login and password: auth = ('login', 'password').
The function itself is nothing more than a couple of calls to Ambari through the REST API. Logically, we first get the list of services running on the node, and then ask that, on the given cluster and the given node, the services from the list be moved to the INSTALLED state. The functions for starting all services, for switching nodes to the Maintenance state and so on look similar: they are just a few requests through the API.
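As an illustration of one of those "similar" functions, here is a hedged sketch of how a request for switching a host into Maintenance mode might be assembled. The function name build_maintenance_request is hypothetical and does not come from the article. The payload shape (Hosts/maintenance_state on the host resource) is how the Ambari REST API is commonly used for this, but treat it as an assumption and check it against your Ambari version. The sketch only builds the URL and body, so it can be inspected without a running Ambari.

```python
import json


def build_maintenance_request(ambari_url, cluster_name, hostname, state="ON"):
    """Return (url, body) for a PUT that switches a host's maintenance mode.

    Hypothetical helper: the article's maintenance_on() is not shown,
    so this is only a guess at its general shape.
    """
    if state not in ("ON", "OFF"):
        raise ValueError("state must be 'ON' or 'OFF'")
    # Same URL layout as url2 in stop_all_services
    url = ambari_url + cluster_name + "/hosts/" + hostname
    payload = {
        "RequestInfo": {
            "context": "Turn {0} Maintenance Mode for host".format(state.capitalize())
        },
        "Body": {"Hosts": {"maintenance_state": state}},
    }
    return url, json.dumps(payload)
```

The result could then be sent the same way as in stop_all_services, for example requests.put(url, data=body, headers=headers, auth=auth).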
Mcs class
This is what the piece of code containing the Mcs class looks like:
class Mcs:
    def __init__(self, id1, id2, password):
        self.id1 = id1
        self.id2 = id2
        self.password = password
        self.mcs_host = 'https://infra.mail.ru:8774/v2.1'

    def vm_turn_on(self, hostname):
        # 1) get a token, 2) map the hostname to the VM name, 3) look up the VM id
        self.token = self.get_mcs_token()
        host = self.hostname_to_vmname(hostname)
        vm_id = self.get_vm_id(host)
        # Use the local vm_id obtained above (self.vm_id is never set in this code)
        mcs_url1 = self.mcs_host + '/servers/' + vm_id + '/action'
        headers = {
            'X-Auth-Token': '{0}'.format(self.token),
            'Content-Type': 'application/json'
        }
        data = {'os-start' : 'null'}
        # POST the start action for this VM
        mcs = requests.post(mcs_url1, data=json.dumps(data), headers=headers)
        return mcs.status_code
The Mcs class takes as input the project id inside the cloud and the user id, as well as the user's password. In the vm_turn_on function we want to turn on one of the machines. The logic here is a bit more complicated. At the start of the code three other functions are called: 1) we need to get a token, 2) we need to convert the hostname into the machine name in MCS, 3) get the id of that machine. Then we simply make a POST request and start the machine.
This is what the function for obtaining the token looks like:
    def get_mcs_token(self):
        url = 'https://infra.mail.ru:35357/v3/auth/tokens?nocatalog'
        headers = {'Content-Type': 'application/json'}
        data = {
            'auth': {
                'identity': {
                    'methods': ['password'],
                    'password': {
                        'user': {
                            'id': self.id1,
                            'password': self.password
                        }
                    }
                },
                'scope': {
                    'project': {
                        'id': self.id2
                    }
                }
            }
        }
        params = (('nocatalog', ''),)
        req = requests.post(url, data=json.dumps(data), headers=headers, params=params)
        self.token = req.headers['X-Subject-Token']
        return self.token
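The scale_down function shown further on calls a vm_turn_off method that the article does not list. Below is a minimal sketch of how the request for such an action might be put together, assuming the endpoint follows the same '/servers/{id}/action' pattern as vm_turn_on and accepts an 'os-stop' action alongside 'os-start'. The helper name build_vm_action_request is hypothetical. The sketch only builds the request pieces and performs no network calls.

```python
import json


def build_vm_action_request(mcs_host, vm_id, token, action):
    """Return (url, headers, body) for a POST to the server action endpoint.

    Hypothetical helper; 'os-stop' as the counterpart of 'os-start'
    is an assumption about the cloud API.
    """
    if action not in ("os-start", "os-stop"):
        raise ValueError("unsupported action: {0}".format(action))
    url = mcs_host + "/servers/" + vm_id + "/action"
    headers = {"X-Auth-Token": token, "Content-Type": "application/json"}
    # json.dumps turns None into JSON null
    body = json.dumps({action: None})
    return url, headers, body
```

With a token from get_mcs_token and an id from get_vm_id, the three values could be passed to requests.post(url, data=body, headers=headers).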
Autoscaler class
This class contains the functions related to the scaling logic itself.
This is what the piece of code of this class looks like:
# Also needed at the top of autoscaler.py for this class
import logging
import time
from collections import deque


class Autoscaler:
    def __init__(self, ambari, mcs, scaling_hosts, yarn_ram_per_node, yarn_cpu_per_node):
        self.scaling_hosts = scaling_hosts
        self.ambari = ambari
        self.mcs = mcs
        self.q_ram = deque()
        self.q_cpu = deque()
        self.num = 0
        self.yarn_ram_per_node = yarn_ram_per_node
        self.yarn_cpu_per_node = yarn_cpu_per_node

    def scale_down(self, hostname):
        flag1 = flag2 = flag3 = flag4 = flag5 = False
        # Defined up front so the final return cannot fail for a host that is not scalable
        message = None
        if hostname in self.scaling_hosts:
            # Step 1: request decommissioning of the YARN Nodemanager
            while True:
                time.sleep(5)
                status1 = self.ambari.decommission_nodemanager(hostname)
                if status1 == 'Request accepted' or status1 == 500:
                    flag1 = True
                    logging.info('Decomission request accepted: {0}'.format(flag1))
                    break
            # Step 2: wait until the Nodemanager is actually decommissioned
            while True:
                time.sleep(5)
                status3 = self.ambari.check_service(hostname, 'NODEMANAGER')
                if status3 == 'INSTALLED':
                    flag3 = True
                    logging.info('Nodemaneger decommissioned: {0}'.format(flag3))
                    break
            # Step 3: request Maintenance mode
            while True:
                time.sleep(5)
                status2 = self.ambari.maintenance_on(hostname)
                if status2 == 'Request accepted' or status2 == 500:
                    flag2 = True
                    logging.info('Maintenance request accepted: {0}'.format(flag2))
                    break
            # Step 4: once Maintenance is on, stop all services on the host
            while True:
                time.sleep(5)
                status4 = self.ambari.check_maintenance(hostname, 'NODEMANAGER')
                if status4 == 'ON' or status4 == 'IMPLIED_FROM_HOST':
                    flag4 = True
                    self.ambari.stop_all_services(hostname)
                    logging.info('Maintenance is on: {0}'.format(flag4))
                    logging.info('Stopping services')
                    break
            time.sleep(90)
            # Step 5: turn the VM off in the cloud and wait for SHUTOFF
            status5 = self.mcs.vm_turn_off(hostname)
            while True:
                time.sleep(5)
                status5 = self.mcs.get_vm_info(hostname)['server']['status']
                if status5 == 'SHUTOFF':
                    flag5 = True
                    logging.info('VM is turned off: {0}'.format(flag5))
                    break
            if flag1 and flag2 and flag3 and flag4 and flag5:
                message = 'Success'
                logging.info('Scale-down finished')
                logging.info('Cooldown period has started. Wait for several minutes')
        return message
As input we take the Ambari and Mcs classes, the list of nodes allowed for scaling, and the node configuration parameters: the memory and cpu allocated to a node in YARN. There are also 2 internal parameters, q_ram and q_cpu, which are queues. We use them to store the values of the current cluster load. If we see that the load has been consistently high over the last 5 minutes, we decide that +1 node needs to be added to the cluster. The same holds for the state when the cluster is underloaded.
The code above is an example of a function that removes a machine from the cluster and stops it in the cloud. First the YARN Nodemanager is decommissioned, then Maintenance mode is turned on, then we stop all services on the machine and turn off the virtual machine in the cloud.
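The queue-based decision described above (q_ram and q_cpu holding recent load values) can be sketched roughly as follows. This is not the article's code: the LoadWindow class, its method names, the sampling interval and the thresholds are all assumptions chosen for illustration. The idea is a fixed-size window of recent measurements that triggers a decision only when every value in a full window is on the same side of a threshold.

```python
from collections import deque


class LoadWindow:
    """Fixed-size window of recent load samples (hypothetical helper)."""

    def __init__(self, size):
        # deque with maxlen drops the oldest sample automatically
        self.values = deque(maxlen=size)

    def add(self, value):
        self.values.append(value)

    def is_full(self):
        return len(self.values) == self.values.maxlen

    def all_above(self, threshold):
        # True only when the window is full and every sample exceeds the threshold
        return self.is_full() and all(v > threshold for v in self.values)

    def all_below(self, threshold):
        # Mirror check for the underloaded state
        return self.is_full() and all(v < threshold for v in self.values)
```

For example, sampling the load every 30 seconds with LoadWindow(10) would cover roughly the last 5 minutes. One window for RAM and one for CPU would play the role of q_ram and q_cpu.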
2. Script observer.py
Sample code from it:
# Scale up when the thresholds are exceeded and a spare VM is available
if scaler.assert_up(config.scale_up_thresholds) == True:
    hostname = cloud.get_vm_to_up(config.scaling_hosts)
    if hostname is not None:
        status1 = scaler.scale_up(hostname)
        if status1 == 'Success':
            # Report the result to Slack through a webhook
            text = {"text": "{0} has been successfully scaled-up".format(hostname)}
            post = {"text": "{0}".format(text)}
            json_data = json.dumps(post)
            req = requests.post(webhook, data=json_data.encode('ascii'), headers={'Content-Type': 'application/json'})
            # Cooldown: config.cooldown_period is in minutes
            time.sleep(config.cooldown_period*60)
In it we check whether the conditions for increasing the cluster capacity have been met and whether there are any machines in reserve, get the hostname of one of them, add it to the cluster and post a message about it to our team's Slack. After that the cooldown_period starts, during which we neither add nor remove anything from the cluster, and just monitor the load. If it has stabilized and stays within the corridor of optimal load values, we simply continue monitoring. If one node was not enough, we add another one.
For cases when a class is coming up, we already know for sure that one node will not be enough, so we immediately start all the free nodes and keep them active until the end of the class. This is done using the list of class timestamps.
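The timestamp check for upcoming classes might look something like the sketch below. The function name class_is_upcoming, the 30-minute lead time and the 3-hour class duration are assumptions for illustration, since the article does not show this part of observer.py.

```python
from datetime import datetime, timedelta


def class_is_upcoming(now, class_start_times,
                      lead=timedelta(minutes=30),
                      duration=timedelta(hours=3)):
    """Return True if `now` is within [start - lead, start + duration] of any class.

    Hypothetical helper: lead and duration are illustrative defaults.
    """
    return any(start - lead <= now <= start + duration
               for start in class_start_times)
```

While this returns True, the observer could start every host in the scaling list and skip scale-down decisions until the class is over.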
Conclusion
An autoscaler is a good and convenient solution for cases when your cluster load is uneven. You get the cluster configuration you need for peak loads and at the same time do not keep that cluster running while it is underloaded, which saves money. On top of that, it all happens automatically, without your involvement. The autoscaler itself is nothing more than a set of requests to the cluster manager API and the cloud provider API, written according to a certain logic. What you definitely need to remember is the division of nodes into 3 types, as we wrote earlier. And you will be happy.
Source: www.habr.com