Practicing grouping and data visualization skills in Python


Hello!

Today we will practice using the tools for grouping and visualizing data in Python. Using a dataset published on Github, let's analyze several characteristics and build a set of visualizations.

By tradition, let's start by setting the goals:

  • Group the data by sex and year and visualize the overall birth dynamics for both sexes;
  • Find the most popular names of all time;
  • Split the whole time span covered by the data into 10 parts and, for each part, find the most popular name of each sex. For each name found, visualize its dynamics over the whole period;
  • For each year, count how many names cover 50% of the people and visualize the result (this shows how name diversity changes from year to year);
  • Pick 4 years from the whole span and, for each year, show the distribution of names by first letter and by last letter;
  • Make a list of several famous people (presidents, singers, actors, movie characters), assess their influence on the name dynamics, and build a visualization.

Fewer words, more code!

So, let's go.

Let's group the data by sex and year and visualize the overall birth dynamics for both sexes:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Every third year keeps the bar chart readable.
years = np.arange(1880, 2011, 3)
datalist = 'https://raw.githubusercontent.com/wesm/pydata-book/2nd-edition/datasets/babynames/yob{year}.txt'
dataframes = []
for year in years:
    dataset = datalist.format(year=year)
    dataframe = pd.read_csv(dataset, names=['name', 'sex', 'count'])
    dataframes.append(dataframe.assign(year=year))

result = pd.concat(dataframes)
# Total births per year for each sex (sum only the numeric 'count' column).
sex = result.groupby('sex')
births_men_list = sex.get_group('M').groupby('year')['count'].sum().tolist()
births_women_list = sex.get_group('F').groupby('year')['count'].sum().tolist()

fig, ax = plt.subplots()
fig.set_size_inches(25, 15)

index = np.arange(len(years))
stolb1 = ax.bar(index, births_men_list, 0.4, color='c', label='Men')
stolb2 = ax.bar(index + 0.4, births_women_list, 0.4, alpha=0.8, color='r', label='Women')

ax.set_title('Births by sex and year')
ax.set_xlabel('Year')
ax.set_ylabel('Births')
# Set tick positions first, then their labels; 0.2 centers each tick between the two bars.
ax.set_xticks(index + 0.2)
ax.set_xticklabels(years)
ax.legend(loc=9)

fig.tight_layout()
plt.show()
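
To check the grouping step without downloading anything, it can help to run it on a tiny hand-made frame first. The sketch below is only an illustration: the rows are invented, and `toy` and `births_by_sex` are names introduced here, not part of the original code.

```python
import pandas as pd

# Invented rows in the same layout as the babynames files, plus a 'year' column.
toy = pd.DataFrame({
    'name':  ['Mary', 'Anna', 'John', 'Mary', 'John', 'James'],
    'sex':   ['F', 'F', 'M', 'F', 'M', 'M'],
    'count': [10, 5, 8, 12, 9, 4],
    'year':  [1880, 1880, 1880, 1881, 1881, 1881],
})

# Sum births per (year, sex), then move 'sex' into columns: one row per year.
births_by_sex = toy.groupby(['year', 'sex'])['count'].sum().unstack('sex')
print(births_by_sex)
```

With the data in this wide form, `births_by_sex.plot.bar()` would be one possible shortcut for drawing a pair of bars per year.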


Let's find the most popular names in history:

years = np.arange(1880, 2011)

dataframes = []
for year in years:
    dataset = datalist.format(year=year)
    dataframe = pd.read_csv(dataset, names=['name', 'sex', 'count'])
    dataframes.append(dataframe)

result = pd.concat(dataframes)
# Sum only 'count' per name; summing the whole frame would also concatenate the 'sex' strings.
names = result.groupby('name', as_index=False)['count'].sum().sort_values('count', ascending=False)
names.head(10)
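
Grouping by `name` alone merges boys and girls who share a name. The small sketch below, on invented rows, shows the difference between the two groupings; `toy`, `totals`, `top` and `by_sex` are illustrative names only.

```python
import pandas as pd

# Invented rows; 'Leslie' appears for both sexes on purpose.
toy = pd.DataFrame({
    'name':  ['Mary', 'John', 'Mary', 'John', 'Leslie', 'Leslie'],
    'sex':   ['F', 'M', 'F', 'M', 'F', 'M'],
    'count': [10, 8, 12, 9, 3, 4],
})

# Totals per name regardless of sex, then the two most frequent names.
totals = toy.groupby('name')['count'].sum()
top = totals.nlargest(2)

# Totals per (name, sex) keep the two 'Leslie' groups apart.
by_sex = toy.groupby(['name', 'sex'])['count'].sum()

print(top)
print(by_sex)
```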


Let's split the whole time span covered by the data into 10 parts and, for each part, find the most popular name of each sex. For each name found, we visualize its dynamics over the whole period:

years = np.arange(1880, 2011)
# 131 years split into 10 parts of 14 years each (the last part is shorter).
part_size = int((years[-1] - years[0]) / 10) + 1
parts = {}
def GetPart(year):
    return int((year - years[0]) / part_size)
for year in years:
    index = GetPart(year)
    # Inclusive bounds of the part, e.g. 1880-1893, so adjacent labels do not overlap.
    r = years[0] + part_size * index, min(years[-1], years[0] + part_size * (index + 1) - 1)
    parts[index] = str(r[0]) + '-' + str(r[1])

dataframe_parts = []
dataframes = []
for year in years:
    dataset = datalist.format(year=year)
    dataframe = pd.read_csv(dataset, names=['name', 'sex', 'count'])
    dataframe_parts.append(dataframe.assign(years=parts[GetPart(year)]))
    dataframes.append(dataframe.assign(year=year))

result_parts = pd.concat(dataframe_parts)
result = pd.concat(dataframes)

result_parts_sums = result_parts.groupby(['years', 'sex', 'name'], as_index=False).sum()
# Row with the largest count within each (part, sex) group.
result_parts_names = result_parts_sums.loc[result_parts_sums.groupby(['years', 'sex'])['count'].idxmax()]
result_sums = result.groupby(['year', 'sex', 'name'], as_index=False).sum()
# Group once, outside the loop, instead of regrouping on every iteration.
result_by_name = result_sums.groupby(['name', 'sex'])

for groupName in result_parts_names.groupby(['name', 'sex']).groups:
    group = result_by_name.get_group(groupName)
    fig, ax = plt.subplots(1, 1, figsize=(18, 10))

    ax.set_xlabel('Year')
    ax.set_ylabel('Births')
    ax.plot(group['year'], group['count'], label=groupName[0], color='b', ls='-')
    ax.legend(loc=9, fontsize=11)

    plt.show()
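
The two ideas in this step, splitting the years into parts and picking the top row of each group, can be tried separately on small data. The following is a rough sketch under stated assumptions: the period labels use inclusive bounds, the toy rows are invented, and `n_parts`, `labels`, `toy` and `best` are names introduced here for illustration.

```python
import numpy as np
import pandas as pd

years = np.arange(1880, 2011)
n_parts = 10
part_size = -(-len(years) // n_parts)   # ceiling division: 131 years -> 14 per part

# Part number of every year, then the inclusive bounds of that part.
part_index = (years - years[0]) // part_size
starts = years[0] + part_index * part_size
ends = np.minimum(starts + part_size - 1, years[-1])
labels = pd.Series([f'{s}-{e}' for s, e in zip(starts, ends)], index=years)

# Invented, already aggregated rows: one count per (period, sex, name).
toy = pd.DataFrame({
    'years': ['1880-1893'] * 3 + ['1894-1907'] * 3,
    'sex':   ['F', 'F', 'M', 'F', 'M', 'M'],
    'name':  ['Mary', 'Anna', 'John', 'Mary', 'John', 'James'],
    'count': [10, 5, 8, 12, 9, 4],
})

# idxmax returns the row label of the largest count in each (period, sex) group.
best = toy.loc[toy.groupby(['years', 'sex'])['count'].idxmax()]
print(best)
```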


For each year, we count how many names cover 50% of the people and visualize the result:

rows = []
years = np.arange(1880, 2011)
for year in years:
    dataset = datalist.format(year=year)
    csv = pd.read_csv(dataset, names=['name', 'sex', 'count'])
    names = csv.groupby('name', as_index=False)['count'].sum()
    names['percent'] = names['count'] / names['count'].sum() * 100
    names = names.sort_values(['percent'], ascending=False)
    names['cum_perc'] = names['percent'].cumsum()
    # Names whose cumulative share stays within 50% of all births that year.
    names_filtered = names[names['cum_perc'] <= 50]
    rows.append({'year': year, 'count': names_filtered.shape[0]})

# DataFrame.append was removed in pandas 2.0, so the rows are collected in a list first.
dataframe = pd.DataFrame(rows)

fig, ax1 = plt.subplots(1, 1, figsize=(22, 13))
ax1.set_xlabel('Year', fontsize=12)
ax1.set_ylabel('Name diversity', fontsize=12)
ax1.plot(dataframe['year'], dataframe['count'], color='r', ls='-', label='Names covering 50% of births')
ax1.legend(loc=9, fontsize=12)

plt.show()
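
The filter `cum_perc <= 50` counts the names whose cumulative share stays within 50%. A slightly different reading of the goal, "the smallest number of names that together reach 50%", can give a result that is larger by one. The sketch below illustrates that variant on invented counts; `names_to_cover` is a hypothetical helper, not something from the original code.

```python
import numpy as np

def names_to_cover(counts, share=0.5):
    """Smallest number of top names whose births add up to at least `share` of the total."""
    ordered = np.sort(np.asarray(counts, dtype=float))[::-1]   # most popular first
    cumulative = np.cumsum(ordered) / ordered.sum()
    # Position of the first cumulative share >= `share`, converted to a count of names.
    return int(np.searchsorted(cumulative, share, side='left')) + 1

print(names_to_cover([40, 30, 20, 10]))      # cumulative shares: 0.4, 0.7, 0.9, 1.0
print(names_to_cover([50, 30, 10, 5, 5]))    # the first name alone already reaches 0.5
```

For the first list the `<= 50` filter would keep only one name, while two names are needed to actually reach half of the births. The overall trend in the chart is the same either way.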


Let's pick 4 years from the whole span and, for each year, show the distribution of names by first letter and by last letter:

from string import ascii_uppercase

fig_first, ax_first = plt.subplots(1, 1, figsize=(14, 10))
fig_last, ax_last = plt.subplots(1, 1, figsize=(14, 10))

index = np.arange(len(ascii_uppercase))
years = [1944, 1978, 1991, 2003]
colors = ['r', 'g', 'b', 'y']
bar_width = 0.3  # was undefined; same value as the literal 0.3 used for the first chart
for n, year in enumerate(years):
    dataset = datalist.format(year=year)
    csv = pd.read_csv(dataset, names=['name', 'sex', 'count'])
    names = csv.groupby('name', as_index=False)['count'].sum()
    count = names.shape[0]

    # Share of distinct names (not births) that start or end with each letter.
    rows = []
    for letter in ascii_uppercase:
        countFirst = names.name.str.startswith(letter).sum()
        countLast = names.name.str.endswith(letter.lower()).sum()
        rows.append({
            'letter': letter,
            'frequency_first': countFirst / count * 100,
            'frequency_last': countLast / count * 100})
    # DataFrame.append was removed in pandas 2.0, so build the frame from a list of rows.
    dataframe = pd.DataFrame(rows)

    ax_first.bar(index + bar_width * n, dataframe['frequency_first'], bar_width, alpha=0.5, color=colors[n], label=year)
    ax_last.bar(index + bar_width * n, dataframe['frequency_last'], bar_width, alpha=0.5, color=colors[n], label=year)

ax_first.set_xlabel('Letter of the alphabet')
ax_first.set_ylabel('Frequency, %')
ax_first.set_title('First letter of the name')
ax_first.set_xticks(index)
ax_first.set_xticklabels(list(ascii_uppercase))
ax_first.legend()

ax_last.set_xlabel('Letter of the alphabet')
ax_last.set_ylabel('Frequency, %')
ax_last.set_title('Last letter of the name')
ax_last.set_xticks(index)
ax_last.set_xticklabels(list(ascii_uppercase))
ax_last.legend()

fig_first.tight_layout()
fig_last.tight_layout()

plt.show()
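
The per-letter loop can also be written with the pandas string accessor and `value_counts`. This is a compact alternative sketch rather than the author's approach; the names in the series are invented, and `first` and `last` are illustrative variables.

```python
from string import ascii_uppercase

import pandas as pd

# Invented list of distinct names for one year.
names = pd.Series(['Mary', 'Anna', 'John', 'James', 'Amy', 'Joan'])

letters = list(ascii_uppercase)

# Percentage of names per first letter; letters that never occur get 0.
first = (names.str[0].value_counts(normalize=True) * 100).reindex(letters, fill_value=0.0)

# Same for the last letter, upper-cased so both series share one index.
last = (names.str[-1].str.upper().value_counts(normalize=True) * 100).reindex(letters, fill_value=0.0)

print(first[first > 0])
print(last[last > 0])
```

Reindexing over the full alphabet keeps all 26 positions, so the result lines up with the x axis of the bar charts.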


Let's make a list of several famous people (presidents, singers, actors, movie characters) and assess their influence on the name dynamics:

celebrities = {'Frank': 'M', 'Britney': 'F', 'Madonna': 'F', 'Bob': 'M'}
# Reload the full range: `years` still holds the four years from the previous example.
years = np.arange(1880, 2011)
dataframes = []
for year in years:
    dataset = datalist.format(year=year)
    dataframe = pd.read_csv(dataset, names=['name', 'sex', 'count'])
    dataframes.append(dataframe.assign(year=year))

result = pd.concat(dataframes)

for celebrity, sex in celebrities.items():
    # Keep only the rows for this name and the matching sex.
    names = result[result.name == celebrity]
    dataframe = names[names.sex == sex]
    fig, ax = plt.subplots(1, 1, figsize=(16, 8))

    ax.set_xlabel('Year', fontsize=10)
    ax.set_ylabel('Births', fontsize=10)
    ax.plot(dataframe['year'], dataframe['count'], label=celebrity, color='r', ls='-')
    ax.legend(loc=9, fontsize=12)

    plt.show()


For practice, you can add the celebrity's lifetime to the visualization from the last example to assess their influence on the name dynamics more clearly.
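
One possible way to approach this exercise is to shade a period on the chart with matplotlib's `axvspan`. The sketch below is only an assumption about how it could look: the yearly counts are invented, and `life_start` and `life_end` are placeholder years, not real biographical data.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend so the sketch also runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Invented yearly counts for one name; in the article this would be the filtered `dataframe`.
series = pd.DataFrame({'year': [1940, 1950, 1960, 1970, 1980],
                       'count': [120, 180, 260, 210, 150]})

# Placeholder years, NOT real dates: replace them with the celebrity's actual lifetime.
life_start, life_end = 1950, 1975

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(series['year'], series['count'], color='r', label='Example name')
# Shade the vertical band between the two years behind the line.
span = ax.axvspan(life_start, life_end, color='gray', alpha=0.2, label='Lifetime (placeholder)')
ax.set_xlabel('Year')
ax.set_ylabel('Births')
ax.legend(loc=9)
fig.tight_layout()
```

In an interactive session, dropping the `matplotlib.use('Agg')` line and calling `plt.show()` displays the figure; `fig.savefig(...)` writes it to a file instead.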

With this, all our goals have been achieved. We have practiced using the tools for grouping and visualizing data in Python, and we will keep working with data. Everyone can draw their own conclusions from the finished visualizations.

Knowledge to all!

Source: www.habr.com
