Habrastatistics: how Habr lives without geektimes

Hello, Habr.

This article is a logical continuation of the rating The best Habr articles of 2018. The year is not over yet, but as you know, the rules changed in the summer, so it was interesting to see whether that affected anything.


In addition to the statistics themselves, there will be an updated rating of articles, as well as some source code for those who are interested in how it works.

For those curious about what came of it, the continuation is under the cut. Those interested in a more detailed analysis of the site's sections can also look at the next part.

Raw data

This rating is unofficial, and I have no insider information. As you can easily see from your browser's address bar, all articles on Habr are numbered sequentially. From there it is a matter of technique: we simply read all the articles one after another in a loop (in a single thread and with pauses, so as not to load the server). The values themselves were extracted with a simple parser in Python (the sources are available here) and saved to a csv file that looks something like this:

2019-08-11T22:36Z,https://habr.com/ru/post/463197/,"Blazor + MVVM = Silverlight наносит ответный удар, потому что древнее зло непобедимо",votes:11,votesplus:17,votesmin:6,bookmarks:40,views:5300,comments:73
2019-08-11T05:26Z,https://habr.com/ru/news/t/463199/,"В NASA испытали систему автономного управления одного микроспутника другим",votes:15,votesplus:15,votesmin:0,bookmarks:2,views:1700,comments:7
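
The collection loop and the file format described above can be sketched roughly as follows. This is only an illustration, not the parser from the linked sources: the names crawl, fetch_page and parse_row are hypothetical, the URL pattern is taken from the first sample row, and the field layout is assumed to match the two rows shown.

```python
import csv
import io
import time
import urllib.error
import urllib.request

# Assumption: URL pattern copied from the first sample row above
BASE_URL = "https://habr.com/ru/post/{}/"


def fetch_page(article_id, timeout=10.0):
    """Return the HTML of one article, or None if it cannot be loaded."""
    url = BASE_URL.format(article_id)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except (urllib.error.URLError, OSError):
        # Deleted, hidden or otherwise unreachable articles are skipped
        return None


def crawl(first_id, last_id, fetch=fetch_page, delay=1.0):
    """Read ids one by one, in a single thread, pausing between requests."""
    pages = {}
    for article_id in range(first_id, last_id + 1):
        html = fetch(article_id)
        if html is not None:
            pages[article_id] = html
        time.sleep(delay)  # be polite to the server
    return pages


def parse_row(line):
    """Turn one saved csv line into a dict with integer metrics."""
    # csv.reader is used because titles may contain commas inside quotes
    fields = next(csv.reader(io.StringIO(line)))
    record = {"datetime": fields[0], "url": fields[1], "title": fields[2]}
    for item in fields[3:]:
        key, _, value = item.partition(":")  # "views:5300" -> ("views", "5300")
        record[key] = int(value)
    return record


SAMPLE = ('2019-08-11T22:36Z,https://habr.com/ru/post/463197/,'
          '"Blazor + MVVM = Silverlight наносит ответный удар, '
          'потому что древнее зло непобедимо",'
          'votes:11,votesplus:17,votesmin:6,bookmarks:40,views:5300,comments:73')

row = parse_row(SAMPLE)
print(row["views"], row["bookmarks"], row["comments"])
```

Passing fetch as a parameter makes it easy to try the loop with a fake function before pointing it at the real site.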

Processing

For the analysis we will use Python, Pandas and Matplotlib. Those who are not interested in statistics can skip this part and go straight to the articles.

First you need to load the dataset into memory and select the data for the desired year.

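import pandas as pd
import datetime
import matplotlib.dates as mdates
from matplotlib.ticker import FormatStrFormatter
from pandas.plotting import register_matplotlib_converters

# Let matplotlib understand pandas date values on plot axes
register_matplotlib_converters()

# on_bad_lines='error' replaces error_bad_lines=True, which was removed in pandas 2.0
df = pd.read_csv("habr.csv", sep=',', encoding='utf-8', on_bad_lines='error', quotechar='"', comment='#')
# Parse timestamps such as "2019-08-11T22:36Z"
dates = pd.to_datetime(df['datetime'], format='%Y-%m-%dT%H:%MZ')
df['datetime'] = dates
# Keep only the rows that belong to the chosen year
year = 2019
df = df[(df['datetime'] >= pd.Timestamp(datetime.date(year, 1, 1))) & (df['datetime'] < pd.Timestamp(datetime.date(year+1, 1, 1)))]

print(df.shape)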

It turns out that this year (although it is not over yet), 12715 articles had been published at the time of writing. For comparison, the whole of 2018 saw 15904. That is a lot: about 43 articles per day (and these are only the ones with a positive rating; how many articles in total went into the negative or were deleted, one can only guess or roughly estimate from the gaps between identifiers).

Let's select the required fields from the dataset. As metrics we will use the number of views, comments, rating values and the number of bookmarks.
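
As a quick check of the arithmetic, here is a small sketch. The 2018 total comes from the text; the helper missing_share and the id numbers passed to it are made up purely to illustrate the "gaps between identifiers" idea and are not real Habr data.

```python
# Figure from the text: articles published in the whole of 2018
articles_2018 = 15904
per_day_2018 = articles_2018 / 365
print(f"{per_day_2018:.1f} articles per day in 2018")


def missing_share(first_id, last_id, found):
    """Share of ids in [first_id, last_id] for which no article was saved.

    Hypothetical helper: it assumes every id in the range once belonged to
    some publication, which may not hold if ids are shared with other
    kinds of content.
    """
    total_ids = last_id - first_id + 1
    return (total_ids - found) / total_ids


# Toy numbers only, not measured values
share = missing_share(first_id=100, last_id=199, found=60)
print(f"{share:.0%} of ids in the toy range have no saved article")
```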

def to_float(s):
    # "votes:-3" => -3.0; the minus sign is kept so a negative value survives
    num = ''.join(ch for ch in s if ch.isdigit() or ch == '-')
    return float(num)

def to_int(s):
    # "bookmarks:22" => 22
    num = ''.join(ch for ch in s if ch.isdigit())
    return int(num)

def to_date(dt):
    # Drop the time of day, keep only the calendar date
    return dt.date()

# Convert the "name:value" strings into numbers
date = dates.map(to_date, na_action=None)
views = df["views"].map(to_int, na_action=None)
bookmarks = df["bookmarks"].map(to_int, na_action=None)
votes = df["votes"].map(to_float, na_action=None)
votes_up = df["up"].map(to_float, na_action=None)
votes_down = df["down"].map(to_float, na_action=None)
comments = df["comments"].map(to_int, na_action=None)

# Write the numeric columns back (assignment aligns on the row index)
df['date'] = date
df['views'] = views
df['votes'] = votes
df['bookmarks'] = bookmarks
df['up'] = votes_up
df['down'] = votes_down
df['comments'] = comments

Now the data has been added to the dataset and we can use it. Let's group the data by day and take averaged values (the code below uses the daily median).

g = df.groupby(['date'])
# Number of articles published on each day
days_count = g.size().reset_index(name='counts')
year_days = days_count['date'].values
# numeric_only=True skips text columns (title, url), which pandas 2.0 would otherwise reject
grouped = g.median(numeric_only=True).reset_index()
grouped['counts'] = days_count['counts']
# Daily values plus a 20-day rolling mean for each metric
counts_per_day = grouped['counts'].values
counts_per_day_avg = grouped['counts'].rolling(window=20).mean()
view_per_day = grouped['views'].values
view_per_day_avg = grouped['views'].rolling(window=20).mean()
votes_per_day = grouped['votes'].values
votes_per_day_avg = grouped['votes'].rolling(window=20).mean()
bookmarks_per_day = grouped['bookmarks'].values
bookmarks_per_day_avg = grouped['bookmarks'].rolling(window=20).mean()
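
To make the smoothing step less of a black box, here is a tiny sketch of what rolling(window=...).mean() does, using made-up daily counts and a window of 3 instead of 20. The min_periods variant is not used in the article; it is shown only as an option.

```python
import pandas as pd

# Toy daily article counts, not real data
counts = pd.Series([40, 50, 30, 60, 45])

# Each value is the mean of the current day and the two days before it;
# the first window - 1 entries have no full window and become NaN
smoothed = counts.rolling(window=3).mean()
print(smoothed.tolist())

# Optional: min_periods=1 averages whatever is available at the start,
# so the smoothed line begins on the first day instead of after a gap
smoothed_full = counts.rolling(window=3, min_periods=1).mean()
print(smoothed_full.tolist())
```

With window=20, as in the code above, the averaged line on the graphs therefore only starts on the 20th day of the year.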

Now for the interesting part: we can look at the graphs.

Let's look at the number of publications on Habr in 2019.

import matplotlib.pyplot as plt

plt.rcParams["figure.figsize"] = (16, 8)
fig, ax = plt.subplots()

# Bars: articles per day; green line: 20-day rolling mean
plt.bar(year_days, counts_per_day, label='Articles/day')
plt.plot(year_days, counts_per_day_avg, 'g-', label='Articles avg/day')
plt.xticks(rotation=45)
# One tick per month, labelled as day-month-year
ax.xaxis.set_major_formatter(mdates.DateFormatter("%d-%m-%Y"))
ax.xaxis.set_major_locator(mdates.MonthLocator(interval=1))
plt.legend(loc='best')
plt.tight_layout()
plt.show()

The result is interesting. As you can see, Habr has been somewhat "jittery" throughout the year. I don't know the reason.

[Figure: number of articles per day on Habr, 2019]

For comparison, 2018 looks a little smoother:

[Figure: number of articles per day on Habr, 2018]

Overall, I did not see any drastic drop in the number of published articles in 2019 on the graph. On the contrary, it even seems to have grown slightly since the summer.

But the next two graphs depress me a little.

Median number of views per article:

[Figure: median number of views per article]

Median rating per article:

[Figure: median rating per article]

As you can see, the median number of views declines slightly over the year. This can be explained by the fact that new articles have not yet been indexed by search engines and are found less often. But the decline in the median rating per article is harder to explain. It feels as if readers either do not have time to look through so many articles or do not pay attention to ratings. From the point of view of the author reward program, this trend is quite unpleasant.

By the way, this did not happen in 2018, and the graph is more or less flat.

[Figure: the corresponding graph for 2018]

In general, the owners of the resource have something to think about.

But let's not dwell on sad things. Overall, we can say that Habr "survived" the summer changes successfully, and the number of articles on the site did not decrease.

Rating

Now, the rating itself. Congratulations to those who made it in. Let me remind you once again that the rating is unofficial, I may have missed something, and if some article definitely should be here but is not, write to me and I will add it manually. As the rating, I use calculated metrics which, I think, turned out to be quite interesting.
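
For those who want to reproduce such selections, here is a minimal sketch of how ratio-based tops could be computed with pandas. It is an illustration on a toy table, not the exact code behind the lists: the helper top_by is hypothetical, and the "controversy" score in particular (the smaller of the upvote and downvote counts) is my assumption, since the text does not define it.

```python
import pandas as pd


def top_by(df, score, n=3):
    """Return the n rows with the largest score, with the score attached."""
    return df.assign(score=score).nlargest(n, "score")[["title", "score"]]


# Toy data with the same column names as the dataset above
toy = pd.DataFrame({
    "title": ["A", "B", "C", "D"],
    "views": [10000, 2000, 500, 4000],
    "votes": [50.0, 40.0, 5.0, 10.0],
    "up": [60.0, 40.0, 10.0, 30.0],
    "down": [10.0, 0.0, 5.0, 20.0],
    "bookmarks": [100, 80, 5, 20],
    "comments": [30, 10, 50, 40],
})

by_views = top_by(toy, toy["views"])
by_votes_ratio = top_by(toy, toy["votes"] / toy["views"])
by_comments_ratio = top_by(toy, toy["comments"] / toy["views"])
by_bookmarks_ratio = top_by(toy, toy["bookmarks"] / toy["views"])
# Assumed definition: an article is controversial when both the upvote
# and the downvote counts are large
controversial = top_by(toy, toy[["up", "down"]].min(axis=1))

print(by_bookmarks_ratio)
```

Dividing by views keeps old, heavily indexed articles from dominating every list simply because more people have seen them.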

Top articles by number of views

Top articles by rating-to-views ratio

Top articles by comments-to-views ratio

Top most controversial articles

Top articles by rating

Top articles by number of bookmarks

Top by bookmarks-to-views ratio

Top articles by number of comments

And finally, the last one: Antitop by number of dislikes

Phew. I have a few more interesting selections, but I won't bore the readers.

Conclusion

While building the rating, I noticed two points that seemed interesting.

First, 60% of the top is still made up of articles in the "geektimes" genre. Whether there will be fewer of them next year, and what Habr will look like without articles about beer, space, medicine and so on, I don't know. Readers will definitely lose something. We'll see.

Second, the top by bookmarks turned out to be of unexpectedly high quality. This is psychologically understandable: readers may not pay attention to the rating, but if an article is needed, it gets added to bookmarks. And this is exactly where the highest concentration of useful and serious articles is. I think the site owners should somehow think about linking the number of bookmarks to the reward program if they want to grow this category of articles here on Habr.

Something like that. I hope it was informative.

The list of articles turned out long, but maybe that is for the best. Happy reading, everyone.

Source: www.habr.com
