Habrastatistics: how Habr lives without Geektimes

Hello, Habr.

This article is a logical continuation of the ranking of the best Habr articles of 2018. The year is not over yet, but, as you know, the rules changed over the summer, so it was interesting to see whether that affected anything.


Besides the statistics themselves, an updated article ranking is given, along with some source code for those interested in how it works.

For those interested in what came out of it, the continuation is under the cut. Those who want a more detailed analysis of the site's sections can also look at the next part.

Source data

This ranking is unofficial, and I have no insider information. As you can easily see from your browser's address bar, every article on Habr has a sequential number. The rest is a matter of technique: we simply read all the articles one after another in a loop (in a single thread and with pauses, so as not to overload the server). The values themselves are extracted by a simple parser written in Python (the sources are available here) and saved to a csv file that looks something like this:

2019-08-11T22:36Z,https://habr.com/ru/post/463197/,"Blazor + MVVM = Silverlight наносит ответный удар, потому что древнее зло непобедимо",votes:11,votesplus:17,votesmin:6,bookmarks:40,views:5300,comments:73
2019-08-11T05:26Z,https://habr.com/ru/news/t/463199/,"В NASA испытали систему автономного управления одного микроспутника другим",votes:15,votesplus:15,votesmin:0,bookmarks:2,views:1700,comments:7
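The sequential read described above can be sketched roughly as follows. This is only an illustration, not the author's parser: the names `article_url`, `crawl`, `fetch` and `pause_s` are hypothetical, the URL pattern is copied from the first sample row (news items use a different path), and the fetcher is passed in as a function so the loop can be tried without touching the network.

```python
import time
from typing import Callable, Iterable, Iterator, Optional, Tuple


def article_url(article_id: int) -> str:
    # Pattern copied from the first sample row; an assumption for other ids.
    return f"https://habr.com/ru/post/{article_id}/"


def crawl(ids: Iterable[int],
          fetch: Callable[[str], Optional[str]],
          pause_s: float = 1.0) -> Iterator[Tuple[int, str]]:
    """Visit article ids one by one, in a single thread, pausing between requests."""
    for article_id in ids:
        page = fetch(article_url(article_id))
        if page is not None:  # None stands for a missing, hidden or deleted article
            yield article_id, page
        time.sleep(pause_s)  # pause so the server is not overloaded


# Usage with a stand-in fetcher (a dict lookup instead of an HTTP request):
fake_site = {article_url(463197): "<html>Blazor + MVVM</html>"}
pages = list(crawl(range(463196, 463199), fake_site.get, pause_s=0))
print(pages)
```

In a real run `fetch` would wrap an HTTP client and return `None` for pages that do not exist; the loop itself stays the same.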

Processing

For the processing we will use Python, Pandas and Matplotlib. Those not interested in the statistics can skip this part and go straight to the articles.

First we need to load the data into memory and select the rows for the required year.

import pandas as pd
import datetime
import matplotlib.dates as mdates
from matplotlib.ticker import FormatStrFormatter
from pandas.plotting import register_matplotlib_converters


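# error_bad_lines was removed in pandas 2.0; on_bad_lines='error' (pandas >= 1.3)
# keeps the same behaviour of failing on malformed rows.
# Assumes habr.csv starts with a header row naming the columns used below.
df = pd.read_csv("habr.csv", sep=',', encoding='utf-8', on_bad_lines='error', quotechar='"', comment='#')
dates = pd.to_datetime(df['datetime'], format='%Y-%m-%dT%H:%MZ')
df['datetime'] = dates
year = 2019
# .copy() makes the filtered frame independent, so adding columns later is safe
df = df[(df['datetime'] >= pd.Timestamp(datetime.date(year, 1, 1))) & (df['datetime'] < pd.Timestamp(datetime.date(year+1, 1, 1)))].copy()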

print(df.shape)

It turns out that this year (although it is not over yet) 12,715 articles had been published at the time of writing. For comparison, the whole of 2018 had 15,904. In general that is a lot: about 43 articles per day (and these are only the articles with a positive rating; how many articles in total were submitted and then went negative or were deleted, one can only guess or roughly estimate from the gaps between identifiers).
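As a quick check of the arithmetic, the figure of about 43 articles per day matches the 2018 total divided by 365. The second half of the sketch shows one possible way to estimate the share of missing articles from gaps between identifiers; the function `missing_share` and the identifiers passed to it are hypothetical and serve only as an illustration.

```python
articles_2018 = 15904  # full year, from the text
articles_2019 = 12715  # year to date at the time of writing, from the text

per_day_2018 = articles_2018 / 365
print(f"2018: {per_day_2018:.1f} articles per day")


def missing_share(ids):
    """Share of identifiers inside [min(ids), max(ids)] that were not collected."""
    ids = sorted(set(ids))
    span = ids[-1] - ids[0] + 1
    return 1 - len(ids) / span


# Made-up identifiers: 4 collected out of a span of 8 consecutive numbers.
print(missing_share([100, 101, 104, 107]))
```

Note that a gap does not have to mean a deleted article: identifiers may also belong to drafts or to other content types, so such an estimate is an upper bound at best.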

Let's select the required fields from the dataset. As metrics we will use the number of views, the number of comments, the rating values and the number of bookmarks.

def to_int(s):
    # "bookmarks:22" => 22
    # Only digits are kept, so any sign or separator in the value is dropped.
    num = ''.join(i for i in s if i.isdigit())
    return int(num)

def to_float(s):
    # "bookmarks:22" => 22.0
    return float(to_int(s))

def to_date(dt):
    return dt.date() 

# Strip the "name:" prefixes and convert every metric to a number
date = df['datetime'].map(to_date, na_action=None)
views = df["views"].map(to_int, na_action=None)
bookmarks = df["bookmarks"].map(to_int, na_action=None)
votes = df["votes"].map(to_float, na_action=None)
votes_up = df["up"].map(to_float, na_action=None)
votes_down = df["down"].map(to_float, na_action=None)
comments = df["comments"].map(to_int, na_action=None)

# Write the converted columns back into the data frame
df['date'] = date
df['views'] = views
df['votes'] = votes
df['bookmarks'] = bookmarks
df['up'] = votes_up
df['down'] = votes_down
df['comments'] = comments

Now the data has been added to the dataset and we can use it. Let's group the data by day and take the median of each metric.

g = df.groupby(['date'])
days_count = g.size().reset_index(name='counts')
year_days = days_count['date'].values
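# numeric_only=True skips text columns such as the title, which pandas >= 2.0 refuses to average
grouped = g.median(numeric_only=True).reset_index()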
grouped['counts'] = days_count['counts']
counts_per_day = grouped['counts'].values
counts_per_day_avg = grouped['counts'].rolling(window=20).mean()
view_per_day = grouped['views'].values
view_per_day_avg = grouped['views'].rolling(window=20).mean()
votes_per_day = grouped['votes'].values
votes_per_day_avg = grouped['votes'].rolling(window=20).mean()
bookmarks_per_day = grouped['bookmarks'].values
bookmarks_per_day_avg = grouped['bookmarks'].rolling(window=20).mean()
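The smoothing used above is a plain rolling mean. A minimal sketch on made-up numbers shows what it does and why the smoothed line starts later than the raw data: until the window is full, the result is NaN. The series and the window of 3 are assumptions chosen to keep the example short; the article itself uses a window of 20 days.

```python
import pandas as pd

# Made-up daily counts, for illustration only.
counts = pd.Series([40, 50, 30, 60, 20, 70])

# Mean over the current value and the two before it.
smoothed = counts.rolling(window=3).mean()
print(smoothed.tolist())

# min_periods=1 averages over whatever is available, so no leading NaN values.
smoothed_from_start = counts.rolling(window=3, min_periods=1).mean()
print(smoothed_from_start.tolist())
```

With `window=20` the first 19 days of the year therefore have no smoothed value unless `min_periods` is lowered.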

Now for the interesting part: we can look at the graphs.

Let's look at the number of publications on Habr in 2019.

import matplotlib.pyplot as plt

plt.rcParams["figure.figsize"] = (16, 8)
fig, ax = plt.subplots()

plt.bar(year_days, counts_per_day, label='Articles/day')
plt.plot(year_days, counts_per_day_avg, 'g-', label='Articles avg/day')
plt.xticks(rotation=45)
ax.xaxis.set_major_formatter(mdates.DateFormatter("%d-%m-%Y"))  
ax.xaxis.set_major_locator(mdates.MonthLocator(interval=1))
plt.legend(loc='best')
plt.tight_layout()
plt.show()

The result is interesting. As you can see, the numbers on Habr have been jumping around a bit all year. I don't know the reason.

[Figure: articles published per day, 2019]

For comparison, 2018 looks a bit smoother:

[Figure: articles published per day, 2018]

Overall, I did not see any drastic drop in the number of published articles in 2019 on the graph. On the contrary, the number even seems to have grown slightly since the summer.

But the next two graphs depress me a bit more.

Average number of views per article:

[Figure: average views per article by day, 2019]

Average rating per article:

[Figure: average rating per article by day, 2019]

As you can see, the average number of views gradually decreases over the year. This can be explained by the fact that new articles have not yet been indexed by search engines and are found less often. The decline in the average rating per article is harder to explain. The impression is that readers either do not have time to look through so many articles or do not pay attention to ratings. From the standpoint of the author reward program, this trend is rather unpleasant.

By the way, this did not happen in 2018, where the graph is more or less flat.

[Figure: the same graph for 2018]

In general, the owners of the resource have something to think about.

But let's not dwell on sad things. Overall, we can say that Habr "survived" the summer changes quite successfully, and the number of articles on the site did not decrease.

Ranking

Now, the ranking itself. Congratulations to those who made it in. Let me remind you once again that the ranking is unofficial; I may have missed something, and if some article definitely should be here but is not, write to me and I will add it by hand. As ranking criteria I use computed metrics that I find quite interesting.
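The computed metrics behind such lists can be sketched as ratios of the columns prepared earlier. The rows below are made up, and the "controversy" score is only one plausible definition (an assumption, not necessarily the formula used for this ranking).

```python
import pandas as pd

# Made-up rows, for illustration only; columns mirror the ones prepared earlier.
articles = pd.DataFrame({
    "title": ["A", "B", "C", "D"],
    "views": [10000, 2000, 500, 40000],
    "votes": [50.0, 40.0, 5.0, 20.0],
    "up": [60.0, 41.0, 30.0, 25.0],
    "down": [10.0, 1.0, 25.0, 5.0],
    "bookmarks": [100, 150, 2, 80],
    "comments": [30, 10, 90, 200],
})

# Ratios of one metric to the number of views.
articles["votes_per_view"] = articles["votes"] / articles["views"]
articles["comments_per_view"] = articles["comments"] / articles["views"]
articles["bookmarks_per_view"] = articles["bookmarks"] / articles["views"]

# Hypothetical controversy score: high only when both upvotes and downvotes are numerous.
articles["controversy"] = articles[["up", "down"]].min(axis=1)

top_n = 2
top_by_views = articles.nlargest(top_n, "views")["title"].tolist()
top_by_votes_per_view = articles.nlargest(top_n, "votes_per_view")["title"].tolist()
most_controversial = articles.nlargest(top_n, "controversy")["title"].tolist()
most_downvoted = articles.nlargest(top_n, "down")["title"].tolist()

print(top_by_views, top_by_votes_per_view, most_controversial, most_downvoted)
```

Ratios favour articles with few views, so in practice a minimum-views threshold may be needed before ranking by them.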

Top articles by number of views

Top articles by rating-to-views ratio

Top articles by comments-to-views ratio

Top most controversial articles

Top articles by rating

Top articles by number of bookmarks

Top articles by bookmarks-to-views ratio

Top articles by number of comments

And finally, the last one: the anti-top by number of dislikes

Phew. I have several more interesting selections, but I will not bore the readers.

Conclusion

While building the ranking, I noticed two points that seemed interesting.

First, 60% of the top still consists of "geektimes"-style articles. Whether there will be fewer of them next year, and what Habr will look like without articles about beer, space, medicine and so on, I do not know. Readers will definitely lose something. We'll see.

Second, the top by bookmarks turned out to be of unexpectedly high quality. This is psychologically understandable: readers may not pay attention to the rating, but if an article is needed, it gets added to bookmarks. And this is exactly where the highest concentration of useful and serious articles is. I think the site owners should consider some link between the number of bookmarks and the reward program if they want to increase this particular category of articles here on Habr.

Something like that. I hope it was informative.

The list of articles turned out long; well, that is probably for the better. Happy reading, everyone.

Source: www.habr.com
