About the oddities of habrostatistics

I had noticed the ratings behaving strangely before, but recently the strangeness became too obvious to ignore. So I decided to investigate the problem with the scientific methods available to me, namely by analyzing the dynamics of upvotes and downvotes. What if I was just imagining things?

I am not much of a programmer, but I can manage very basic things. So I wrote a simple utility that collects statistics from a Habr post's panel: upvotes, downvotes, views, bookmarks, and so on.
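A minimal sketch of what such a collector could look like. This is not the utility from the post: the `Snapshot` fields, the `collect` function and the `fetch` callback are hypothetical names, and the part that actually reads the counters from the page is deliberately left to the caller. Skipping a failed reading instead of stopping is also an assumption of this sketch.

```python
import time
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Snapshot:
    """One reading of a post's counters (field names are illustrative)."""
    timestamp: float
    pluses: int
    minuses: int
    views: int
    bookmarks: int
    comments: int


def collect(fetch: Callable[[], Snapshot],
            samples: int,
            interval_s: float = 600.0,
            sleep: Callable[[float], None] = time.sleep) -> List[Snapshot]:
    """Call `fetch` `samples` times, pausing `interval_s` seconds between calls.

    `fetch` is supplied by the caller (hypothetical: it would read the post
    panel). A reading that fails with OSError, for example because the
    connection dropped, is skipped so that the run continues.
    """
    history: List[Snapshot] = []
    for i in range(samples):
        try:
            history.append(fetch())
        except OSError:
            pass  # leave a gap in the series and keep going
        if i < samples - 1:
            sleep(interval_s)
    return history
```

With the default interval of 600 seconds, 84 samples would cover 14 hours.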


The statistics are plotted as graphs, and studying them turned up a couple more surprises, smaller ones. But first things first.

Weirdness 1.
This is where my statistical research began.

It seemed strange to me that in the first hours after publication some of my posts dropped sharply into the red, then climbed back to zero, and eventually earned the expected plus. Why did that happen?

I was just about to publish another post, in two parts, and decided to subject it to statistical dissection.

I published the first part, launched the utility at the same time, and waited for the results. Unfortunately, during the night, while I was asleep, the program stopped collecting data because of a bug. I fixed the error the next morning, but the statistics ended up covering less than a day. Still, the trends are obvious even from the hours that were recorded.

The data cover the first 14 hours after publication; the interval between measurements is 10 minutes.


My eyes did not deceive me: most of the downvotes fall within the first hour of the post's existence. At first the post dropped sharply into the negative, then recovered. Here are the numbers on which the graph is built:


And this despite the fact that the views are increasing smoothly!


The steps that appear once the values pass a thousand are explained by the Habr panel switching to abbreviated numbers: the exact number of views is simply not available there (it could probably have been taken from third-party services, but I did not use them).
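A small sketch of why abbreviation produces steps. The exact format the panel uses is an assumption here: the hypothetical `abbreviate` below truncates to hundreds and prints one decimal of thousands (for example `1.2k`), and `parse` recovers the best integer a collector could get from such a label.

```python
def abbreviate(views: int) -> str:
    """Format a counter the way an abbreviating panel might (assumed format)."""
    if views < 1000:
        return str(views)
    # Truncate to whole hundreds, then show thousands with one decimal.
    return f"{views // 100 / 10:.1f}k"


def parse(label: str) -> int:
    """Recover the best available integer from a displayed label."""
    if label.endswith("k"):
        return round(float(label[:-1]) * 1000)
    return int(label)


# Every true value between 1200 and 1299 is read back as 1200,
# so a smoothly growing counter turns into a staircase.
readings = [parse(abbreviate(v)) for v in (1180, 1210, 1290, 1310)]
```

Under this assumed format the recorded curve can only move in jumps of 100 once it passes a thousand, whatever the true growth looks like.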

I am no expert in statistics, but as far as I understand, such a distribution of downvotes is abnormal, isn't it?!

Look, the bookmarks are spread more or less evenly over the recording period:


Comments are also uniform:


There are bursts of activity and lulls, but they too are spread over the period: commenting dies down, then picks up again.

Subscribers show the same thing, a slight steady increase:


My karma did not change during the recording period, so I do not show it. And the rating is calculated by Habr itself, so there is no point in showing it.

All indicators change in proportion to the number of views, and only the downvotes misbehave: an outburst of anger occurs in the first hour after publication. I observed the same with my previous posts. But whereas earlier these were, so to speak, personal impressions, now they have been confirmed by recorded data.
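One way to put a number on "in proportion to the views" is to compare what share of each counter's total growth falls within the first hour. The sketch below assumes cumulative readings taken every 10 minutes, so six intervals make an hour. The function name `early_share` is hypothetical, and the two series are made-up toy numbers for illustration, not the data from the post.

```python
from typing import Sequence


def early_share(cumulative: Sequence[int], first_n: int) -> float:
    """Fraction of the total growth that happened in the first `first_n` intervals.

    `cumulative[0]` is the reading at publication time.
    """
    total = cumulative[-1] - cumulative[0]
    if total <= 0:
        return 0.0
    return (cumulative[first_n] - cumulative[0]) / total


# Toy series (invented for illustration): two hours at 10-minute steps.
toy_minuses = [0, 2, 4, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6]
toy_views = [0, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200]

minus_share = early_share(toy_minuses, 6)  # all downvotes in the first hour
view_share = early_share(toy_views, 6)     # half of the views in the first hour
```

If downvotes simply followed views, the two shares would be close to each other; a large gap is the effect described above, expressed as a number.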

In my purely amateur opinion, such a distribution means that there are several users on the site who deliberately look through freshly published posts and downvote some of them, for reasons known only to themselves. I write "some of the posts" because I have noticed this effect not only in my own publications. In every case the effect was pronounced, otherwise I simply would not have paid attention to it.

I have four versions of why this happens.

Version 1. A mental quirk. Sick people deliberately lie in wait for authors they dislike and downvote them out of spite.

I don't believe in this version.

Version 2. A psychological effect. Which one, I do not know. Why would readers first downvote a post in unison and then upvote it no less unanimously? Downvoted as off-topic at first, then upvoted once the connoisseurs of beauty become the majority? I do not know.

If there are psychologists among the readers, let them have their weighty say.

Version 3. Servicemen at work. Why their superiors would want to bury Habr posts, God only knows. However, there are servicemen in other countries too, not only in ours. Who can understand them, those Russophobes?!

Version 4. A combination of the factors mentioned above.

Quite conceivable.

Be that as it may, the downvoters manage to reduce the number of views. I am not familiar with the rules by which Habr posts reach the top, and I do not even know whether those algorithms have been made public, but one thing is obvious to me: early downvoting keeps ostracized posts from reaching the top, or more precisely delays their getting there, which in turn cuts the number of views significantly, by several times.

As far as I understand, there are no effective ways to fight this evil. The only way is open, named voting. Only then would it be possible to establish which profiles are periodically tracking and downvoting fresh posts. However, there is no named voting on Habr (or rather, the names are not made public).

But it is not that simple.

As I said, the dissected material was published in parts. After publishing the second part I expected a similar picture: an initial dip into the red followed by a climb into the black. However, the effect turned out to be much smoother: the post never went into the red.

By the time the second part was published the bug had been fixed, so the data cover a full day:


Where the smoothing came from, I do not know. Perhaps it was because of the Saturday publication (the downvoters don't work on Saturdays?) or because this was the conclusion of previously published material.

However, the distribution of downvotes is still uneven: all of them fall in the first half of the recording period, and the downvoting ends much earlier than the upvoting. Meanwhile the views are spread over the period exactly as last time, evenly:


The jump around XNUMX pm is nothing mysterious: my internet simply went down for an hour, and the utility could not connect to the site.


Everything else is completely standard.

Bookmarks:


Comments: Like last time, periods of activity alternate with periods of silence.


Karma. An increase of a couple of points was recorded, not all at once, of course:


And subscribers. The total number remained unchanged (apparently everyone who wanted to had subscribed when the first part was published). Only at about one in the afternoon was there a single fluctuation: someone unsubscribed, perhaps by mistake, and immediately subscribed again. If it was a different person, the two cancelled out: the total number of subscribers did not change.


So, the post's indicators behave in a clear and predictable way. All of them except the downvotes. Since I see no obvious reason for this, I find the downvote peak odd, to say the least.

Weirdness 2.
Sometimes the number of views decreases (which is, of course, impossible) but soon returns to normal.

I spotted this by chance while debugging the program, before the export/import function had been added, so there is no corresponding zigzag on the charts. You will have to take my word for it: the effect was observed twice. With several thousand views on the counter, the number suddenly drops by a couple of hundred, and 10-20 minutes later it is restored to its previous level (not counting the natural growth).

This one is quite simple: a bug on the site. Nothing to ponder here.
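Had the readings been saved at the time, such drops would be easy to flag automatically. A minimal sketch, with a hypothetical function name and made-up sample values (not the author's data):

```python
from typing import List, Sequence, Tuple


def find_dips(views: Sequence[int]) -> List[Tuple[int, int]]:
    """Return (index, drop) for every sample that is lower than the one before it.

    A view counter should never decrease, so any hit is an anomaly.
    """
    return [(i, views[i - 1] - views[i])
            for i in range(1, len(views))
            if views[i] < views[i - 1]]


# Invented readings: the counter loses 200 views, then recovers.
dips = find_dips([3000, 3050, 2850, 2860, 3070])
```

Here `dips` holds one entry, for the third reading, with a drop of 200.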

Weirdness 3.
This is what seemed much stranger to me than the first effect (a matter of someone's will) and the second (a technical one). Upvotes do not arrive one at a time, evenly spread over the period, but in blocks. Yet upvoting is not like commenting, where a question is naturally followed by an answer; each upvote is an individual act!

Take a closer look at the graphs above: the blocks are noticeable.
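The blocks can also be counted rather than eyeballed. The sketch below assumes a list of cumulative upvote readings, one per 10-minute measurement, and counts how many intervals gained exactly one, two, three (and so on) upvotes. The function name and the sample series are illustrative only.

```python
from collections import Counter
from typing import Dict, Sequence


def block_sizes(cumulative: Sequence[int]) -> Dict[int, int]:
    """Map 'upvotes gained in one interval' to 'number of such intervals'.

    Intervals with no gain (or with a retracted vote) are ignored.
    """
    gains = [b - a for a, b in zip(cumulative, cumulative[1:])]
    return dict(Counter(g for g in gains if g > 0))


# Invented readings: gains per interval are 1, 0, 2, 0, 3, 1.
blocks = block_sizes([0, 1, 1, 3, 3, 6, 7])
```

For this toy series the result is two singles, one double and one triple.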

Knowledgeable people pointed me to the Poisson distribution, but I am not able to calculate the probability myself. If you can, do the math. To me it is already obvious that the number of double upvotes is far above the norm.

Here are the figures for the upvotes on the first part of the post. The graph shows how many upvotes arrived as singles, doubles and triples out of the total number of votes. As mentioned earlier, the sampling interval is 10 minutes.


With 30 pokes spread over 84 cells (14 hours of ten-minute intervals), two cells were hit three times. Well, I don't know how that squares with probability theory...
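For readers who want to check this against the Poisson distribution mentioned above, here is a minimal sketch. It assumes the simplest possible model: upvotes arrive independently at a constant average rate over the whole period. The function name is hypothetical; only the numbers 30 and 84 come from the post.

```python
import math


def poisson_expected_cells(events: int, cells: int, k: int) -> float:
    """Expected number of cells holding exactly k events under a Poisson model.

    The rate per cell is events / cells, assumed constant over the period.
    """
    lam = events / cells
    return cells * math.exp(-lam) * lam ** k / math.factorial(k)


singles = poisson_expected_cells(30, 84, 1)
doubles = poisson_expected_cells(30, 84, 2)
triples = poisson_expected_cells(30, 84, 3)
```

Under this model one would expect roughly 21 intervals with a single upvote, about 3.7 with two and about 0.45 with three, so two triples is above the expectation. Note, however, that the constant-rate assumption is a strong one: if votes arrive faster at some times of day than at others, clusters become more likely even when every vote is independent.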

Data for the second part of the post (since its measurement period is longer, I truncate it to the duration of the first part for comparability):


By the way, here one of the single upvotes is adjacent in time to the triple, so within some 20 minutes there was a surge of upvotes (29% of all the upvotes were cast). And this did not happen in the first minutes after publication.

The ratio between singles, doubles and triples is about the same as for the first part. The smaller share of measurements containing votes is explained by the votes being cast less often: measurements were taken, but no upvotes were recorded.

I cannot explain this effect of block upvoting in any way, none at all. Downvotes do not seem to show such “blocky” behavior.

Are the emitters of goodness beaming their suggestions in bursts, switching on and off? Hehehe…

PS
If anyone wants to analyze the post statistics with more advanced methods or check my arithmetic, the source data files are here:
yadi.sk/d/iN4SL6tzsGEQxw

I do not insist on my doubts: perhaps I am wrong, especially since I don't know the first thing about statistics. I hope that the comments of professional statisticians, psychologists and other interested users will clear up the confusion.

Thank you for your attention.

Source: habr.com
