The city falls asleep, Habr users wake up

If the comment count under an article is rapidly approaching 1000, rest assured that, whatever topic the author announced, a brawl is raging inside: political flashpoints surrounded by armchair experts on every subject, remote psychiatric diagnoses based on avatar and nickname, personal attacks, sarcasm more caustic than xenomorph blood, and, of course, the obligatory dish on such occasions: mutual accusations that your opponent is arguing with you solely for pay or in the line of duty. A duty which, apparently, is both dangerous and hard, and at first glance seems invisible, and thirty pieces of silver don't just lie around on the road.

The funny thing about this situation is that people deeply afflicted by the someone-is-wrong-on-the-internet syndrome often spend a hell of a lot of time and nerves proving, for free, to another equally afflicted person that he is doing exactly the same thing for money or on orders. Looking for logic here? There isn't any. It's the internet, baby.

Take one of the relatively recent flame wars, the one about alleged territorial discrimination on GitLab. Four days have passed since the article was published and, of course, the discussion has long since drifted far from the original topic. Phrases like these come up:

A real person won't be able to put up anything against a professional commenter on a subscription...

User (so-and-so) spends an unrealistic amount of time on comments...
Moreover, his activity does not show the patterns that are usually characteristic of an ordinary user....

P.S. But this gave me the idea of writing a parser-analyzer for such commenters :) With a breakdown of activity by hour, amount of time per day, per week, and so on... A good topic for an article :)

Okay, stop. What kind of patterns are “usually characteristic of an ordinary user”? The author of this phrase has, unfortunately, already been silenced in that thread, so we will have to proceed by guesswork.

The question I want to lay before your bright eyes is this: is it even possible, using statistical methods, to identify these patterns with any reliability, so as to build a formal classifier that distinguishes casual commenters from professional ones? Imagine: “according to the Habr bot-o-meter, you are a Kremlin bot with 76% probability.” That would be much cooler than karma raids on each other.

Unfortunately, my expertise is not enough even to suggest which direction to dig in to solve such a problem. However, last night I hacked together a small, primitive parser which (fortunately, comment pages are open even to unauthenticated visitors) so far does two things: a) for a given username, collects statistics on all of that user's comments (for now just the timestamp) and stores them in a MySQL database; b) draws a time diagram, marking on it the comment-posting events taken from this database. Even without any sophisticated analysis, the result turned out to be quite amusing. This is what my own comment chart looks like. Explanations are below. It is best viewed in a separate window at 100% scale or larger.
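As an aside, the storage half of step (a) could be sketched roughly like this. It is not the author's parser: it uses Python's built-in `sqlite3` as a stand-in for MySQL, and the `comments` table, the function names, and the sample username are all assumptions made for illustration. Fetching and parsing the comment pages is deliberately left out, since the page structure is not described here.

```python
import sqlite3
from datetime import datetime

# Hypothetical schema: one row per (username, comment timestamp).
SCHEMA = """
CREATE TABLE IF NOT EXISTS comments (
    username  TEXT NOT NULL,
    posted_at TEXT NOT NULL,
    UNIQUE (username, posted_at)
)
"""

def open_db(path=":memory:"):
    """Open the database and make sure the comments table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn

def count_comments(conn, username):
    """Number of stored timestamps for one user."""
    row = conn.execute(
        "SELECT COUNT(*) FROM comments WHERE username = ?", (username,)
    ).fetchone()
    return row[0]

def store_timestamps(conn, username, timestamps):
    """Store timestamps, skipping ones already present; return how many were new."""
    before = count_comments(conn, username)
    conn.executemany(
        "INSERT OR IGNORE INTO comments (username, posted_at) VALUES (?, ?)",
        [(username, ts.isoformat()) for ts in timestamps],
    )
    conn.commit()
    return count_comments(conn, username) - before

def load_timestamps(conn, username):
    """Return all stored timestamps for one user, oldest first."""
    rows = conn.execute(
        "SELECT posted_at FROM comments WHERE username = ? ORDER BY posted_at",
        (username,),
    ).fetchall()
    return [datetime.fromisoformat(r[0]) for r in rows]
```

The `UNIQUE` constraint lets such a collector be re-run without duplicating rows, at the cost of collapsing two comments that carry an identical timestamp.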

[Image: comment activity diagram]

The horizontal axis is time: each pixel is one minute, each gray division is one hour, and an entire horizontal line is one day. Days run from bottom to top along the vertical axis, where each division is 365 days.
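The layout just described can be expressed as a small coordinate mapping. The sketch below is one possible reading, not the author's drawing code: the function names are made up, and it assumes the usual raster convention that row 0 is the top of the image, so the earliest day lands on the bottom row.

```python
from datetime import datetime

MINUTES_PER_DAY = 24 * 60  # one row of the diagram is 1440 pixels wide

def hour_marks():
    """x positions of the hourly divisions (every 60 pixels)."""
    return list(range(0, MINUTES_PER_DAY, 60))

def to_pixel(ts, first_day, height):
    """Map a timestamp to (x, y) image coordinates.

    x is the minute of the day (0..1439, left to right).
    y is the image row: the first day is the bottom row, later days are higher.
    """
    x = ts.hour * 60 + ts.minute
    day_index = (ts.date() - first_day).days
    y = height - 1 - day_index
    return x, y

def layout(timestamps):
    """Return (image_height, list of (x, y) points) for a non-empty list of timestamps."""
    first_day = min(ts.date() for ts in timestamps)
    last_day = max(ts.date() for ts in timestamps)
    height = (last_day - first_day).days + 1
    return height, [to_pixel(ts, first_day, height) for ts in timestamps]
```

For example, two comments posted at 00:00 on 1 January and at 13:05 on 3 January give an image three rows high, with the first point in the bottom-left corner and the second on the top row at x = 785.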

There is nothing particularly interesting in my diagram. You can see that I like to sleep 7-8 hours, often go to bed after midnight, and sometimes have hours-long commenting marathons, and that my activity over the past year is greater than or roughly equal to that of the previous five years.

Or take comrade cube: he kept a vow of silence for three and a half years, and then the dam broke...

[Image: comment activity diagram]

The activity diagram of a typical Habr commenter looks something like this (this is QtRoS):

[Image: comment activity diagram]

A distinct “sleepy hollow” on the left, somewhere in the European night, and leisurely commenting during daylight hours, sometimes with breaks of half a year.
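A “sleepy hollow” like this could also be located numerically rather than by eye. The following is only a sketch under assumptions: the function names and the six-hour window width are arbitrary choices, and the hours are taken in whatever time zone the stored timestamps use.

```python
from collections import Counter
from datetime import datetime

def hourly_counts(timestamps):
    """Number of comments in each hour of the day, as a list of 24 integers."""
    counts = Counter(ts.hour for ts in timestamps)
    return [counts.get(h, 0) for h in range(24)]

def quietest_window(counts, width=6):
    """Find the quietest run of `width` consecutive hours, wrapping past midnight.

    Returns (start_hour, comments_in_window); ties go to the earliest start hour.
    """
    best_start, best_total = 0, None
    for start in range(24):
        total = sum(counts[(start + k) % 24] for k in range(width))
        if best_total is None or total < best_total:
            best_start, best_total = start, total
    return best_start, best_total

def quiet_share(counts, width=6):
    """Share of all comments that fall inside the quietest window (0.0 if no comments)."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return quietest_window(counts, width)[1] / total
```

A share close to zero would correspond to a sharply defined hollow, while a share near `width / 24` would mean the comments are spread almost evenly around the clock.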

But not all diagrams are so boring! How about this, for example:

[Image: comment activity diagram]

In just over two years, our colleague apparently retrained his biorhythms, evenly and gradually, shifting his sleep from the European night to somewhere around the Mid-Atlantic Ridge, and then spent another two years returning to the shores of Portugal. Did he walk? Swim? I can't come up with a plausible explanation... For the first three hours after waking up, comments fly like machine-gun fire, but toward the end of the day it is more like dropping in once an hour to see what's going on, and that's all.

By the way, it was 0xd34df00d.

And here's another riddle:

[Image: comment activity diagram]

This colleague held out for four and a half years without a single comment. Apparently he was training in secret monasteries somewhere to stay awake for days on end, judging by how many comments were posted in the “sleepy hollow.”

But the most interesting thing here is the anomaly at the 16th hour, which lasts for more than three years and gradually fades away in the last year. A smoke break? Walking the dog? Jogging? What else can tear a Habr user away from the comment feed in the middle of the working day with such daily regularity? I'm a slob and a lazy person, and I can't imagine the kind of self-discipline that the respected khim has.
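One way to put a number on an anomaly like this is to compare a given hour with its two neighbours, year by year. This is a rough sketch rather than an established method: the function names are invented, the ratio against the quieter neighbour is an arbitrary choice, and which hour index corresponds to “the 16th hour” depends on the time zone of the stored timestamps.

```python
from collections import Counter, defaultdict
from datetime import datetime

def hourly_counts(timestamps):
    """Number of comments in each hour of the day, as a list of 24 integers."""
    counts = Counter(ts.hour for ts in timestamps)
    return [counts.get(h, 0) for h in range(24)]

def dip_ratio(counts, hour):
    """Comments in `hour` divided by the count of its quieter neighbouring hour.

    Values well below 1 mean the hour is quiet relative to both neighbours.
    Returns None if either neighbour is empty (for example at the edge of the night).
    """
    quieter_neighbour = min(counts[(hour - 1) % 24], counts[(hour + 1) % 24])
    if quieter_neighbour == 0:
        return None
    return counts[hour] / quieter_neighbour

def deepest_dip(counts):
    """Hour with the lowest dip ratio, or None if no hour has two active neighbours."""
    ratios = {}
    for h in range(24):
        r = dip_ratio(counts, h)
        if r is not None:
            ratios[h] = r
    if not ratios:
        return None
    return min(ratios, key=ratios.get)

def dip_by_year(timestamps, hour):
    """Dip ratio of `hour` computed separately for each calendar year."""
    by_year = defaultdict(list)
    for ts in timestamps:
        by_year[ts.year].append(ts)
    return {
        year: dip_ratio(hourly_counts(items), hour)
        for year, items in sorted(by_year.items())
    }
```

A ratio that stays low for several years and then climbs back toward 1 would match the fading described above.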

Finally, one last diagram to think about:

[Image: comment activity diagram]

There is no clearly defined “sleepy hollow” in it at all. One can only barely make out a slight excess of comments sent after noon over those sent before.

With all Komsomol rigor, I urge the respected MTyrz: disarm before the party and honestly admit how many grandfathers, grandmothers, granddaughters, dogs, and mice run your account and write comments.

And finally, a loaded question: might anyone be interested enough in all this to want to develop the parser code, or get a database dump or access to it, and so on? My own knowledge of data mining and data visualization methods barely goes beyond general erudition, and I am unlikely to come up with anything smarter or more interesting than these simple diagrams. If you are interested, write to me on Telegram (nickname in my profile).

Thank you for your attention!

UPD. Posted the sources on GitHub.

Source: habr.com
