Paul Graham: What I learned from Hacker News

February 2009

Hacker News turned two years old last week. Initially it was supposed to be a side project: an application to sharpen Arc on, and a place for current and future Y Combinator founders to exchange news. It has grown bigger and taken up more time than I expected, but I don't regret that, because I've learned so much from working on it.

Growth

When we launched in February 2007, weekday traffic was around 1600 daily unique visitors. It has since grown to around 22,000.

This growth rate is somewhat higher than I'd like. I want the site to grow, because a site that isn't growing at least slowly is probably dead. But I wouldn't want it to grow as big as Digg or Reddit, mainly because that would dilute the character of the site, but also because I don't want to spend all my time dealing with scaling.

I already have enough problems with that. Remember, the initial motivation for HN was to test a new programming language, and moreover a language focused on experimenting with language design, not performance. Every time the site gets slow, I fortify myself by recalling McIlroy and Bentley's famous maxim

The key to performance is elegance, not battalions of special cases.

and go looking for the bottleneck I can remove with the least code. So far I've been able to keep up: performance has stayed the same despite 14x growth in traffic. I don't know exactly how I'll cope in the future, but I'll probably think of something.

This is my attitude to the site as a whole. Hacker News is an experiment, and an experiment in a very young field. Sites of this type are only a few years old. Internet conversation itself is only a few decades old. So we've probably discovered only a fraction of what we eventually will.

That's why I'm so optimistic about HN. When a technology is this young, the existing solutions are usually terrible; which means it must be possible to do much better; which means many problems that seem insoluble actually aren't. Including, I hope, the problem that has afflicted so many communities: being ruined by growth.

Decline

Users have worried about this since the site was only a few months old. So far these alarms have been false, but they may not always be. Decline is a hard problem. But it's probably a solvable one; it doesn't mean much that open conversations have "always" been destroyed by growth, when "always" means about 20 instances.

But it's important to remember that we're trying to solve a new problem, because that means we're going to have to try new things, and most of them probably won't work. A couple of weeks ago I tried displaying the names of users with the highest average comment scores in orange.[1] That was a mistake. Suddenly a culture that had been more or less unified was divided into haves and have-nots. I didn't realize how unified the culture had been until I saw it divided. It was painful to watch.[2]

So orange usernames won't be back. (Sorry about that.) But there will be other equally broken-seeming ideas in the future, and the ones that turn out to work will probably seem just as broken as the ones that don't.

Probably the most important thing I've learned about decline is that it's measured more in behavior than in users. It's bad behavior you want to keep out more than bad people. User behavior turns out to be surprisingly malleable. If people are expected to behave well, they generally do; and vice versa.

Though of course forbidding bad behavior does tend to drive away bad people, because they feel uncomfortably constrained in a place where they have to behave well. Driving them away like this is gentler, and probably also more effective, than other methods would be.

It's pretty clear now that the broken windows theory applies to community sites as well. The theory is that minor forms of bad behavior encourage worse ones: that a neighborhood with lots of graffiti and broken windows becomes one where robberies occur. I was living in New York when Giuliani introduced the reforms that made the broken windows theory famous, and the transformation was remarkable. And I was a Reddit user when the opposite happened there, and the transformation was equally dramatic.

I'm not criticizing Steve and Alexis. What happened to Reddit didn't happen through neglect. From the start they had a policy of censoring nothing except spam. Plus, Reddit had different goals from Hacker News. Reddit was a startup, not a side project; its goal was to grow as fast as possible. Combine rapid growth and zero censorship, and the result is a free-for-all. But I don't think they'd do anything differently if they were doing it again. Measured by traffic, Reddit is much more successful than Hacker News.

But what happened to Reddit won't inevitably happen to HN. There are several local maxima. There can be places that are free-for-alls and places that are more restrained, just as in the real world; and people will behave differently depending on which they're in, just as they do in the real world.

I've observed this in practice. I've seen people cross-posting on Reddit and Hacker News who actually took the trouble to write two versions: an inflammatory one for Reddit, and a more restrained version for HN.

Submissions

There are two major types of problems a site like Hacker News needs to avoid: bad stories and bad comments. So far the danger of bad stories seems smaller. The stories on the frontpage now are still roughly the same as the ones that would have been there when HN was just getting started.

I once thought I'd have to invent protections to keep all kinds of fluff off the frontpage, but so far I haven't had to. I didn't expect the frontpage to hold up this well, and I still don't quite understand why it does. Perhaps only the more thoughtful users care enough to submit and upvote links, so the marginal cost of one random new user approaches zero. Or perhaps the frontpage protects itself, by advertising what kind of submissions are expected.

The most dangerous thing for the frontpage is material that's too easy to upvote. If someone proves a new theorem, it takes some work by the reader to decide whether or not to upvote it. An amusing cartoon takes less. A rant with a rallying cry as its title takes zero, because people upvote it without even reading it.

Hence what I call the Fluff Principle: on a user-voted news site, the links that are easiest to judge will take over unless you take specific measures to prevent it.

Hacker News has two kinds of protection against fluff. The most common types of fluff links are banned as off-topic. Pictures of kittens, political diatribes, and so on are explicitly prohibited. This keeps out most fluff, but not all of it. Some links are both fluff, in the sense of being very short, and also on topic.

There's no single solution for those. If a link is just an empty rant, editors will sometimes kill it even if it's on topic in the sense of being about hacking, because it's not on topic by the real standard, which is that an article should engage one's intellectual curiosity. If the material submitted from a site is reliably of this type, I sometimes ban the site, which means new submissions at that URL are automatically killed. If a post has a linkbait title, editors will sometimes rephrase it to be more matter-of-fact. This is especially necessary for links whose titles are rallying cries, because otherwise they become implicit "upvote if you believe such-and-such" posts, which are the most extreme form of fluff.

The techniques for dealing with such links have to evolve as the links themselves do. The existence of aggregators has already affected what gets written for them to aggregate. Writers now deliberately write things to draw traffic from aggregators, sometimes even from specific ones. (No, the irony of this statement is not lost on me.) Then there are the more sinister mutations, like linkjacking: posting a paraphrase of someone else's article and submitting that instead of the original. Such posts can get a lot of upvotes, because a lot of what's good in the original article survives; indeed, the closer the paraphrase is to plagiarism, the more of the good material survives.[3]

I think it's important that a site that kills submissions provide a way for users to see what got killed, if they want to. That keeps the editors honest and, just as importantly, lets users be confident they'd know if the editors stopped being honest. HN users can do this by turning on the showdead option in their profile.[4]

Comments

Bad comments seem to be a harder problem than bad submissions. While the quality of links on the frontpage hasn't changed much, the quality of the average comment has deteriorated somewhat.

There are two main kinds of badness in comments: rudeness and stupidity. There's a lot of overlap between the two (rude comments are disproportionately likely also to be stupid), but the strategies for dealing with them are different. Rudeness is easier to control. You can have rules saying users shouldn't be rude, and if you enforce them, it seems possible to keep rudeness in check.

Keeping stupidity in check is harder, perhaps because stupidity isn't so easy to identify. Rude people often know they're being rude, while many stupid people don't realize they're being stupid.

The most dangerous form of stupid comment is not the long but mistaken argument, but the dumb joke. Long but mistaken arguments are actually quite rare. There's a strong correlation between comment quality and length; if you wanted to compare the quality of comments on community sites, average length would be a good predictor. The cause is probably human nature rather than anything specific to comment threads. Probably it's simply that stupidity more often takes the form of having few ideas than having wrong ones.

Whatever the cause, stupid comments tend to be short. And since it's hard to write a short comment that's distinguished by the amount of information it conveys, people try to distinguish theirs instead by being funny. The most tempting format for stupid comments is the supposedly witty put-down, probably because put-downs are the easiest form of humor.[5] So one advantage of forbidding rudeness is that it also cuts down on these.

Bad comments are like kudzu: they take over rapidly. Comments have much more effect on other comments than submissions have on other submissions. If someone submits a lame article, the other submissions don't all become lame. But if someone posts a stupid comment in a thread, it sets the tone for the region around it. People reply to dumb jokes with dumb jokes.

Maybe the solution is to add a delay before people can reply to a comment, and to make the length of the delay inversely proportional to some prediction of the comment's quality. Then dumb threads would grow more slowly.[6]
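The delay idea could be sketched in a few lines. This is only an illustration, assuming a quality score in [0, 1] has already been predicted somehow; the function name, the base delay, and the floor constant are all hypothetical, not anything the site actually implements.

```python
def reply_delay(predicted_quality, base_seconds=60, floor=0.1):
    """Seconds before replies open, inversely proportional to quality.

    predicted_quality: a hypothetical score in [0, 1] from some quality model.
    floor: clamps the score so a zero-quality comment gets a long but
    finite delay rather than a division by zero.
    """
    q = max(predicted_quality, floor)
    return base_seconds / q

# A good comment opens for replies almost immediately;
# a bad one makes would-be repliers wait.
assert reply_delay(1.0) == 60    # best case: one minute
assert reply_delay(0.5) == 120
assert reply_delay(0.0) == 600   # worst case: ten minutes
```

The interesting design question is the shape of the curve: inverse proportionality makes the penalty grow sharply as predicted quality approaches the floor, which is what slows dumb threads most.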

People

I notice that most of the techniques I've described are conservative: they're aimed at preserving the character of the site rather than enhancing it. I don't think that's a bias of mine. It's because of the shape of the problem. Hacker News had the good fortune to start out good, so in its case the task is literally preservation. But I think this principle would also apply to sites with different origins.

The good things in a community site come from people more than technology; it's mainly when preventing bad things that technology comes into play. Technology can certainly enhance conversation. Threaded comments do, for example. But I'd rather use a site with primitive features and smart, nice users than a sophisticated one whose users were idiots or trolls.

So the most important thing a community site can do is attract the kind of people it wants as users. A site trying to be as big as possible wants to attract everyone. But a site aiming at a particular subset of users should attract only them, and, just as importantly, repel everyone else. I've deliberately tried to do this with HN. The graphic design is as plain as it can be, and the site rules discourage dramatic link titles. The goal is that the only thing to interest someone arriving at HN for the first time should be the ideas expressed there.

The downside of creating a site for a particular kind of user is that it can be all too attractive to those users. I'm well aware how addictive Hacker News can be. For me, as for many users, it's a kind of virtual town square. When I want to take a break from working, I wander into the square, just as I might wander into Harvard Square or University Ave in the physical world.[7] But an online square is more dangerous than a physical one. If I spent half the day loitering on University Ave, I'd notice. I have to walk a mile to get there, and sitting in a cafe is different from working. But visiting an online forum takes just one click, and superficially resembles working. You may be wasting your time, but you're not idle. Someone is wrong on the Internet, and you're fixing the problem.

Hacker News is definitely useful. I've learned a lot from things I've read on HN. I've written several essays that began as comments there. I'd hate for the site to go away. But I'd like to be sure it isn't a net loss for productivity. What a disaster it would be, to attract thousands of smart people to a site that caused them to waste lots of time. I wish I could be 100% sure that's not a description of HN.

It seems to me that the addictiveness of games and social applications is still a mostly unsolved problem. The situation is like the one with crack in the 1980s: we've invented terribly addictive new things, and we haven't yet evolved ways to protect ourselves from them. We will eventually, and it's one of the problems I hope to focus on next.

Notes

[1] I tried ranking users by both the average and the median of their comment scores, and the average with the single highest score thrown out seemed to be the more accurate predictor of high quality. The median may be the more accurate predictor of low quality, though.
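The two rankings in this note can be illustrated with Python's statistics module. The scores below are made-up examples, not real HN data; the point is just how the trimmed average discards one outlier while the median ignores the tails entirely.

```python
from statistics import mean, median

def trimmed_average(scores):
    """Mean comment score with the single highest score thrown out."""
    if len(scores) < 2:
        return mean(scores)
    trimmed = sorted(scores)[:-1]  # drop only the top score
    return mean(trimmed)

scores = [1, 2, 2, 3, 14]  # one lucky outlier comment
assert trimmed_average(scores) == 2  # (1 + 2 + 2 + 3) / 4
assert median(scores) == 2

# The plain mean is dragged up by the outlier, which is why it is a
# weaker predictor of a user's typical quality.
assert mean(scores) == 4.4
```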

[2] Another thing I learned from this experiment is that if you're going to distinguish between people, you'd better be sure you do it right. This is one problem where rapid prototyping doesn't work. Indeed, there's a reasonably honest argument that distinguishing between kinds of users may not be a good idea at all. The reason isn't that everyone is the same, but that it's bad to get such a distinction wrong and hard to get it right.

[3] When I notice egregiously linkjacked posts, I replace the URL with that of whatever was copied. Sites that habitually linkjack get banned.

[4] Digg is notorious for its lack of transparency. The root of the problem is not that the people running Digg are especially sneaky, but that they use the wrong algorithm for generating their frontpage. Instead of bubbling up from the bottom as they get more upvotes, as stories do on Reddit, stories on Digg start at the top and get pushed down by new arrivals.

The reason for the difference is that Digg is derived from Slashdot, while Reddit is derived from Delicious/popular. Digg is Slashdot with upvotes instead of editors, and Reddit is Delicious/popular with upvotes instead of bookmarks. (You can still see remnants of their origins in their graphic design.)

Digg's algorithm is very vulnerable to gaming, because any story that makes it to the frontpage is the new top story. Which in turn forces Digg to respond with extreme countermeasures. Many startups have some kind of secret about the subterfuges they had to resort to in the early days, and I suspect Digg's is the extent to which the top stories are de facto chosen by human editors.
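The contrast between the two orderings described in this note can be shown with a toy model. This is only a caricature under simplified assumptions (neither site's real ranking uses just these two keys; real algorithms also weight time decay, vote velocity, and so on), but it shows why the push-down model is the one gamers target.

```python
def vote_ranked(stories):
    """Reddit-style caricature: stories bubble up as they gain votes."""
    return sorted(stories, key=lambda s: s["votes"], reverse=True)

def push_down(stories):
    """Digg-style caricature: newest frontpage story sits on top and is
    pushed down by later arrivals."""
    return sorted(stories, key=lambda s: s["submitted"], reverse=True)

stories = [
    {"title": "old, well-voted", "votes": 90, "submitted": 1},
    {"title": "middling",        "votes": 40, "submitted": 2},
    {"title": "new, unproven",   "votes": 2,  "submitted": 3},
]

# Under vote ranking, the proven story leads; under push-down ordering,
# whatever just arrived is the new top story, which is exactly what makes
# the frontpage worth gaming.
assert vote_ranked(stories)[0]["title"] == "old, well-voted"
assert push_down(stories)[0]["title"] == "new, unproven"
```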

[5] The dialogue on Beavis and Butt-head consisted largely of these, and when I read comments on really bad sites I can hear their voices.

[6] I suspect most of the techniques for discouraging stupid comments have yet to be discovered. Xkcd implemented a particularly clever one in its IRC channel: don't let anyone say the same thing twice. Once someone has said "fail," no one can ever say it again. This would penalize short comments especially, because they have less room in which to avoid collisions.
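The rule itself is simple enough to sketch. This is a minimal toy version of a no-repeats channel, not the actual xkcd IRC implementation; the class name and the lowercasing normalization are my own assumptions.

```python
class NoRepeatChannel:
    """Toy channel that rejects any message already said once."""

    def __init__(self):
        self.seen = set()

    def post(self, message):
        key = message.strip().lower()  # assumed normalization
        if key in self.seen:
            return False  # rejected: someone already said this
        self.seen.add(key)
        return True

chan = NoRepeatChannel()
assert chan.post("fail") is True
assert chan.post("Fail") is False  # repeats are blocked, even re-cased
assert chan.post("a longer, more original thought") is True
```

Short stock phrases collide almost immediately, while longer comments rarely do, which is how the rule taxes exactly the comments this note is worried about.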

Another promising idea is the stupid filter, which works like a probabilistic spam filter, but is trained on corpora of stupid and non-stupid comments instead.

It may not be necessary to kill bad comments to solve the problem. Comments at the bottom of a long thread are rarely seen, so it may be enough to incorporate a prediction of quality into the comment sorting algorithm.

[7] What makes most suburbs so demoralizing is that there's no center to walk to.

Thanks to Justin Kan, Jessica Livingston, Robert Morris, Alexis Ohanian, Emmett Shear, and Fred Wilson for reading drafts of this.

Translation: Diana Sheremyeva

Source: habr.com
