Next generation decentralized social network idea

In this article I share my reflections on the history and prospects of the Internet, on centralized and decentralized networks, and, building on that, a possible architecture for a next-generation decentralized network.

Something is wrong with the internet

I first encountered the Internet in 2000. That was far from the very beginning (the Web already existed before then), but that period can be called the first heyday of the Internet. The World Wide Web was Tim Berners-Lee's ingenious invention: Web 1.0 in its classic, canonical form. Many sites and pages linking to each other with hyperlinks. At first glance the architecture was simple, like everything ingenious: decentralized and free. If I wanted, I traveled through other people's sites by following hyperlinks; if I wanted, I created my own site, where I published whatever interested me: my articles, photos, programs, hyperlinks to sites I found interesting. And others posted links to me.

An idyllic picture, it would seem? But you already know how it all ended.

There were too many pages, and finding information became a very non-trivial matter. Hyperlinks placed by authors simply could not structure such a huge amount of information. First came manual directories, and then giant search engines with ingenious heuristic ranking algorithms. Websites were created and abandoned, information was duplicated and distorted. The Internet rapidly commercialized and moved further and further away from the ideal of an academic network. The markup language quickly degenerated into a formatting language. Advertising appeared, along with nasty, annoying banners and SEO, the technology of promoting sites and deceiving search engines. The network quickly filled up with informational garbage. Hyperlinks ceased to be a tool of logical linking and turned into a promotion tool. Sites cocooned, closed in on themselves, turned from open "pages" into hermetic "applications", and became mere means of generating income.

Even then I had the nagging thought that something was not right here. A jumble of different sites, from primitive home pages with garish designs to "megaportals" overloaded with flickering banners. Even sites on the same topic were not connected at all: each had its own design, its own structure, annoying banners, a poorly working search, and download problems (yes, I wanted to have information offline). Even then the Internet began to turn into a kind of television, where all kinds of tinsel was nailed onto useful content.
Decentralization had become a nightmare.

What do you want?

Paradoxically, even then, knowing nothing about Web 2.0 or p2p, I as a user did not need decentralization! Recalling my uncomplicated reflections of those times, I come to the conclusion that what I needed was... a single database! One whose queries would return all the results, not just those best suited to a ranking algorithm. One in which all results would be uniformly presented, styled with my own single design rather than the eye-watering homemade designs of countless Vasya Pupkins. One that could be saved offline, with no fear that a site would disappear tomorrow and its information would be lost forever. One into which I could enter my own information, such as comments and tags. One where I could search, sort, and filter with my own personal algorithms.

Web 2.0 and social networks

Meanwhile, the concept of Web 2.0 entered the arena. Formulated in 2005 by Tim O'Reilly as "a technique for designing systems that, by taking network interactions into account, get better the more people use them", it involves actively engaging users in the collective creation and editing of web content. Without exaggeration, social networks became the pinnacle and triumph of this concept: giant platforms with billions of users and hundreds of petabytes of data.

What did we gain from social networks?

  • a unified interface: it turned out that users do not need rich tools for creating varied, eye-catching designs; every user's pages share the same design, and this suits everyone and is even convenient; only the content differs.
  • unified functionality: the whole variety of scripts also proved unnecessary. The feed, friends, albums... over the years social networks' functionality has more or less stabilized and is unlikely to change: after all, functionality is determined by the kinds of human activity, and people hardly change.
  • a single database: working with it turned out to be much more convenient than with many disparate sites, and search became much simpler. Instead of continuously crawling loosely connected pages, caching it all, and ranking it with the most complex heuristic algorithms, there is a relatively simple, uniform query against a single database with a known structure.
  • a feedback interface: likes and reposts. On the regular Web, even Google could not get feedback from a user after a click on a search result. In social networks this feedback turned out to be simple and natural.

What have we lost? We have lost decentralization, and with it freedom. It is now commonly said that our data no longer belongs to us. Where we once could host a home page even on our own computer, we now hand all our data over to the Internet giants.

In addition, as the Internet developed, governments and corporations took an interest in it, and problems of political censorship and copyright restrictions arose. Our pages on social networks can be banned and deleted if their content violates some rule of the network; a careless post can lead to administrative or even criminal liability.

And so we come back to the question: should we return to decentralization? But in a different form, free of the shortcomings of the first attempt?

Peer-to-peer networks

The first p2p networks appeared long before Web 2.0 and developed in parallel with the Web. The classic use of p2p is file sharing; the first networks were built for exchanging music. Those first networks (such as Napster) were essentially centralized, so copyright holders quickly shut them down. Their successors took the path of decentralization. In 2000 the ED2K protocol (with eDonkey as the first client) and Gnutella appeared; in 2001, the FastTrack protocol (the KaZaA client). Gradually the degree of decentralization increased and the technology improved. "Download queue" systems gave way to torrents, and the concept of distributed hash tables (DHT) appeared. As states tightened the screws, participant anonymity became more in demand. Development of the Freenet network has been under way since 2000, I2P since 2003, and the RetroShare project launched in 2006. One can list numerous p2p networks, both those that existed and have already disappeared and those active now: WASTE, MUTE, Turtle F2F, RShare, Perfect Dark, Ares, Gnutella2, GNUnet, IPFS, ZeroNet, Tribler, and many others. There are a lot of them, and they are very different, in both purpose and design... Probably many of you are not even familiar with all these names. And that is not all of them.

However, p2p networks have plenty of disadvantages. Beyond the technical shortcomings of each particular protocol and client implementation, there is one fairly universal drawback: the difficulty of search (everything Web 1.0 ran into, but in an even harder form). There is no Google with its ubiquitous, instant search. And while in file-sharing networks you can still search by file name or meta-information, finding something in overlay networks such as onion or i2p is very difficult, if not impossible.

In general, if we draw an analogy with the classical Internet, most decentralized networks are stuck somewhere around the FTP level. Imagine an Internet with nothing but FTP: no modern sites, no Web 2.0, no YouTube... That is roughly the state of decentralized networks. And despite individual attempts to change something, so far little has changed.

Content

Let's turn to another important piece of this puzzle: content. Content is the main problem of any Internet resource, and especially a decentralized one. Where do you get it? You can, of course, rely on a handful of enthusiasts (as existing p2p networks do), but then the network will develop slowly, and there will be little content.

Working with the regular Internet means searching for and studying content. Sometimes it also means saving it: if content is interesting and useful, many people (especially those who, like me, came to the Web in the dial-up days) prudently save it offline so it does not get lost. The Internet is a thing beyond our control: a site exists today and is gone tomorrow; a video is on YouTube today and deleted tomorrow, and so on.

And with torrents (which we perceive more as a delivery mechanism than as a p2p network), saving is implied from the start. This, by the way, is one of the problems with torrents: it is hard to move a downloaded file to wherever it is more convenient to use (as a rule, you have to manually re-create the distribution), and it is simply impossible to rename it (you can make a hardlink, but very few people know about that).

In general, many people save content one way or another. What happens to it afterwards? Usually the saved files end up somewhere on disk, in a folder like Downloads, in one common heap, where they lie alongside many thousands of other files. This is bad, and bad for the user himself: the Internet has search engines, but the user's local computer has nothing of the kind. It is fine if the user is tidy and accustomed to sorting "incoming" downloads. But not everyone is like that...

Admittedly, there are now quite a few people who save nothing at all and rely entirely on being online. But p2p networks assume that content is stored locally on the user's device and distributed to other participants. Can we find a solution that draws both categories of users into a decentralized network without changing their habits, and even makes their lives easier?

The idea is quite simple: what if we build a tool that saves content from the regular Internet conveniently and transparently for the user, saves it smartly (with semantic meta-information, not into a common heap but into a structure that can be refined further), and at the same time distributes the saved content to a decentralized network?

Let's start with saving

We will not consider utilitarian uses of the Internet such as checking the weather forecast or flight schedules. We are interested in self-sufficient and more or less immutable objects: articles (from tweets and social network posts to long articles like those here on Habr), books, images, programs, audio, and video. Where does most of this information come from? Usually it is:

  • social networks (assorted news, short notes or "tweets", pictures, audio and video)
  • articles on thematic resources (such as Habr); there are not many good ones, and they are usually also built on social network principles
  • news sites

As a rule, these sites have standard functions: "like", "repost", "share on social networks", and so on.

Imagine a browser plugin that saves, in a special way, everything we liked, reposted, or added to "favorites" (or for which we pressed a special plugin button shown in the browser menu, in case the site has no like/repost/bookmark function). The main idea: you simply click "like", as you have done a million times before, and the system saves the article, picture, or video into a special offline storage, where it becomes available for offline viewing through the interface of a decentralized client, and in the decentralized network itself! To me this is very convenient. No extra actions, and we solve many problems at once (a minimal sketch of such a plugin follows this list):

  • valuable content that might be lost or deleted is preserved
  • the decentralized network fills up quickly
  • content from different sources is aggregated (you may be registered on dozens of Internet resources, and all your likes/reposts flow into a single local database)
  • the content that interests you is structured by your rules
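To make this concrete, here is a minimal sketch of such a plugin's content script, in TypeScript. Everything in it is an assumption for illustration: the data-like-button selector, the local service address (http://127.0.0.1:7319) and its /save endpoint are hypothetical, since the idea itself fixes no concrete sites or APIs.

```typescript
// Hedged sketch of a WebExtension content script: intercept "like" clicks
// and hand the surrounding content to a local background service.
// Selector, service URL and endpoint are assumptions, not an existing API.

const SERVICE_URL = "http://127.0.0.1:7319/save"; // hypothetical local daemon

document.addEventListener("click", (event) => {
  if (!(event.target instanceof Element)) return;
  // Site-specific: on this imaginary site, like buttons carry data-like-button.
  const likeButton = event.target.closest("[data-like-button]");
  if (!likeButton) return;

  // Find the post container the button belongs to (site-specific selector).
  const post = likeButton.closest("article");
  if (!post) return;

  // Fire-and-forget: the local service does hashing, dedup and storage.
  void fetch(SERVICE_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      url: location.href,   // where the content was captured
      capturedAt: Date.now(),
      action: "like",       // like | repost | bookmark
      html: post.outerHTML, // raw content; media would be fetched separately
    }),
  }).catch(() => { /* service not running: degrade silently */ });
});
```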

Obviously, the browser plugin must be adapted to the structure of each site (which is quite realistic: plugins for saving content from YouTube, Twitter, VK, and so on already exist). There are not that many sites worth writing dedicated adapters for. As a rule they are the big social networks (hardly more than a dozen of them) and a certain number of high-quality thematic sites like Habr (also few). With open source and an open specification, developing a new adapter from a template should not take long. For all other sites there can be a universal save button that saves the whole page as MHTML, perhaps first stripping it of ads.
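Hypothetically, the per-site adapters could share a small common interface, so that supporting a new site means implementing two functions. The names below (SiteAdapter, CapturedItem) and the Habr selectors are my assumptions, not an existing specification:

```typescript
// A sketch of the per-site adapter interface such plugins could share.

interface CapturedItem {
  title: string;
  html: string;         // cleaned content markup
  mediaUrls: string[];  // images / audio / video to fetch and store
  tags: string[];       // tags already present on the source page
}

interface SiteAdapter {
  /** Does this adapter handle the given page? */
  matches(url: URL): boolean;
  /** Extract the post/article a given like/repost button belongs to. */
  extract(buttonElement: Element): CapturedItem | null;
}

// Example: a minimal, hypothetical adapter for Habr articles
// (the CSS selectors are illustrative guesses about the markup).
const habrAdapter: SiteAdapter = {
  matches: (url) => url.hostname === "habr.com",
  extract: (button) => {
    const post = button.closest("article");
    if (!post) return null;
    return {
      title: post.querySelector("h1, h2")?.textContent?.trim() ?? "untitled",
      html: post.outerHTML,
      mediaUrls: [...post.querySelectorAll("img")].map((img) => img.src),
      tags: [...post.querySelectorAll(".tm-tags__item")].map(
        (t) => t.textContent?.trim() ?? ""
      ),
    };
  },
};
```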

Now about structuring

By "smart" saving I mean, at a minimum, saving with meta-information: the content source (URL), the set of likes already given, tags, comments, their identifiers, and so on. With ordinary saving, all of this information is lost... The source can be understood not only as a direct URL but also as a semantic component: for example, the group in a social network, or the user who made the repost. The plugin can be smart enough to use this information for automatic structuring and tagging. The user, of course, can always add meta-information of their own to saved content, and for that we should provide the most convenient interface tools possible (I have plenty of ideas on how to do this).
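As a sketch of what such "smart" saving might record, here is one possible shape of a saved item's meta-information. The field names, and the use of a content hash as the item's identifier, are assumptions for illustration:

```typescript
// One saved item's meta-information record (a hedged sketch).

interface SavedItemMeta {
  contentHash: string;   // SHA-256 of the stored bytes: the item's ID
  sourceUrl: string;     // direct URL the content was captured from
  semanticSource?: {     // "where" in the social sense
    site: string;        // e.g. "vk.com"
    group?: string;      // community the post appeared in
    repostedBy?: string; // user whose repost surfaced it
  };
  capturedAt: string;    // ISO 8601 timestamp
  action: "like" | "repost" | "bookmark";
  autoTags: string[];    // derived from the source (group name, etc.)
  userTags: string[];    // added by the user later
  userComment?: string;
  relatedHashes: string[]; // user-made links to other saved items
}
```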

This solves the problem of structuring and organizing the user's local files. That alone is a ready-made benefit, usable even without any p2p: an offline database that knows what we saved, where, and in what context, and lets us run small investigations of our own. For example, finding users of an external social network who most often liked the same posts as you. How many social networks explicitly allow that?
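For example, one such "small investigation" might look like this. The in-memory shape is an assumption; a real client would run an equivalent query against its local database:

```typescript
// Find external users who liked the same posts as you, ranked by overlap.

interface LikeRecord {
  postId: string; // external post identifier captured by the plugin
  userId: string; // external user who liked it
}

function topCoLikers(
  myLikedPosts: Set<string>,
  externalLikes: LikeRecord[],
  limit = 10
): Array<{ userId: string; shared: number }> {
  const counts = new Map<string, number>();
  for (const { postId, userId } of externalLikes) {
    if (myLikedPosts.has(postId)) {
      counts.set(userId, (counts.get(userId) ?? 0) + 1);
    }
  }
  return [...counts.entries()]
    .map(([userId, shared]) => ({ userId, shared }))
    .sort((a, b) => b.shared - a.shared)
    .slice(0, limit);
}
```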

It should be said right away that a browser plugin alone is, of course, not enough. The second key component of the system is a decentralized network service running in the background, serving both the p2p network itself (requests from the network and from the client) and the saving of new content via the plugin. Working together with the plugin, the service places the content in the right location, computes hashes (possibly detecting that the same content was saved earlier), and adds the necessary meta-information to the local database.
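A minimal sketch of the service's save path, assuming Node.js and a content-addressed on-disk layout (the store directory and fan-out scheme are assumptions):

```typescript
// Content-address incoming bytes, skip duplicates, report what happened.

import { createHash } from "node:crypto";
import { mkdir, writeFile, access } from "node:fs/promises";
import { join } from "node:path";

const STORE_DIR = "store"; // hypothetical content-addressed storage root

async function saveContent(
  bytes: Buffer
): Promise<{ hash: string; duplicate: boolean }> {
  const hash = createHash("sha256").update(bytes).digest("hex");
  const dir = join(STORE_DIR, hash.slice(0, 2)); // fan out by hash prefix
  const path = join(dir, hash);
  try {
    await access(path);               // already stored earlier:
    return { hash, duplicate: true }; // only merge meta-information
  } catch {
    await mkdir(dir, { recursive: true });
    await writeFile(path, bytes);
    return { hash, duplicate: false };
  }
}
```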

Interestingly, the system would be useful even in this form, without any p2p. Many people already use web clippers that save interesting content from the Web into, say, Evernote. The proposed architecture is an extended version of such a clipper.

And finally, p2p exchange

The best part is that information and meta-information (both captured from the Web and your own) can be exchanged. The concept of a social network transfers perfectly onto a p2p architecture. One might say that social networks and p2p were made for each other. Ideally, any decentralized network should be built as a social network; only then will it work effectively. "Friends" and "groups" are exactly the peers with which stable connections should be maintained, and they come from a natural source: the users' shared interests.

The principles of storing and distributing content in the decentralized network are exactly the same as the principles of saving (capturing) content from the regular Internet: if you used some content from the network (and therefore saved it), then anyone may use the share of your resources (disk and bandwidth) needed to obtain that particular content.

Likes are the simplest tool for saving and sharing. If I liked something, whether on the external Internet or inside the decentralized network, it means I like that content; and if so, I am ready to store it locally and distribute it to other participants of the decentralized network. As a result (see the sketch after this list):

  • the content is not "lost": it is now saved locally, and I can return to it at any time without worrying that someone will delete or block it
  • I can (immediately or later) categorize it, tag it, comment on it, link it to other content, generally do something meaningful with it; let's call this "metaformation"
  • I can share this meta-information with other network members
  • I can synchronize my meta-information with other members' meta-information
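Here is a sketch of how a like could be wired to storage, distribution, and meta-information sync. All names, and the in-memory maps standing in for the content store and the peer announcement mechanism, are assumptions:

```typescript
// What a "like" triggers inside the client (hedged sketch).

type Hash = string;

interface MetaRecord { author: string; tag: string; at: number }

const localStore = new Map<Hash, Uint8Array>();  // pinned content
const localMeta = new Map<Hash, MetaRecord[]>(); // my meta-information
const announced = new Set<Hash>();               // what I seed to peers

function onLike(hash: Hash, bytes: Uint8Array): void {
  localStore.set(hash, bytes); // I liked it, so I store it...
  announced.add(hash);         // ...and distribute it to others
}

function onPeerMeta(hash: Hash, remote: MetaRecord[]): void {
  // Meta-information syncs separately from content: union by (author, tag).
  const mine = localMeta.get(hash) ?? [];
  const seen = new Set(mine.map((m) => `${m.author}/${m.tag}`));
  const merged = mine.concat(
    remote.filter((m) => !seen.has(`${m.author}/${m.tag}`))
  );
  localMeta.set(hash, merged);
}
```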

The rejection of dislikes probably also looks logical: if I do not like some content, it is only natural that I do not want to spend my disk space storing it or my bandwidth distributing it. So dislikes do not fit very organically into decentralization (although sometimes they are useful).

Sometimes you need to keep what you do not like. There is such a word as "must". :)
"Bookmarks" (or "Favorites"): here I do not express an attitude toward the content, but I save it in my local bookmark database. The word "favorites" does not quite fit the meaning (likes and their subsequent categorization exist for that), while "bookmarks" fits well. Content in bookmarks is distributed too: if you "need" it (that is, you "use" it in some way), then logically someone else may "need" it as well. Why not spend your resources on that?

The "friends" function is quite obvious. Friends are peers, people with similar interests, which means people likely to have interesting content. In a decentralized network this means, first of all, subscribing to friends' news feeds and getting access to their catalogs (albums) of saved content.

The "groups" function is similar: collective feeds, or forums, or something of the kind, which you can also subscribe to, meaning that you accept all of the group's materials and distribute them. Perhaps groups, like large forums, should be hierarchical: that would allow better structuring of group content, and would also limit the flow of information so that you do not accept and distribute what does not really interest you. One possible scheme is sketched below.
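One way hierarchical groups could limit what a peer accepts and redistributes is subtree subscription: subscribing to a node means accepting its materials and everything beneath it, but nothing from sibling branches. A sketch, with the tree shape and names as assumptions:

```typescript
// Subtree subscription over a hierarchical group tree (hedged sketch).

interface GroupNode {
  id: string; // e.g. "tech", "tech/p2p", "tech/p2p/dht"
  children: GroupNode[];
}

/** All group ids whose materials this peer accepts and redistributes. */
function acceptedGroups(
  root: GroupNode,
  subscriptions: Set<string>
): Set<string> {
  const accepted = new Set<string>();
  const walk = (node: GroupNode, inSubtree: boolean) => {
    const active = inSubtree || subscriptions.has(node.id);
    if (active) accepted.add(node.id);
    node.children.forEach((c) => walk(c, active));
  };
  walk(root, false);
  return accepted;
}

// Subscribing to "tech/p2p" accepts its posts and those of "tech/p2p/dht",
// but not the unrelated "tech/ai" branch.
```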

All the rest

It must be noted that a decentralized architecture is always more complicated than a centralized one. In centralized resources there is the rigid dictate of the server code; in decentralized ones, many equal participants must negotiate. Naturally, one cannot do without cryptography, blockchains, and other techniques tried out mainly on cryptocurrencies.

I expect that some kind of cryptographic mutual trust ratings, issued by network participants to each other, may be required. The architecture must make it possible to deal effectively with botnets, which, existing as a cloud, can, for example, inflate each other's ratings. I very much want corporations and botnet farms, with all their technological superiority, to be unable to take control of such a decentralized network; its main resource should be living people, capable of producing and structuring content that is interesting and useful to other living people.
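As a sketch of what a cryptographic mutual rating could be, here is a signed trust statement using the tweetnacl library's Ed25519 signatures. The statement format is an assumption, and a real implementation would need canonical serialization before signing:

```typescript
// A signed trust rating one peer issues about another (hedged sketch).

import nacl from "tweetnacl";

interface TrustStatement {
  from: string;  // hex-encoded Ed25519 public key of the rater
  to: string;    // public key being rated
  score: number; // e.g. -1..1
  at: number;    // unix timestamp
}

const enc = (s: string) => new TextEncoder().encode(s);
const hex = (b: Uint8Array) => Buffer.from(b).toString("hex");

// NOTE: JSON.stringify is not canonical; a real protocol would fix
// a deterministic byte encoding of the statement before signing.
function signTrust(stmt: TrustStatement, secretKey: Uint8Array): string {
  return hex(nacl.sign.detached(enc(JSON.stringify(stmt)), secretKey));
}

function verifyTrust(stmt: TrustStatement, sigHex: string): boolean {
  return nacl.sign.detached.verify(
    enc(JSON.stringify(stmt)),
    Uint8Array.from(Buffer.from(sigHex, "hex")),
    Uint8Array.from(Buffer.from(stmt.from, "hex"))
  );
}

// const keys = nacl.sign.keyPair(); // each peer holds such a key pair
```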

I also want such a network to move civilization toward progress. On this score I have a whole mass of ideas that do not fit into the scope of this article. Let me just say that scientific, technical, medical, and similar content should in a certain way take precedence over entertainment, and that will require some moderation. Moderating a decentralized network is in itself a non-trivial task, but a solvable one (although the word "moderation" is quite wrong here and does not reflect the essence of the process at all, neither outwardly nor inwardly... and I have not yet come up with a name for it).

It is probably redundant to mention the need to ensure anonymity, whether by built-in means (as in i2p or RetroShare) or by routing all traffic through Tor or a VPN.

And finally, the software architecture (sketched in the picture accompanying the article). As already mentioned, the first component of the system is the browser plugin that captures content together with its meta-information. The second key component is the p2p service running in the background (the backend): the operation of the network obviously should not depend on whether the browser is running. The third component is the client software, the frontend. It can be a local web service (so the user can work with the decentralized network without leaving their favorite browser) or a separate GUI application for a specific OS (Windows, Linux, macOS, Android, iOS, etc.). I like the idea of having all frontend variants at once; at the same time, that obliges us to a stricter backend architecture.
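The boundary between the backend and the interchangeable frontends could be a small local HTTP API that every frontend speaks, whether it is a page in the browser or a native GUI. A hedged sketch, with the routes and port as assumptions:

```typescript
// The backend's local HTTP facade for frontends (hedged sketch).

import { createServer } from "node:http";

const routes: Record<string, () => unknown> = {
  "/feed": () => [],      // friends' new items (stub)
  "/bookmarks": () => [], // local bookmark database (stub)
  "/peers": () => [],     // connected friends/groups (stub)
};

createServer((req, res) => {
  const handler = routes[req.url ?? ""];
  if (!handler) {
    res.writeHead(404);
    res.end();
    return;
  }
  res.writeHead(200, { "Content-Type": "application/json" });
  res.end(JSON.stringify(handler()));
}).listen(7319, "127.0.0.1"); // same hypothetical port the plugin targets
```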

Many more aspects did not fit into this article:

  • connecting existing file collections to distribution (you already have a couple of terabytes of downloads; you let the client scan them, compute hashes, compare them with what is inside the network, and join the distribution, receiving in return meta-information about your own files: proper names, descriptions, ratings, reviews, and so on)
  • connecting external sources of meta-information (such as the Libgen databases)
  • optionally donating disk space to store other people's encrypted content (as in Freenet)
  • an architecture for integrating with existing decentralized networks (a dark forest in its own right)
  • the idea of media hashing: special perceptual hashes for media content (pictures, audio, video) that make it possible to match media files that are identical in meaning but differ in size, resolution, and so on (see the sketch below)

and much more.
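To illustrate the media hashing idea, here is the simplest perceptual hash, aHash: downscale the image to 8x8 grayscale, threshold each pixel against the mean, and compare the resulting 64-bit hashes by Hamming distance. The sketch assumes the image has already been decoded and resized to 64 grayscale values:

```typescript
// Average hash (aHash), the simplest perceptual hash for images.

function aHash(gray8x8: Uint8Array): bigint {
  if (gray8x8.length !== 64) throw new Error("expected 64 grayscale pixels");
  const mean = gray8x8.reduce((s, v) => s + v, 0) / 64;
  let bits = 0n;
  for (let i = 0; i < 64; i++) {
    bits = (bits << 1n) | (gray8x8[i] >= mean ? 1n : 0n);
  }
  return bits;
}

function hammingDistance(a: bigint, b: bigint): number {
  let x = a ^ b;
  let count = 0;
  while (x) {
    count += Number(x & 1n);
    x >>= 1n;
  }
  return count;
}

// Two images are "the same in meaning" if the distance is small (say, <= 5),
// even when their files differ in size or resolution.
```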

Brief summary of the article

1. In decentralized networks there is no Google with its search and ranking, but there is a community of real people. A social network, with its feedback mechanisms (likes, reposts...) and its social graph (friends, communities...), is an ideal application-layer model for a decentralized network
2. The main idea I put forward in this article: automatically save interesting content from the regular Internet whenever you like or repost it; this is useful even without p2p, simply as a personal archive of interesting information
3. The same content can at the same time automatically populate the decentralized network
4. The principle of automatically saving interesting content also works for likes/reposts inside the decentralized network itself

Source: habr.com
