Museria - decentralized music repository

I once set out to write an application that would pick music for me to listen to at home, on the street, while training and so on, and would do all of this as a continuous stream with minimal involvement from me. I came up with an architecture, sketched a prototype, and eventually ran into one “little problem”.

It was not clear where to get the song files themselves. By that time VKontakte had already closed its API, and the major music portals were locked down as well: songs were even served in pieces so that they could not be scraped. What remained were short-lived sites with a ton of advertising and all kinds of junk, dubious grabber programs and other “dirty” options. All in all, not a single really good solution. You can, of course, buy a subscription to Yandex Music or something similar. But even then there is no open public API anywhere, so you have no programmatic access to the music. A few large companies have, in effect, restricted everyone else's access to music. Why did this happen at all? Digging deeper, it became clear that the main problem was copyright. The current subscription model suits many commercial music creators and those same companies. At the same time, non-commercial and semi-commercial music ends up lumped in with everything else. You either pay for everything or listen to nothing at all.

And I began to think about what to do with all this. How can the free distribution of music be organized? What would I do if I made music myself and wanted to earn money from it? Would I like it if my songs were pirated? What is the alternative anyway?

As a result, there are two main problems that need to be addressed:

  • Organizing the free distribution of music in ways convenient for most people, including programmatic access.
  • Offering music creators alternative ways to make money.

Global decentralized music repository

Initially, I tried to find an existing solution and build everything on top of it. After some searching, IPFS was the first thing I liked. I started to implement my idea, but after a while I found several critical problems with it:

  • IPFS is storage for anything and everything: images, music, video, whatever. In general, a big planetary "garbage dump". As a result, as soon as you launch your node you get a huge load. The machine just writhes in pain.
  • A half-finished garbage collection mechanism. I don’t know how it is now, but at that time, setting a ten-gigabyte storage limit in the config meant nothing. The repository kept growing, ignoring many configuration options. As a result, you needed a huge reserve of disk space until IPFS figured out how to discard what was no longer needed.
  • At the time I used the library (I don’t know how it is now), the client did not implement timeouts. You send a request for a file and, if it doesn't exist, you just hang. Of course, people came up with all sorts of workarounds that partly solved the problem, but these were hacks. Such things should work out of the box.

There were also many minor problems, and the overall impression at the time was unambiguous: this could not be used for the project. I kept looking for storage and explored different options, but did not find anything suitable.

As a result, I decided it was worth trying to write a decentralized storage myself. It may not aspire to be interplanetary, but it will solve a specific problem.

And that is how spreadable, storacle, metastocle, museria and museria-global came about.

spreadable is the main, lowest layer, which allows you to combine nodes into a network. It contains an algorithm that I have so far implemented only partially, designed for networks of roughly 10,000 servers. The full version of the algorithm is much harder to implement and would take several more months (maybe more).

I won’t describe spreadable in detail in this article; I’d rather write a separate one at some point. Here I will only highlight a few features:

  • Works via http/https.
  • You can create a separate network for a specific task, which puts significantly less load on each project than if they all shared one network.
  • Timeouts and other such details were thought out from the start. They work for all methods in both the client and the host, and you can flexibly manage the parameters from your application.
  • The library is written in Node.js. The stack's performance limitations are offset by the network's decentralized nature: the load can be spread out by increasing the number of nodes. In return, there are many advantages: a huge community, simplicity and convenience, an isomorphic client, no external dependencies, etc.

storacle is a layer that inherits from spreadable and allows you to store files on the network. Each file has its own content hash, by which it can be retrieved later. Files are not divided into blocks but stored whole.
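
To make the idea of content addressing concrete, here is a minimal sketch in Python (chosen for brevity; it is not storacle's Node.js API). The class WholeFileStore and the function content_hash are hypothetical names, and SHA-256 is an assumption, since the article does not say which hash function is used.

```python
import hashlib
from typing import Dict, Optional


def content_hash(data: bytes) -> str:
    """Return a hex digest that identifies a file by its content.

    SHA-256 is an assumption made for this sketch.
    """
    return hashlib.sha256(data).hexdigest()


class WholeFileStore:
    """Toy in-memory store: each file is kept whole under its content hash."""

    def __init__(self) -> None:
        self._files: Dict[str, bytes] = {}

    def add(self, data: bytes) -> str:
        key = content_hash(data)
        self._files[key] = data  # identical content always maps to the same key
        return key

    def get(self, key: str) -> Optional[bytes]:
        return self._files.get(key)
```

Because the key is derived from the content, adding the same file twice yields the same hash, and the hash alone is enough to ask for the file later.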

metastocle is a layer inherited from spreadable that allows you to store data on the network rather than files. The interface is similar to that of NoSQL databases. You can, for example, add a file to storacle, get its hash and write it to metastocle with a reference to something.

Museria inherits from storacle and metastocle. This layer is directly responsible for storing music. The storage works only with mp3 files and id3 tags.

The "key" to a song is its full name in the form Artist (TPE1) - Title (TIT2). For example:

  • Brimstone
  • Hi-rez - Lost My Way (feat. Emilio Rojas, Dani Devinci)
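
A rough illustration of building such a key from the two tag values, written in Python. The helper song_key is hypothetical and only collapses whitespace; the real rules live in utils.beautifySongTitle() and are not reproduced here.

```python
import re


def song_key(tags: dict) -> str:
    """Build an "Artist - Title" key from the ID3 frames TPE1 and TIT2.

    Hypothetical helper: it only trims and collapses whitespace.
    """
    artist = re.sub(r"\s+", " ", tags.get("TPE1", "")).strip()
    title = re.sub(r"\s+", " ", tags.get("TIT2", "")).strip()
    if not artist or not title:
        raise ValueError("both TPE1 (artist) and TIT2 (title) are required")
    return f"{artist} - {title}"


print(song_key({"TPE1": "Hi-rez", "TIT2": "Lost My Way (feat. Emilio Rojas, Dani Devinci)"}))
```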

You can see in full detail how song titles are formed here; look at the function utils.beautifySongTitle().

Whether two keys match is decided by a threshold defined in the node settings. For example, a value of 0.85 means that if the key (song name) comparison function finds a similarity of more than 85%, the two are treated as the same song.

The similarity algorithm is in the same place, in the function utils.getSongSimilarity().
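
The threshold logic can be sketched as follows in Python. Note the swap: difflib.SequenceMatcher from the standard library stands in for utils.getSongSimilarity(), whose actual algorithm is different and is not reproduced here. The names similarity and is_same_song are made up for the example.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio in the range [0, 1].

    A stand-in measure, not museria's own algorithm.
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def is_same_song(key_a: str, key_b: str, threshold: float = 0.85) -> bool:
    # The article says "more than 85%", so the comparison is strict.
    return similarity(key_a, key_b) > threshold
```

With this stand-in, keys that differ only in letter case count as the same song, while keys for unrelated songs fall well below the threshold.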

The song's cover is also attached via tags (APIC) so that it can be retrieved later. The utilities (utils) have all the necessary methods for reading and processing tags.

An example of working with the storage through the client can be found in the readme.

All of the above layers are self-contained and can be used separately as lower layers for other projects. For example, there is already an idea to make a layer for storing books.

museria-global is a pre-configured git repository for running your own node on the global music network. Clone it, run npm i && npm start, and that is basically it. You can configure it in more detail, run it in Docker, etc. Detailed information is available on GitHub.

When the repository is updated, you need to update your node as well. If the major or minor version number changes, the update is mandatory; otherwise the network will ignore the outdated node.
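
The version rule can be sketched in Python like this. It assumes versions follow the usual major.minor.patch form and models "ignored by the network" as a plain equality check on the first two numbers; the function names are hypothetical.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Split a "major.minor.patch" string into three integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def is_compatible(node_version: str, network_version: str) -> bool:
    """A node is accepted only if its major and minor numbers match.

    Assumption: patch-level differences are tolerated.
    """
    return parse_version(node_version)[:2] == parse_version(network_version)[:2]
```

Under this reading, a node on 0.2.5 can still talk to a network on 0.2.9, but not to one on 0.3.0.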

You can work with songs manually or programmatically. Each node runs a server that handles various tasks. Among other things, when you visit the default endpoint you get an interface for working with music. For example, you can go to the root node (the link may become outdated later; entry nodes can also be obtained in Telegram, or see the updates on GitHub).

This is how you can search for songs and upload them to the storage. Uploading can take place in two modes: normal and moderated. The second mode means that the work is done by a person, not a program, so if you check this box when adding a song you will need to solve a captcha. Songs can be added with priority -1, 0 or 1. Priority 1 can only be set in moderated mode. Priorities help the repository decide more effectively what to do when you try to replace an existing song with a new one. The higher the priority, the more likely you are to overwrite an existing file. This helps fight spam and improves the quality of uploaded songs.
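
The priority rules can be illustrated with a small Python sketch. The validation follows the text directly. The replacement rule in may_replace is a deliberate simplification and an assumption: the article only says that a higher priority makes overwriting more likely, not that it decides the outcome on its own.

```python
ALLOWED_PRIORITIES = (-1, 0, 1)


def validate_priority(priority: int, moderated: bool) -> None:
    """Reject priorities the text does not allow."""
    if priority not in ALLOWED_PRIORITIES:
        raise ValueError("priority must be -1, 0 or 1")
    if priority == 1 and not moderated:
        raise ValueError("priority 1 requires moderated mode")


def may_replace(existing_priority: int, new_priority: int) -> bool:
    """Assumed rule: a new upload may overwrite a stored song only if
    its priority is not lower than the stored one."""
    return new_priority >= existing_priority
```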

If you start adding songs to the repository, try to attach an image (cover) as well, although this field is optional. In 99% of cases, the first pictures Google returns for a song title are album covers.

How files are added technically, in a nutshell:

  • The client receives the address of a free node, which becomes the coordinator for a while.
  • The add-song function is triggered (by a person or by code), and a request to add the song is sent to the coordinator's endpoint.
  • The coordinator calculates how many duplicates to keep (configurable).
  • The most suitable nodes for storing the file are selected.
  • The file goes directly to these nodes.
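
The steps above can be sketched as a toy in-memory simulation in Python. Everything here is illustrative: the Node class, the function names, and in particular the assumption that "most suitable" means "most free space". The real selection criteria are not described in the article.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    address: str
    free_space: int  # bytes available on this node
    files: Dict[str, bytes] = field(default_factory=dict)


def choose_targets(nodes: List[Node], file_size: int, duplicates: int) -> List[Node]:
    """Pick up to `duplicates` nodes that can hold the file, most free space first."""
    candidates = [n for n in nodes if n.free_space >= file_size]
    candidates.sort(key=lambda n: n.free_space, reverse=True)
    return candidates[:duplicates]


def add_song(nodes: List[Node], key: str, data: bytes, duplicates: int = 2) -> List[str]:
    """Coordinator role: select the target nodes, then store the file on each."""
    targets = choose_targets(nodes, len(data), duplicates)
    for node in targets:
        node.files[key] = data
        node.free_space -= len(data)
    return [node.address for node in targets]
```

The returned list of addresses shows where the duplicates ended up; an empty list means no node had room for the file.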

How files are retrieved technically:

  • The client receives the address of a free node, which becomes the coordinator for a while.
  • The get-song function is triggered (by a person or by code), and a request for the song is sent to the coordinator's endpoint.
  • The coordinator checks whether a link is in the cache. If there is one and it works, it is returned to the client immediately; otherwise the nodes are polled to find out which of them has the file.
  • The file is retrieved from the link, if one is found.
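
A minimal Python sketch of the coordinator's lookup logic described above. The function find_link and its callback parameters ask_node and link_works are hypothetical; they stand in for the network requests a real coordinator would make.

```python
from typing import Callable, Dict, Iterable, Optional


def find_link(
    key: str,
    cache: Dict[str, str],
    node_addresses: Iterable[str],
    ask_node: Callable[[str, str], Optional[str]],
    link_works: Callable[[str], bool],
) -> Optional[str]:
    """Return a working link for `key`, or None if no node has the song.

    ask_node(address, key) returns a link if that node stores the song.
    link_works(link) reports whether a cached link is still alive.
    """
    cached = cache.get(key)
    if cached is not None and link_works(cached):
        return cached  # fast path: the cached link is still valid
    cache.pop(key, None)  # drop a missing or dead entry
    for address in node_addresses:
        link = ask_node(address, key)
        if link is not None:
            cache[key] = link  # remember it for the next request
            return link
    return None
```

Polling stops at the first node that reports having the file, and the result is cached so that a repeated request does not touch the network again.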

Alternatives for Music Makers

I have always been interested in the question of how one can objectively evaluate the cost of creative work. Why, for example, does a person offer their music album for $10? Or $20, or $100? Where is the algorithm? When we are talking about a physical product, or even many types of services, we can at least calculate the production cost and build on that.

Okay, let's say the price is set at $10. Is that very effective? Let's say I heard the album, or one song from it, somewhere and decided to thank the author. But given how I feel about it and what I can afford, $3 is my ceiling. What do I do then? I'll probably just do nothing, like most people.

By setting a fixed price for creative work, you simply limit yourself: you prevent a large number of people from sending you smaller amounts, which in total can add up to more than what comes from those who buy at your price. It seems to me that creative work is exactly the area where donations should rule. For this you need:

  • Teach people how to say thank you. The creators themselves must clearly show that they would like to receive donations, add links to different payment methods everywhere, etc.
  • More mechanisms are needed to simplify and strengthen these processes. For example, create a global site where you can donate for creative work via per-author links.

    Let's say the link looks like this:

    http://someartistsdonationsite.site/category/artist?external-info

    If you narrow it down to musicians, then:

    http://someartistsdonationsite.com/music/miyagi?song=blabla

    The performer needs to confirm that the nickname is theirs and link themselves to it.

    We add a function for generating such links to the museria client, and every project using the repository can place donation buttons with these links next to the songs in its website or application. Users get a very quick and simple way to donate. Naturally, this approach can be used in any project and any creative category, not only through the storage.
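
Generating such a link could look roughly like this in Python. The host is the placeholder from the example above, not a real service, and donation_link is a hypothetical function, not part of the museria client.

```python
from urllib.parse import quote, urlencode

# Placeholder host taken from the example link; not a real service.
DONATION_SITE = "http://someartistsdonationsite.com"


def donation_link(category: str, artist: str, **external_info: str) -> str:
    """Build /category/artist?external-info with proper URL escaping."""
    path = f"/{quote(category, safe='')}/{quote(artist, safe='')}"
    query = urlencode(external_info)
    return f"{DONATION_SITE}{path}" + (f"?{query}" if query else "")


print(donation_link("music", "miyagi", song="blabla"))
# http://someartistsdonationsite.com/music/miyagi?song=blabla
```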

Why do you need music storage, and how you can participate in it

  • If you are working on a music-related project, or planning to create one, this is exactly what it was all designed for. You can use museria to store and retrieve songs, increasing the flow of songs in the network. If you are also able to set up and maintain at least one node of your own, that will be the best contribution to the development of the network.
  • Perhaps you are ready to take on some other role: help with the code, fill and moderate the database, tell your friends about the project, etc.
  • Maybe you like the idea and are ready to help financially so that it all lives and develops. The more nodes, the more songs.
  • Or you just need to find and download a song at some point. You can do that very simply, for example through the Telegram bot.

The project is now at a very early stage. A test network is running; nodes may reboot frequently, require updates, etc. If no critical problems appear during the evaluation period, this same network will become the main one.

You can view information about the node from the outside: the number of songs, free space, etc., using a link like http://node-address/status or http://node-address/status?pretty
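
A small Python sketch of querying this endpoint. The URL shape comes from the text above; that the response is a JSON object is an assumption, and no particular field names are assumed. Both function names are hypothetical.

```python
import json
from urllib.request import urlopen


def status_url(node_address: str, pretty: bool = False) -> str:
    """Build the status URL for a node, optionally with ?pretty."""
    url = f"http://{node_address}/status"
    return url + "?pretty" if pretty else url


def fetch_status(node_address: str, timeout: float = 5.0) -> dict:
    """Fetch the node status; assumes the endpoint answers with a JSON object."""
    with urlopen(status_url(node_address), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```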

Source: habr.com