Applied technologies on the ruins of the blockchain fever, or the practical benefits of resource distribution

In recent years, news feeds have been flooded with reports of a new kind of distributed computing network appearing literally out of nowhere and solving (or rather, trying to solve) a wide variety of problems: making a city smart, saving the world from copyright infringers (or vice versa), transferring information or resources covertly, escaping state control in one area or another. Whatever the field, these networks share a number of common features, because the fuel for their growth was the set of algorithms and techniques that reached the public during the recent boom in cryptocurrencies and related technologies. At that time, probably every third article on specialized resources had the word "blockchain" in the title; discussion of new software solutions and economic models became the dominant trend for a while, and other applications of distributed computing systems faded into the background.

At the same time, visionaries and professionals saw the essence of the phenomenon: massive distributed computing, built on networks of numerous disparate and heterogeneous participants, had reached a new level of development. Set the hype topics aside and look at the subject from the other side: all these networks, assembled into huge pools of thousands of isolated, heterogeneous participants, did not appear on their own. Crypto enthusiasts managed to solve hard problems of data synchronization and of distributing resources and tasks in a new way, which made it possible to pull together such a mass of equipment and create a new ecosystem aimed at solving one narrowly focused problem.

Naturally, this did not go unnoticed by the teams and communities involved in free distributed computing, and new projects were not long in coming.
However, despite the significant growth in available information about building such networks and working with the hardware, the creators of promising systems still have serious problems to solve.

The first of these, strange as it may sound, is the problem of choosing a direction.

The direction may be the right one, or it may lead to a dead end - there is no escaping that; centralized deliveries of clairvoyants to the IT community are still running late. But the choice must be made carefully, so as not to fall into the traditional trap of the team taking on too broad an area and trying to build yet another general-purpose distributed computing project from the start. It may seem that the scope of work is not that scary and that, for the most part, existing developments just need to be applied: combine nodes into a network, adapt the algorithms for determining topologies, exchanging data and checking its consistency, introduce methods for ranking nodes and reaching consensus and, of course, "just" create your own query language along with an entire language and computing environment. The idea of a universal mechanism is very tempting and keeps surfacing in one field or another, but the end result is still one of three things: the solution either turns out to be a limited prototype with a pile of unfinished "ToDos" hanging in the backlog, or becomes an unusable monster ready to drag anyone who touches it into the fetid "Turing swamp", or simply dies quietly because the swan, the crayfish and the pike pulling the project in incompatible directions finally overstrain themselves.

Let's not repeat those mistakes and instead pick a direction with a clear range of tasks that fits the distributed computing model well. One can understand the people who try to do everything at once - there is, after all, plenty to choose from, and much of it looks extremely interesting both from the R&D and engineering point of view and from the economic one. Using a distributed network you can:

  • Train neural networks
  • Process signal streams
  • Calculate protein structure
  • Render 3D scenes
  • Simulate hydrodynamics
  • Test trading strategies for stock exchanges

To avoid getting carried away compiling a list of interesting, well-parallelizable things, let us choose distributed rendering as our topic.

Distributed rendering itself is, of course, nothing new. Existing render toolkits have long supported distributing the load across different machines; without this, life in the twenty-first century would be rather sad. Still, one should not assume the topic has been covered inside and out with nothing left to do: we will look at a specific pressing problem, building a tool for creating a render network.

Our render network combines nodes that need rendering work done with nodes that have free computing resources to do it. Resource owners connect their stations to the network in order to receive and execute render jobs using one of the render engines the network supports. Task providers, in turn, work with the network as they would with a cloud: the network itself distributes resources, monitors correctness of execution, manages risks and handles other problems.
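
To make these roles a bit more concrete, here is a minimal sketch of how the core entities of such a network could be modelled. All names here (ResourceNode, RenderJob, the list of engines) are illustrative assumptions for this article, not part of any existing framework.

```python
# A minimal, illustrative data model for the entities described above.
# All class and field names are assumptions made for this sketch.
from dataclasses import dataclass
from enum import Enum


class RenderEngine(Enum):
    CYCLES = "cycles"      # hypothetical set of supported engines
    ARNOLD = "arnold"
    VRAY = "vray"


@dataclass
class ResourceNode:
    node_id: str
    gpus: int
    vram_gb: float
    engines: list[RenderEngine]   # engines this station can run
    rating: float = 0.0           # assigned by the operator, see below


@dataclass
class RenderJob:
    job_id: str
    owner_id: str                 # the task provider
    engine: RenderEngine
    scene_uri: str                # scene stored in the network repository
    requested_nodes: int          # resources requested by the initiator
```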

Thus, we will consider creating a framework that should support integration with a set of popular render engines and contain components that provide tools for organizing a network of heterogeneous nodes and managing the flow of tasks.

The economic model behind such a network is not of fundamental importance here, so as a starting point we will take a scheme similar to settlements in cryptocurrency networks: consumers of the resource send tokens to the suppliers performing the rendering work. It is far more interesting to understand what properties the framework should have, so let us walk through the main interaction scenario between network participants.

There are three parties interacting in the network: the resource provider, the task provider and the network operator (also referred to below as the control center, the network, and so on).

The network operator gives the resource provider a client application, or an operating system image with the software stack already deployed, to install on the machine whose resources they want to contribute, along with a personal account accessible through the web interface. The account lets them set the resource's access parameters and remotely manage their server landscape: monitor hardware parameters, perform remote configuration, reboot machines.

When a new node is connected, the network management system analyzes the hardware and the specified access parameters, ranks it by assigning a rating, and places it in the resource register. Later on, to manage risk, the node's activity is analyzed and its rating adjusted to keep the network stable - after all, no one will be pleased if their scene is sent to render on powerful cards that regularly freeze from overheating.
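
As a rough illustration of the ranking idea, the rating could combine raw hardware capability with observed reliability. The formula and the weights below are assumptions of this sketch, not a prescribed scoring model.

```python
# Illustrative node rating: raw capability discounted by observed reliability.
# The formula and weights are assumptions of this sketch.
def hardware_score(gpus: int, vram_gb: float, tflops: float) -> float:
    """Rough measure of how much rendering work a node can take on."""
    return gpus * tflops + 0.1 * vram_gb


def node_rating(gpus: int, vram_gb: float, tflops: float,
                completed: int, failed: int, uptime_ratio: float) -> float:
    """Capability weighted by how often the node actually finishes its work."""
    total = completed + failed
    success_ratio = completed / total if total else 0.5  # neutral prior for new nodes
    reliability = 0.7 * success_ratio + 0.3 * uptime_ratio
    return hardware_score(gpus, vram_gb, tflops) * reliability


# A powerful but unstable card loses much of its raw score:
print(node_rating(gpus=4, vram_gb=24, tflops=35, completed=10, failed=15, uptime_ratio=0.6))
print(node_rating(gpus=1, vram_gb=8, tflops=10, completed=40, failed=1, uptime_ratio=0.99))
```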

A user who needs a scene rendered can go one of two ways: upload the scene to the network repository via the web interface, or use a plugin that connects their modeling package or installed renderer to the network. At this point a smart contract is initiated between the user and the network, whose standard completion condition is that the network produces the result of the scene calculation. The user can monitor the progress of the task and manage its parameters through the web interface of their personal account.
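
On the operator side, registering such a task and its accompanying contract might look roughly like this. The storage, identifiers and contract fields are invented for illustration only.

```python
# Illustrative registration of a render task and the user<->network contract.
# Storage, identifiers and field names are assumptions of this sketch.
import time
import uuid
from dataclasses import dataclass


@dataclass
class RenderContract:
    contract_id: str
    job_id: str
    customer_id: str
    price_tokens: int
    deadline_ts: float
    status: str = "open"   # open -> fulfilled (result delivered) / failed


def submit_job(customer_id: str, scene_uri: str, price_tokens: int,
               hours_to_deadline: float) -> RenderContract:
    """Register a job coming either from the web UI or from a renderer plugin."""
    job_id = str(uuid.uuid4())
    contract = RenderContract(
        contract_id=str(uuid.uuid4()),
        job_id=job_id,
        customer_id=customer_id,
        price_tokens=price_tokens,
        deadline_ts=time.time() + hours_to_deadline * 3600,
    )
    # ...persist the scene reference and the contract, notify the scheduler...
    return contract
```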

The task is sent to the server, where the size of the scene and the amount of resources requested by the task initiator are analyzed, after which the total volume of work is decomposed into parts suited to the number and type of resources the network allocates. The general idea is that visualization can be broken down into many small tasks, and engines take advantage of this by distributing those tasks among multiple resource providers. The simplest approach is to render small parts of the scene, called segments. When a segment is ready, the local task is considered complete and the resource moves on to the next outstanding one.
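
The simplest decomposition of this kind, cutting the frame into rectangular tiles, can be sketched as follows. The tile size and the Segment structure are assumptions of this illustration.

```python
# Splitting a frame into rectangular segments (tiles), the simplest way of
# breaking a render into many small, independent tasks.
from dataclasses import dataclass


@dataclass
class Segment:
    job_id: str
    x: int          # top-left corner of the tile inside the frame
    y: int
    width: int
    height: int


def decompose(job_id: str, frame_w: int, frame_h: int, tile: int = 256) -> list[Segment]:
    """Cover the whole frame with tiles of at most `tile` x `tile` pixels."""
    segments = []
    for y in range(0, frame_h, tile):
        for x in range(0, frame_w, tile):
            segments.append(Segment(job_id, x, y,
                                    min(tile, frame_w - x),
                                    min(tile, frame_h - y)))
    return segments


# A 1920x1080 frame with 256-pixel tiles gives 8 x 5 = 40 segments:
print(len(decompose("job-1", 1920, 1080)))
```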

Thus, for the renderer it makes essentially no difference whether the calculations are performed on a single machine or on a grid of many individual computing stations: distributed rendering simply adds more cores to the pool of resources used for a task. Through the network, a node receives all the data needed to render a segment, computes it, sends the segment back and moves on to the next task. Before entering the common network pool, each segment is given a set of metainformation that lets executing nodes select the computing tasks most suitable for them.
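
The metainformation attached to a segment and the way a node picks suitable work from the pool could look roughly like this; the fields and the matching rule are assumptions of the sketch.

```python
# Illustrative segment metadata and a simple matching rule a node could use
# to pick suitable work from the common pool. Field names are assumptions.
from dataclasses import dataclass


@dataclass
class SegmentMeta:
    segment_id: str
    engine: str            # render engine required, e.g. "cycles"
    min_vram_gb: float     # rough memory requirement of the scene chunk
    est_gpu_hours: float   # estimated cost of the segment
    reward_tokens: int


def pick_segment(pool: list[SegmentMeta], node_engines: set[str],
                 node_vram_gb: float) -> "SegmentMeta | None":
    """Pick the best-paying segment this node is actually able to render."""
    feasible = [s for s in pool
                if s.engine in node_engines and s.min_vram_gb <= node_vram_gb]
    if not feasible:
        return None
    return max(feasible, key=lambda s: s.reward_tokens / s.est_gpu_hours)
```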

The problems of segmentation and distribution of calculations must be solved not only with execution time in mind, but also from the standpoint of efficient resource use and energy saving, since the economic efficiency of the network depends on it. If the scheme is poorly designed, it would make more sense to install a miner on the node, or simply turn it off so that it stops making noise and wasting electricity.
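
For instance, a node owner (or the scheduler acting on their behalf) could apply a back-of-the-envelope profitability check before taking work. The formula and every number below are purely illustrative assumptions.

```python
# A rough profitability check for a node: is rendering this segment better
# than idling (or mining)? All numbers here are illustrative assumptions.
def worth_rendering(reward_tokens: float, token_price_usd: float,
                    est_gpu_hours: float, power_kw: float,
                    electricity_usd_per_kwh: float,
                    alternative_usd_per_hour: float = 0.0) -> bool:
    revenue = reward_tokens * token_price_usd
    energy_cost = est_gpu_hours * power_kw * electricity_usd_per_kwh
    opportunity_cost = est_gpu_hours * alternative_usd_per_hour
    return revenue > energy_cost + opportunity_cost


# 120 tokens at $0.05 for ~2 GPU-hours on a 0.4 kW card at $0.10/kWh:
print(worth_rendering(120, 0.05, 2.0, 0.4, 0.10))   # True: $6.00 vs $0.08
```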

However, let's get back to the process. When a task is received, a smart contract is also formed between the pool and the node, which is fulfilled once the task result has been correctly computed. Based on the outcome of the contract, the node can receive a reward in one form or another.
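
Settlement of that per-segment contract could proceed along these lines: the reward is credited only after the returned result passes verification. The verification step and the bookkeeping are assumptions of this sketch.

```python
# Illustrative settlement of the pool<->node contract for one segment.
from dataclasses import dataclass


@dataclass
class SegmentContract:
    segment_id: str
    node_id: str
    reward_tokens: int
    status: str = "assigned"   # assigned -> completed / rejected


def settle(contract: SegmentContract, result_ok: bool,
           balances: dict[str, int]) -> SegmentContract:
    """Credit the node if its result was accepted, otherwise mark for re-dispatch."""
    if result_ok:
        contract.status = "completed"
        balances[contract.node_id] = balances.get(contract.node_id, 0) + contract.reward_tokens
    else:
        contract.status = "rejected"   # the segment goes back to the queue
    return contract
```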

The control center oversees task execution: it collects the results of calculations, sends incorrect ones back for re-processing, ranks the queue, and monitors the standard deadline for completing a task (so that, for example, the last segment does not end up never being picked up by any node).
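
A simplified version of such a supervision loop is sketched below; the queue structure, the verification callback and the deadline policy are all assumptions of this sketch.

```python
# A simplified supervision loop for the control center: verify incoming results,
# send bad ones back to the queue, and re-dispatch segments whose assigned node
# has been silent past the deadline. All structures here are assumptions.
import time


def supervise(pending: dict, assigned: dict, results: list,
              verify, deadline_s: float) -> None:
    now = time.time()

    # Collect results; re-queue the segments that failed verification.
    for segment_id, payload in results:
        info = assigned.pop(segment_id, None)
        if info is None:
            continue
        if not verify(segment_id, payload):
            pending[segment_id] = info["spec"]   # back to the queue
        # accepted results are handed over to compositing elsewhere
    results.clear()

    # Re-dispatch segments that have been held too long by a node.
    for segment_id, info in list(assigned.items()):
        if now - info["assigned_at"] > deadline_s:
            pending[segment_id] = info["spec"]
            del assigned[segment_id]
```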

The calculation results then go through the compositing stage, after which the user receives the rendered output and the network can collect its reward.
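
For the tile-based decomposition sketched above, compositing essentially means pasting each returned segment back at its offset in the final frame. A minimal sketch, using NumPy purely for convenience rather than as a requirement of the framework:

```python
# Minimal compositing for tile-based segments: paste every rendered tile back
# at its (x, y) offset in the final frame.
import numpy as np


def composite(frame_w: int, frame_h: int, tiles) -> np.ndarray:
    """tiles: iterable of (x, y, pixels) where pixels is an (h, w, 3) array."""
    frame = np.zeros((frame_h, frame_w, 3), dtype=np.float32)
    for x, y, pixels in tiles:
        h, w, _ = pixels.shape
        frame[y:y + h, x:x + w] = pixels
    return frame
```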

Thus, the functional composition of a framework for building distributed rendering landscapes takes shape:

  1. Personal user accounts with web access
  2. Software kit for installation on nodes
  3. Control system software:
    • Access control subsystem
    • Rendering task decomposition subsystem
    • Task distribution subsystem
    • Compositing subsystem
    • Server landscape and network topology management subsystem
    • Logging and audit subsystem
    • Learning expert subsystem
    • Rest API or other interface for external developers

What do you think? What questions does the topic raise for you, and which answers would interest you?

Source: habr.com
