TON: Telegram Open Network. Part 2: Blockchains, sharding

This text continues a series of articles in which I examine the structure of the (presumably) distributed Telegram Open Network (TON), which is being prepared for release this year. In the previous part I described its most basic level - the way the nodes interact with each other.

Just in case, let me remind you that I have nothing to do with the development of this network, and all the material is drawn from an open (albeit unverified) source: a document (there is also an accompanying brochure summarizing the main points) that appeared at the end of last year. The amount of information in this document, in my opinion, indicates its authenticity, although there is no official confirmation of this.

Today we will look at the main component of TON - the blockchain.

Basic Concepts

Account. A set of data identified by a 256-bit number, account_id (most often the public key of the account owner). In the base case (see the zero workchain below), this data is the user's balance. Anyone can "occupy" a specific account_id, but its value can be changed only according to certain rules.

Smart contract. In fact, a special case of an account, supplemented with smart contract code and storage for its variables. With a “wallet”, money can be credited and withdrawn according to relatively simple, predetermined rules; with a smart contract, these rules are written as its code (in some Turing-complete programming language).

Blockchain state. The set of states of all accounts and smart contracts (in the abstract sense, a hash table in which the keys are account identifiers and the values are the data stored in the accounts).

Message. Above, I used the expression “credit and withdraw money” - this is a particular example of a message (“transfer N grams from account_1 to account_2”). Obviously, only a node that holds the private key of account_1, and can therefore confirm the message with a signature, can send it. Delivering such a message to a regular account increases its balance; delivering it to a smart contract executes its code (which handles the incoming message). Of course, other messages are also possible (carrying not monetary amounts but arbitrary data between smart contracts).

Transaction. The fact of delivering a message is called a transaction. Transactions change the state of the blockchain, and the blocks in the blockchain consist of transactions (message delivery records). In this regard, you can think of the state of the blockchain as an incremental database: all blocks are “diffs” that must be applied sequentially to obtain the current state of the database. The specifics of packing these "diffs" (and restoring the full state from them) will be discussed in the next article.
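
To make the "blocks as diffs" idea concrete, here is a minimal sketch in Python. It is only an illustration under simplifying assumptions: the state is reduced to balances, and the names Transfer, apply_block and replay are hypothetical, not taken from the TON document.

```python
from typing import Dict, List, NamedTuple


class Transfer(NamedTuple):
    """Hypothetical record of one delivered message (a transaction)."""
    src: str
    dst: str
    amount: int


# Simplified blockchain state: account_id -> balance.
State = Dict[str, int]


def apply_block(state: State, block: List[Transfer]) -> State:
    """Return a new state with every transaction of the block applied."""
    new_state = dict(state)
    for tx in block:
        if new_state.get(tx.src, 0) < tx.amount:
            raise ValueError(f"insufficient balance on {tx.src}")
        new_state[tx.src] = new_state.get(tx.src, 0) - tx.amount
        new_state[tx.dst] = new_state.get(tx.dst, 0) + tx.amount
    return new_state


def replay(genesis: State, blocks: List[List[Transfer]]) -> State:
    """Apply the blocks ("diffs") in order to obtain the current state."""
    state = genesis
    for block in blocks:
        state = apply_block(state, block)
    return state


genesis = {"alice": 10, "bob": 0}
blocks = [[Transfer("alice", "bob", 4)], [Transfer("bob", "alice", 1)]]
final_state = replay(genesis, blocks)
print(final_state)
```

The current state is never stored in the blocks themselves in this sketch; it is always derived by replaying them from the initial state.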

Blockchain in TON: what is it and why?

As mentioned in the previous article, a blockchain is a data structure whose elements (blocks) are arranged in a "chain", with each block containing a hash of the previous one. The question was asked in the comments: why do we need such a data structure at all when we already have a DHT, a distributed hash table? Obviously, some data can be stored in a DHT, but this is only suitable for information that is not too “sensitive”. Cryptocurrency balances cannot be stored in a DHT, primarily because it provides no integrity checks. In fact, all the complexity of the blockchain structure exists to prevent tampering with the data stored in it.
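
The way a chain of hashes protects against tampering can be shown with a small Python sketch. The choice of SHA-256 and the Block layout are my assumptions for illustration only; the source does not specify them here.

```python
import hashlib
from typing import List, NamedTuple

GENESIS_PREV = "0" * 64  # assumed placeholder "previous hash" for the first block


class Block(NamedTuple):
    prev_hash: str  # hash of the previous block in the chain
    payload: str    # stand-in for the block's transactions


def block_hash(block: Block) -> str:
    """Hash a block together with its link to the previous block."""
    data = (block.prev_hash + "|" + block.payload).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def build_chain(payloads: List[str]) -> List[Block]:
    chain: List[Block] = []
    prev = GENESIS_PREV
    for payload in payloads:
        block = Block(prev, payload)
        chain.append(block)
        prev = block_hash(block)
    return chain


def is_consistent(chain: List[Block]) -> bool:
    """Check that every block really points at the hash of its predecessor."""
    prev = GENESIS_PREV
    for block in chain:
        if block.prev_hash != prev:
            return False
        prev = block_hash(block)
    return True


chain = build_chain(["a pays b 5", "b pays c 2", "c pays a 1"])
tampered = list(chain)
tampered[0] = Block(chain[0].prev_hash, "a pays b 500")  # rewrite an old block

print(is_consistent(chain), is_consistent(tampered))
```

Changing an old block changes its hash, so the link stored in the next block no longer matches. To hide the change, an attacker would have to rewrite every later block as well.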

However, the blockchain in TON looks even more complicated than in most other distributed systems, and there are two reasons for this. The first is the desire to minimize the need for forks. In traditional cryptocurrencies, all parameters are set at the initial stage, and any attempt to change them effectively creates an “alternative cryptocurrency universe”. The second reason is support for sharding the blockchain. A blockchain is a structure that cannot get smaller over time, and usually each node responsible for the health of the network is forced to store it in full. In traditional (centralized) systems, sharding is used to solve such problems: some of the records in the database are located on one server, some on another, and so on. In cryptocurrencies, such functionality is still quite rare, in particular because it is difficult to add sharding to a system where it was not originally planned.

How does TON plan to solve both of the above problems?

Blockchain content. Workchains.

First of all, let's talk about what is planned to be stored in the blockchain: the states of accounts (“wallets” in the base case) and smart contracts (for simplicity, we will treat them the same as accounts). In effect, this is an ordinary hash table in which the keys are account_id identifiers and the values are data structures containing things like:

  • balance;
  • smart contract code (only for smart contracts);
  • smart contract data storage (only for smart contracts);
  • statistics;
  • (optional) public key for transfers from the account (account_id by default);
  • queue of outgoing messages (here they are entered for forwarding to the recipient);
  • a list of the latest messages delivered to this account.

As mentioned above, the blocks themselves consist of transactions - messages delivered to various account_id accounts. However, in addition to the account_id, messages also contain a 32-bit field, workchain_id: the identifier of a so-called workchain (working blockchain). This makes it possible to have several independent blockchains with different configurations. Here workchain_id = 0 is a special case, the zero workchain: it is the balances in it that correspond to the TON cryptocurrency (Grams). Most likely, at first, other workchains will not exist at all.
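
The account record and the (workchain_id, account_id) addressing described above can be sketched as follows. The field names and types are my guesses for illustration; the source lists only the kinds of data stored, and it says only that workchain_id is a 32-bit field (treated as unsigned here for simplicity).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Account:
    """Hypothetical layout of one hash-table value; field names are assumed."""
    balance: int = 0
    code: Optional[bytes] = None        # smart contracts only
    data: Optional[bytes] = None        # smart contracts only
    statistics: Dict[str, int] = field(default_factory=dict)
    public_key: Optional[int] = None    # None means "use account_id"
    outbound_queue: List[bytes] = field(default_factory=list)
    recent_inbound: List[bytes] = field(default_factory=list)


# Full address of an account: (workchain_id, account_id).
Address = Tuple[int, int]


def make_address(workchain_id: int, account_id: int) -> Address:
    """Validate the field widths mentioned in the text and build an address."""
    if not 0 <= workchain_id < 2**32:
        raise ValueError("workchain_id must fit in 32 bits")
    if not 0 <= account_id < 2**256:
        raise ValueError("account_id must fit in 256 bits")
    return (workchain_id, account_id)


state: Dict[Address, Account] = {}
wallet = make_address(0, 0x1234)  # an account in the zero workchain
state[wallet] = Account(balance=100)
print(state[wallet].balance)
```

Keying the table by the pair rather than by account_id alone is what lets the same account_id exist independently in different workchains.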

Shardchains. Infinite Sharding Paradigm.

But the growth in the number of blockchains does not stop there. Let's deal with sharding. Imagine that each account (account_id) has its own blockchain - it contains all the messages it receives - and the states of all such blockchains are stored on separate nodes.

Of course, this is very wasteful: most likely, transactions will arrive in each of these shardchains (shard blockchains) very rarely, and a great many powerful nodes will be needed (looking ahead, I note that we are talking not about mere clients on mobile phones but about serious servers).

Therefore, shardchains group accounts by the binary prefixes of their identifiers: if a shardchain has the prefix 0110, then the transactions of every account_id that starts with these bits fall into it. This shard_prefix can be from 0 to 60 bits long and, most importantly, can change dynamically.
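
Routing an account to its shardchain by prefix can be sketched like this. Prefixes are represented as strings of "0" and "1" purely for readability; this representation and the function names are my assumptions.

```python
from typing import Iterable

ACCOUNT_BITS = 256  # account_id is a 256-bit number


def account_bits(account_id: int) -> str:
    """Render account_id as a 256-character string of bits, highest bit first."""
    return format(account_id, f"0{ACCOUNT_BITS}b")


def belongs_to(account_id: int, shard_prefix: str) -> bool:
    """True if the account's identifier starts with the shard's prefix."""
    return account_bits(account_id).startswith(shard_prefix)


def find_shard(account_id: int, prefixes: Iterable[str]) -> str:
    """Find the single shard responsible for the account."""
    matches = [p for p in prefixes if belongs_to(account_id, p)]
    if len(matches) != 1:
        raise ValueError("prefixes must cover every account exactly once")
    return matches[0]


# A set of prefixes that together cover the whole identifier space.
shards = ["00", "010", "0110", "0111", "1"]
account = 0b0110 << 252  # an identifier whose first four bits are 0110
print(find_shard(account, shards))
```

An empty prefix (length 0) matches every account, which corresponds to a workchain that has not been split at all.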

As soon as too many transactions begin to flow into one of the shardchains, the nodes working on it “split” it, according to predetermined rules, into two child shardchains whose prefixes are one bit longer (for one of them this bit is 0, for the other 1). For example, shard_prefix = 0110b splits into 01100b and 01101b. Conversely, if two “neighboring” shardchains stay lightly loaded for some time, they merge again.
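
The prefix arithmetic of splitting and merging is simple enough to sketch directly. The rules that decide when to split or merge are not described in this article, so only the prefix manipulation is shown; the function names are hypothetical.

```python
from typing import Tuple

MAX_PREFIX_BITS = 60  # upper bound on shard_prefix length given in the text


def check_prefix(prefix: str) -> None:
    if set(prefix) - {"0", "1"} or len(prefix) > MAX_PREFIX_BITS:
        raise ValueError(f"invalid shard prefix: {prefix!r}")


def split_shard(prefix: str) -> Tuple[str, str]:
    """Split a shard into two children whose prefixes are one bit longer."""
    check_prefix(prefix)
    if len(prefix) == MAX_PREFIX_BITS:
        raise ValueError("prefix is already at the maximum length")
    return prefix + "0", prefix + "1"


def merge_shards(left: str, right: str) -> str:
    """Merge two sibling shards back into their common parent."""
    check_prefix(left)
    check_prefix(right)
    same_parent = len(left) == len(right) and left[:-1] == right[:-1]
    if not left or not same_parent or left[-1] == right[-1]:
        raise ValueError("only two sibling shards can be merged")
    return left[:-1]


children = split_shard("0110")
print(children, merge_shards(*children))
```

Only siblings, that is, prefixes that differ in their last bit alone, can be merged, which is what "neighboring" means here.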

Thus, sharding is done "from the bottom up": we assume that each account has its own shard, but for the time being they are "glued together" by prefixes. This is what the Infinite Sharding Paradigm means.

Separately, I would like to emphasize that workchains exist only virtually: in fact, workchain_id is part of the identifier of a particular shardchain. Formally, each shardchain is defined by a pair (workchain_id, shard_prefix).

Error correction. Vertical blockchains.

It is traditionally believed that any transaction in a blockchain is “set in stone”. However, in TON it is possible to “rewrite history” if someone (a so-called "fisherman" node) proves that one of the blocks was signed incorrectly. In this case, a special corrective block is added to the corresponding shardchain, containing the hash of the block being corrected (rather than of the last block in the shardchain). Picturing the shardchain as a chain of blocks laid out horizontally, we can say that the corrective block is attached to the erroneous block not on the right but on top; it is therefore considered part of a small “vertical blockchain”. Thus, shardchains can be called two-dimensional blockchains.

[Illustration: a shardchain drawn horizontally, with corrective blocks attached above the erroneous blocks]

If blocks following the erroneous one referenced the changes it made (i.e., new transactions were built on invalid ones), corrective blocks are added “on top of” those blocks as well. If later blocks did not touch the "affected" information, these "corrective waves" do not reach them. For example, in the illustration above, the transaction in the first block that increased the balance of account C was found to be incorrect; therefore, the transaction in the third block that reduced the balance of this account must also be canceled, and a corrective block is placed over that block as well.

It should be noted that although the corrective blocks are depicted as located “above” the original ones, in fact they will be added to the end of the corresponding blockchain (where they should be located chronologically). The two-dimensional arrangement only shows to which point in the blockchain they will be "hooked" (through the hash of the original block located in them).
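
The "corrective wave" can be sketched as a small propagation over the chain. The rule used here (a later block is affected if it spends from an account that received invalid funds) is my simplified reading of the description, not the actual TON algorithm, and block ids stand in for block hashes.

```python
from typing import List, NamedTuple, Set


class Tx(NamedTuple):
    src: str
    dst: str


class Block(NamedTuple):
    block_id: str   # stands in for the block hash
    txs: List[Tx]


class Correction(NamedTuple):
    corrects: str   # id (hash) of the original block it is "hooked" to


def corrective_wave(chain: List[Block], bad_index: int, bad_tx: Tx) -> List[Correction]:
    """List the corrective blocks needed after one transaction is invalidated."""
    tainted: Set[str] = {bad_tx.dst}  # accounts whose state rests on invalid data
    corrections = [Correction(chain[bad_index].block_id)]
    for block in chain[bad_index + 1:]:
        hit = [tx for tx in block.txs if tx.src in tainted]
        if hit:
            corrections.append(Correction(block.block_id))
            tainted.update(tx.dst for tx in hit)
    return corrections


chain = [
    Block("b1", [Tx("A", "C")]),  # invalid: increases the balance of C
    Block("b2", [Tx("A", "B")]),  # does not touch C
    Block("b3", [Tx("C", "D")]),  # spends from C, so it is affected
]
fixes = corrective_wave(chain, 0, Tx("A", "C"))

# Chronologically, corrective blocks go at the end of the chain.
log = [b.block_id for b in chain] + [f"fix({c.corrects})" for c in fixes]
print(log)
```

The second block is skipped because it never touches the affected account, which mirrors the example with account C above.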

One can philosophize separately about how good the decision to "change the past" is. It would seem that if we allow for an incorrect block appearing in a shardchain, we cannot rule out an erroneous corrective block either. Here, as far as I can tell, the difference lies in the number of nodes that must reach consensus: new blocks in each shardchain are handled by a relatively small "working group" of nodes (whose composition changes quite often), whereas introducing a corrective block requires the consent of all validator nodes. I'll cover validators, working groups, and other node roles in more detail in a future article.

One blockchain to rule them all

The sections above mention a lot of information about the different types of blockchains, and this information itself must also be stored somewhere. In particular, we are talking about the following:

  • about the number and configurations of workchains;
  • about the number of shardchains and their prefixes;
  • about which nodes are currently responsible for which shardchains;
  • hashes of the last blocks added to all shardchains.

As you might have guessed, all of this is recorded in yet another blockchain, the masterchain (master blockchain). Because its blocks contain the hashes of the blocks of all shardchains, it makes the system tightly coupled. Among other things, this means that a new masterchain block is generated immediately after the blocks in the shardchains: blocks in the shardchains are expected to appear almost simultaneously, approximately every 5 seconds, and the next masterchain block about a second after that.
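
How a masterchain block ties all shardchains together can be illustrated with a short sketch. The serialization (JSON), the hash function (SHA-256) and the names are assumptions made for the example; the real block format is not described in this article.

```python
import hashlib
import json
from typing import Dict, Tuple

# A shardchain is identified by the pair (workchain_id, shard_prefix).
ShardId = Tuple[int, str]


def master_block_hash(prev_master_hash: str, shard_heads: Dict[ShardId, str]) -> str:
    """Hash a masterchain block that records the latest block hash of every shard."""
    entries = sorted((wc, prefix, head) for (wc, prefix), head in shard_heads.items())
    payload = json.dumps([prev_master_hash, entries]).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


heads = {(0, "0"): "aa11", (0, "1"): "bb22"}   # made-up shard block hashes
master_hash = master_block_hash("0" * 64, heads)

# If any shardchain's latest block changes, the masterchain block changes too.
changed = dict(heads)
changed[(0, "1")] = "cc33"
print(master_hash != master_block_hash("0" * 64, changed))
```

Since each masterchain block also links to the previous one, a single masterchain hash commits to the latest recorded block of every shardchain.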

But who will be responsible for all this titanic work: sending messages, executing smart contracts, forming blocks in the shardchains and the masterchain, and even checking blocks for errors? Will the phones of millions of users with the Telegram client installed really do all this on the sly? Or perhaps the Durov team will abandon the ideas of decentralization, and their servers will do it the old-fashioned way?

In fact, neither answer is correct. But the margins of this article are rapidly running out, so we will talk about the various roles of nodes (you may have already noticed some of them mentioned), as well as the mechanics of their work, in the next part.

Source: habr.com
