AERODISK storage system on domestic Elbrus 8C processors

Hello, Habr readers. We would like to share some very good news: real serial production of the new generation of Russian Elbrus 8C processors has finally begun. Officially, serial production was supposed to start back in 2016, but in fact mass production began only in 2019, and about 4000 processors have been released so far.

Almost immediately after mass production started, these processors reached us at Aerodisk. For this we would like to thank NORSI-TRANS, which kindly provided its Yakhont UVM hardware platform, which supports Elbrus 8C processors, so that we could port the software part of the storage system. It is a modern universal platform that meets all MCST requirements. At the moment, the platform is used by special consumers and telecom operators to carry out the actions required of them during operational-search activities.

At the moment, porting has been successfully completed, and now the AERODISK storage system is available in the version with domestic Elbrus processors.

In this article, we will talk about the processors themselves, their history, architecture, and, of course, our implementation of storage systems on Elbrus.

History

The history of Elbrus processors dates back to the Soviet Union. In 1973, the S.A. Lebedev Institute of Precision Mechanics and Computer Engineering (ITMiVT), named after the same Sergei Lebedev who had earlier led the development of the first Soviet computer, MESM, and later BESM, began developing multiprocessor computing systems called Elbrus. Vsevolod Sergeevich Burtsev supervised the work, and Boris Artashesovich Babayan, one of the deputy chief designers, also took an active part.

Vsevolod Sergeevich Burtsev

Boris Artashesovich Babayan

The main customer of the project was, of course, the armed forces of the USSR, and this series of computers was eventually successfully used in the creation of command computing centers and firing systems for missile defense systems, as well as other special-purpose systems.

The first Elbrus computer was completed in 1978. It had a modular architecture and could include from 1 to 10 processors built on medium-scale integration circuits. The machine reached 15 million operations per second. The RAM, shared by all 10 processors, held up to 2 to the 20th power machine words, or 64 MB.

Later it turned out that many of the technologies used in Elbrus were being studied elsewhere in the world at the same time, in particular at International Business Machines (IBM), but those projects, unlike Elbrus, were never completed and did not lead to a finished product.

According to Vsevolod Burtsev, Soviet engineers tried to apply the most advanced experience of both domestic and foreign developers. The architecture of Elbrus computers was also influenced by Burroughs computers, Hewlett-Packard developments, as well as the experience of the BESM-6 developers.

But at the same time, many developments were original. The most interesting thing about Elbrus-1 was its architecture.

The resulting supercomputer became the first computer in the USSR to use a superscalar architecture. Mass use of superscalar processors abroad began only in the 1990s, when affordable Intel Pentium processors appeared on the market.

In addition, special input-output processors could be used to transfer data streams between peripheral devices and RAM. The system could have up to four such processors; they worked in parallel with the central processor and had their own dedicated memory.

Elbrus-2

In 1985, Elbrus received its logical continuation: the Elbrus-2 computer was created and sent into mass production. Architecturally it differed little from its predecessor, but it used a new element base, which raised overall performance roughly 8 times, from 15 million operations per second to 125 million. RAM grew to 16 million 72-bit words, or 144 MB. The maximum bandwidth of the Elbrus-2 I/O channels was 120 MB/s.

Elbrus-2 was actively used in the nuclear research centers in Chelyabinsk-70 and Arzamas-16, in the Mission Control Center, in the A-135 missile defense system, and at other military facilities.

The creation of Elbrus was duly appreciated by the leaders of the Soviet Union. Many engineers were awarded orders and medals. General Designer Vsevolod Burtsev and a number of other specialists received state awards. And Boris Babayan was awarded the Order of the October Revolution.

These awards were more than well deserved. Boris Babayan later said:

“In 1978, we made the first superscalar machine, Elbrus-1. Now in the West they make superscalars of this architecture only. The first superscalar appeared in the West in 92, ours in 78. Moreover, the version of the superscalar that we made is similar to the Pentium Pro that Intel made in 95.”

This claim of historical priority is also confirmed in the USA. Keith Diefendorff, the developer of the Motorola 88110, one of the first Western superscalar processors, wrote:

“In 1978, almost 15 years before the first Western superscalar processors appeared, Elbrus-1 used a processor that issued two instructions per cycle, executed instructions out of order, renamed registers and executed speculatively.”

Elbrus-3

It was 1986, and almost immediately after work on the second Elbrus was finished, ITMiVT began developing a new system, Elbrus-3, with a fundamentally new processor architecture. Boris Babayan called this approach “post-superscalar”. This architecture, later called VLIW/EPIC, is what Intel Itanium processors began to use in the mid-90s (whereas in the USSR this work started in 1986 and ended in 1991).

In this computing complex, the ideas of explicit control of the parallelism of operations with the help of a compiler were first implemented.

In 1991, the first and, unfortunately, only Elbrus-3 computer was built. It was never fully debugged, and after the collapse of the Soviet Union nobody needed it, so the designs and plans remained on paper.

Background to the new architecture

The ITMiVT team that had built the Soviet supercomputers did not break up but continued working as a separate company named MCST (Moscow Center of SPARC Technologies). In the early 90s, MCST began actively cooperating with Sun Microsystems, and the MCST team took part in the development of the UltraSPARC microprocessor.

It was during this period that the E2K architecture project arose, which was originally funded by Sun. Later, the project became completely independent and all intellectual property for it remained with the MCST team.

“If we continued to work with Sun in this area, then everything would belong to Sun. Even though 90% of the work was done before Sun came along.” (Boris Babayan)

E2K architecture

When we discuss the architecture of Elbrus processors, very often we hear the following statements from our colleagues in the IT industry:

"Elbrus is a RISC architecture"
"Elbrus is EPIC architecture"
"Elbrus is SPARC-architecture"

In fact, none of these statements is entirely true, or if it is, it is only partially true.

E2K is a separate, original processor architecture. Its main qualities are energy efficiency and excellent scalability, achieved by specifying the parallelism of operations explicitly. The E2K architecture was developed by the MCST team and is based on a post-superscalar architecture (a la EPIC), with some influence from the SPARC architecture (with its RISC past). At the same time, MCST was directly involved in creating three of the four underlying architectures (superscalar, post-superscalar and SPARC). The world really is small.

To avoid confusion in the future, we have drawn a diagram that, although simplified, clearly shows the roots of the E2K architecture.

[Figure: diagram of the roots of the E2K architecture]

Now a little more about the name of the architecture, which also causes confusion.

In various sources you can find the following names for this architecture: "E2K", "Elbrus", "Elbrus 2000", and ELBRUS ("ExpLicit Basic Resources Utilization Scheduling", i.e. explicit scheduling of the use of basic resources). All of these names refer to the same architecture, but the official technical documentation and technical forums use the name E2K for it. So from here on we use the term "E2K" when we mean the processor architecture and the name "Elbrus" when we mean a specific processor.

Technical features of the E2K architecture

In traditional architectures such as RISC or CISC (x86, PowerPC, SPARC, MIPS, ARM), the processor receives a stream of instructions designed for sequential execution. The processor can detect independent operations and run them in parallel (superscalar execution) and even change their order (out-of-order execution). However, dynamic dependency analysis and out-of-order execution are limited in the number of instructions that can be analyzed and issued per cycle. In addition, the corresponding blocks inside the processor consume a significant amount of energy, and their complexity sometimes leads to stability or security problems.

In the E2K architecture, the main work of analyzing dependencies and optimizing the order of operations is done by the compiler. The processor receives so-called wide instructions, each of which encodes the operations for all of the processor's execution units that must be started in a given clock cycle. The processor does not have to analyze dependencies between operands or reorder operations between wide instructions: the compiler does all this based on source code analysis and processor resource scheduling. As a result, the processor hardware can be simpler and more economical.

The compiler can analyze the source code much more thoroughly than RISC/CISC processor hardware can and can find more independent operations. This is why the E2K architecture has more parallel execution units than traditional architectures.
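The idea of a compiler packing independent operations into wide instructions can be illustrated with a toy scheduler. This is only a minimal sketch in Python, not the MCST compiler's algorithm: the operation names, the `pack_wide_instructions` function and the greedy strategy are all assumptions made for illustration, and real scheduling also accounts for latencies and the types of execution units.

```python
from typing import Dict, List, Set

MAX_SLOTS = 6  # matches the six parallel ALU channels quoted in the article


def pack_wide_instructions(deps: Dict[str, Set[str]],
                           max_slots: int = MAX_SLOTS) -> List[List[str]]:
    """Greedily pack operations into wide instructions.

    deps maps each operation to the set of operations whose results it needs.
    An operation is placed only after all of its dependencies were issued
    in earlier wide instructions.
    """
    done: Set[str] = set()
    remaining = dict(deps)
    bundles: List[List[str]] = []
    while remaining:
        ready = sorted(op for op, needs in remaining.items() if needs <= done)
        if not ready:
            raise ValueError("cyclic or unknown dependency between operations")
        bundle = ready[:max_slots]      # at most max_slots operations per cycle
        bundles.append(bundle)
        done.update(bundle)
        for op in bundle:
            del remaining[op]
    return bundles


# Hypothetical program: store((a + b) * (c - d))
program = {
    "load_a": set(), "load_b": set(), "load_c": set(), "load_d": set(),
    "add": {"load_a", "load_b"},
    "sub": {"load_c", "load_d"},
    "mul": {"add", "sub"},
    "store": {"mul"},
}
bundles = pack_wide_instructions(program)
for cycle, bundle in enumerate(bundles):
    print(cycle, bundle)
```

Here eight operations fit into four wide instructions because the four loads, and then the addition and subtraction, do not depend on each other. The dependency analysis happens once, ahead of execution, rather than in hardware on every cycle.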

Current features of the E2K architecture:

  • 6 arithmetic logic unit (ALU) channels operating in parallel.
  • Register file of 256 84-bit registers.
  • Hardware support for loops, including pipelined loops, which increases the efficiency of processor resource usage.
  • Programmable asynchronous data prefetch with separate read channels, which hides memory access latency and allows fuller use of the ALUs.
  • Support for speculative execution and one-bit predicates, which reduces the number of branches and allows several branches of the program to be executed in parallel.
  • A wide instruction that can specify up to 23 operations per clock cycle when filled to the maximum (more than 33 operations when operands are packed into vector instructions).
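To make the point about one-bit predicates more concrete, here is a minimal Python sketch of replacing a branch with predicated computation. The function names are hypothetical and Python itself does not execute anything in parallel; the sketch only shows the transformation in which both arms are computed and the predicate selects the result.

```python
def abs_diff_branching(a: int, b: int) -> int:
    """Conventional version: a conditional branch chooses one path."""
    if a > b:
        return a - b
    return b - a


def abs_diff_predicated(a: int, b: int) -> int:
    """Branch-free version: both arms are computed, a one-bit predicate selects."""
    p = int(a > b)        # one-bit predicate: 1 or 0
    then_val = a - b      # computed regardless of the predicate
    else_val = b - a      # computed regardless of the predicate
    return p * then_val + (1 - p) * else_val


print(abs_diff_predicated(7, 3), abs_diff_predicated(3, 7))
```

Since the two arms are independent of each other, a compiler for a wide-instruction machine could place them in the same wide instruction instead of waiting for the outcome of a branch.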

x86 emulation

Even at the architecture design stage, the developers understood the importance of supporting software written for the Intel x86 architecture. For this purpose, they implemented a system for dynamic translation (i.e. during program execution, “on the fly”) of x86 binary code into E2K processor code. This system can work both in application mode (in the manner of WINE) and in a hypervisor-like mode (in which an entire guest OS for the x86 architecture can be run).

Thanks to several levels of optimization, the translated code runs at high speed. The quality of the x86 emulation is confirmed by the successful launch of more than 20 operating systems (including several versions of Windows) and hundreds of applications on Elbrus computing systems.
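The general principle of dynamic translation (translate a block of guest code the first time it is executed, then reuse the result) can be sketched in a few lines of Python. This is a toy model under stated assumptions: the two-instruction "guest ISA", the `ToyTranslator` class and the register names are invented for illustration and say nothing about how the real Elbrus binary translator is implemented.

```python
from typing import Callable, Dict, List, Tuple

Instr = Tuple[str, str, int]   # (opcode, register, immediate) in a made-up guest ISA
Regs = Dict[str, int]


def _make_step(op: str, reg: str, imm: int) -> Callable[[Regs], None]:
    """Build the host-side equivalent of one guest instruction."""
    if op == "add":
        def step(regs: Regs) -> None:
            regs[reg] += imm
    elif op == "mul":
        def step(regs: Regs) -> None:
            regs[reg] *= imm
    else:
        raise ValueError(f"unsupported guest opcode: {op}")
    return step


class ToyTranslator:
    """Translates guest blocks on first use and reuses the result afterwards."""

    def __init__(self, guest_code: Dict[int, List[Instr]]) -> None:
        self.guest_code = guest_code
        self.cache: Dict[int, List[Callable[[Regs], None]]] = {}
        self.translations = 0

    def execute(self, addr: int, regs: Regs) -> None:
        if addr not in self.cache:                 # translate only on a cache miss
            self.cache[addr] = [_make_step(*ins) for ins in self.guest_code[addr]]
            self.translations += 1
        for step in self.cache[addr]:              # run the translated block
            step(regs)


guest = {0x1000: [("add", "eax", 2), ("mul", "eax", 3)]}
translator = ToyTranslator(guest)
regs = {"eax": 1}
translator.execute(0x1000, regs)   # first call translates the block
translator.execute(0x1000, regs)   # second call reuses the cached translation
print(regs["eax"], translator.translations)
```

The translation cost is paid once per block, which is why frequently executed code benefits most from this scheme.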

Protected Program Execution Mode

One of the most interesting ideas inherited from the Elbrus-1 and Elbrus-2 architectures is so-called protected program execution. Its essence is to ensure that a program works only with initialized data, to check that every memory access falls within a valid address range, and to provide inter-module protection (for example, protecting the calling program from an error in a library). All these checks are performed in hardware. Protected mode has a full-fledged compiler and a runtime support library. At the same time, it should be understood that the imposed restrictions make it impossible to run, for example, code written in C++.
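The two checks described above (initialized data only, and accesses only within the valid range) can be modelled in software as follows. This is a minimal Python sketch with a hypothetical `ProtectedRegion` class; on Elbrus these checks are performed by the hardware, not by library code like this.

```python
class ProtectedRegion:
    """Toy model of a memory region with bounds and initialization checks."""

    def __init__(self, size: int) -> None:
        self._values = [0] * size
        self._initialized = [False] * size

    def _check_bounds(self, index: int) -> None:
        if not 0 <= index < len(self._values):
            raise IndexError(f"access at {index} is outside the valid range")

    def store(self, index: int, value: int) -> None:
        self._check_bounds(index)
        self._values[index] = value
        self._initialized[index] = True

    def load(self, index: int) -> int:
        self._check_bounds(index)
        if not self._initialized[index]:
            raise ValueError(f"read of uninitialized data at {index}")
        return self._values[index]


region = ProtectedRegion(4)
region.store(0, 42)
print(region.load(0))
for bad_index in (1, 4):   # 1 is uninitialized, 4 is out of range
    try:
        region.load(bad_index)
    except (ValueError, IndexError) as err:
        print(type(err).__name__, err)
```

In this model an erroneous access fails immediately at the point of the error instead of silently returning garbage, which is the behavior the protected mode aims for.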

Even in the ordinary, "unprotected" mode, Elbrus processors have features that increase system reliability. For example, the stack of linkage information (the chain of return addresses for procedure calls) is separate from the user data stack and is out of reach of attacks used by viruses, such as return address spoofing.

Designed over many years, the architecture not only catches up with competing architectures and is set to outperform them in performance and scalability, but also protects against the bugs that plague x86/amd64: flaws such as Meltdown (CVE-2017-5754), Spectre (CVE-2017-5753, CVE-2017-5715), RIDL (CVE-2018-12126, CVE-2018-12130), Fallout (CVE-2018-12127), ZombieLoad (CVE-2019-11091) and the like.

Current protection against the vulnerabilities found in the x86/amd64 architecture relies on patches at the operating system level. That is why the performance drop on current and previous generations of these processors is so noticeable, ranging from 30% to 80%. As active users of x86 processors, we know about this, suffer, and keep on “eating the cactus”. Having a solution that removes these problems at the root is an undoubted benefit for us (and therefore for our customers), especially when the solution is Russian.
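Why a separate stack for return addresses helps can be shown with a deliberately simplified Python model. Both functions and the "attack" payload are hypothetical; the sketch only contrasts a layout where the return address sits next to a buffer with one where it lives in a separate structure.

```python
from typing import List


def return_with_unified_stack(payload: List[int], buf_size: int, return_addr: int) -> int:
    """Buffer and return address share one stack, so an overlong copy reaches the address."""
    stack = [0] * buf_size + [return_addr]
    for i, value in enumerate(payload[:len(stack)]):   # unchecked copy past the buffer
        stack[i] = value
    return stack[buf_size]                             # address the function returns to


def return_with_split_stacks(payload: List[int], buf_size: int, return_addr: int) -> int:
    """Return addresses are kept in a separate stack that data writes cannot reach."""
    data_stack = [0] * (buf_size + 1)    # buffer plus one neighbouring local variable
    chain_stack = [return_addr]          # separate stack of return addresses
    for i, value in enumerate(payload[:len(data_stack)]):
        data_stack[i] = value            # the overflow can still damage user data
    return chain_stack[-1]               # but the return address is untouched


attack = [0x41, 0x41, 0x41, 0x41, 0xBAD]   # four filler values, then a forged address
print(hex(return_with_unified_stack(attack, 4, 0x1000)))
print(hex(return_with_split_stacks(attack, 4, 0x1000)))
```

In the first layout the forged value replaces the return address; in the second the same overflow corrupts only neighbouring data.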

Technical specifications

Below are the official technical characteristics of the Elbrus processors of the past (4C), current (8C), new (8SV) and future (16C) generations in comparison with similar Intel x86 processors.

[Table: official technical characteristics of Elbrus processors compared with similar Intel x86 processors]

Even a cursory glance at this table shows (and this is very pleasing) that the technological lag of domestic processors, which seemed insurmountable 10 years ago, already looks quite small, and in 2021, with the launch of Elbrus-16C (which, among other things, will support virtualization), it will be reduced to a minimum.

AERODISK storage system on Elbrus 8C processors

Let us move from theory to practice. As part of the strategic alliance of MCST, Aerodisk, Basalt SPO (formerly Alt Linux) and NORSI-TRANS, a data storage system was developed and put into operation. At the moment it may not be the best in terms of security, functionality, cost and performance, but in our opinion it is an undeniably worthy solution that can ensure the proper level of technological independence for our Motherland.
Now the details...

Hardware

The hardware part of the storage system is built on the Yakhont UVM universal platform from NORSI-TRANS. The Yakhont UVM platform has the status of telecommunications equipment of Russian origin and is included in the unified register of Russian radio-electronic products. The system consists of two separate storage controllers (2U each), which are connected to each other by a 1G or 10G Ethernet interconnect and to shared disk shelves by SAS.

Of course, this is not as elegant as the “Cluster in a box” format that we usually use (where controllers and disks with a common backplane are installed in one 2U chassis), but that format will also become available in the near future. The main thing here is that it works well; we will think about the frills later.

Under the hood, each controller has a single-processor motherboard with four RAM slots (DDR3 for the 8C processor). Each controller also has four 1G Ethernet ports on board (two of which are used by the AERODISK ENGINE software as service ports) and three PCIe slots for back-end (SAS) and front-end (Ethernet or Fibre Channel) adapters.

As boot disks we use Russian SATA SSDs from GS Nanotech, which we have repeatedly tested and used in projects.

When we first got acquainted with the platform, we examined it carefully. We had no complaints about the quality of assembly and soldering: everything was done neatly and reliably.

Operating system

The OS is the Alt 8 SP release intended for certification. In the near future, we plan to create a pluggable and constantly updated repository for Alt OS with the Aerodisk storage software.

This version of the distribution is built on the current stable Linux 4.9 kernel for E2K (a long-term support branch ported by MCST specialists), supplemented with functionality and security patches. All packages in Alt OS are built natively on Elbrus using the original transactional build system of the ALT Linux Team project, which reduced the labor cost of the port itself and let the team pay more attention to product quality.

Any release of Alt OS for Elbrus can be significantly extended in functionality from the repository available for it (from about 6 thousand source packages for the eighth version to about 12 thousand for the ninth).

This choice was also made because Basalt SPO, the developer of Alt OS, actively works with other software and hardware developers on various platforms, ensuring seamless interaction within hardware and software systems.

Storage system software

When porting, we immediately abandoned the idea of using the x86 emulation supported in E2K and began to work with the processors natively (fortunately, Alt already has the necessary tools for this).

Among other things, native execution provides better security (the same three hardware stacks instead of one) and higher performance (there is no need to give up one or two of the eight cores to the binary translator, and the compiler does its job better than a JIT).

In fact, the E2K implementation of AERODISK ENGINE supports most of the storage functionality available on x86. The storage system software is the current version of AERODISK ENGINE (A-CORE version 2.30).

The following functions were brought up on E2K without any problems and tested for use in the product:

  • Fault tolerance for up to two controllers and multipath I/O (MPIO)
  • Block and file access with thin volumes (RDG, DDP pools; FC, iSCSI, NFS, SMB protocols including Active Directory integration)
  • Various RAID levels up to triple parity (including the ability to use the RAID constructor)
  • Hybrid storage (combining SSD and HDD within the same pool, i.e. cache and tiering)
  • Space saving options with deduplication and compression
  • ROW snapshots, clones and various replication options
  • Other small but useful features such as QoS, global hot spare, VLAN, BOND, etc.

In fact, on E2K we managed to get all of our functionality except multi-controller configurations (more than two controllers) and the multi-threaded I/O scheduler, which increases the performance of all-flash pools by 20-30%.

We will, of course, add these useful functions as well; it is a matter of time.

A little about performance

After successfully passing the tests of the basic functionality of the storage system, we, of course, began to perform load tests.

For example, on a dual-controller storage system (2 x Elbrus 8C CPUs at 1.3 GHz, 32 GB RAM, plus 4 SAS SSDs of 800 GB, 3 DWPD) with the RAM cache disabled, we created two DDP pools with the main RAID-10 level and two 500G LUNs, and connected these LUNs to a Linux host over iSCSI (10G Ethernet). We then ran one of the basic one-hour tests, a sequential load with small blocks, using the FIO program.
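For readers who want to run a comparable load, a one-hour sequential small-block fio job could be assembled roughly as follows. This is only a sketch: the article does not give the exact job parameters, so the device path, the 4k block size, the read direction, the queue depth and the `build_fio_command` helper are all assumptions. The script only builds and prints the command; it does not start fio.

```python
from typing import List


def build_fio_command(device: str,
                      rw: str = "read",         # assumption: sequential read (non-destructive)
                      block_size: str = "4k",   # assumption: "small blocks" taken as 4k
                      runtime_s: int = 3600) -> List[str]:
    """Assemble an fio command line for a time-based sequential test."""
    return [
        "fio",
        "--name=seq-small-blocks",
        f"--filename={device}",
        f"--rw={rw}",
        f"--bs={block_size}",
        "--ioengine=libaio",
        "--direct=1",              # bypass the host page cache
        "--iodepth=32",            # assumption: queue depth is not stated in the article
        "--time_based",
        f"--runtime={runtime_s}",  # one hour, in seconds
        "--group_reporting",
    ]


# Hypothetical multipath device name for one of the iSCSI LUNs
cmd = build_fio_command("/dev/mapper/mpatha")
print(" ".join(cmd))
```

The printed command can be reviewed and then run manually on the Linux host; switching `rw` to `write` would overwrite data on the LUN, so it should only be used on test volumes.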

The first results were quite positive.

[Figure: FIO test results]

The load on the processors averaged 60%, i.e. this is a base level at which the storage system can work safely.

Yes, this is far from highload and is clearly not enough for high-performance DBMSs, but, as our practice shows, these characteristics are sufficient for 80% of the general tasks for which storage systems are used.

A little later, we plan to return with a detailed report on the load tests of Elbrus as a storage platform.

Bright future

As we wrote above, mass production of Elbrus 8C actually started only recently, at the beginning of 2019, and by December about 4000 processors had been released. For comparison, only 5000 processors of the previous generation, Elbrus 4C, were produced over the entire period of their production, so there is progress.

It is clear that this is a drop in the ocean, even for the Russian market, but the road is mastered by those who walk it.
The release of several tens of thousands of Elbrus 8C processors is planned for 2020, and this is already a serious figure. In addition, during 2020 the MCST team should bring the Elbrus-8SV processor to mass production.

Such production plans are a bid for a very significant share of the entire domestic server processor market.

As a result, here and now we have a good, modern Russian processor with a clear and, in our opinion, correct development strategy, and on its basis the most secure and certified Russian-made data storage system (and, in the future, a virtualization system on Elbrus-16C). The system is as Russian as is physically possible under current conditions.

We often see in the news yet another epic failure by companies that proudly call themselves Russian manufacturers but in fact only re-stick labels, adding no value of their own to a foreign manufacturer's products except their markup. Such companies, unfortunately, cast a shadow on all real Russian developers and manufacturers.

With this article, we want to show clearly that our country has had, has and will have companies that really and efficiently build modern complex IT systems and are actively developing, and that import substitution in IT is not a sham but a reality in which we all live. You may dislike this reality, you may criticize it, or you can work and make it better.

The collapse of the USSR once prevented the team of Elbrus creators from becoming a prominent player in the world of processors and forced the team to seek funding for its work abroad. The funding was found, the work was done, and the intellectual property was preserved, for which I would like to say a huge thank you to these people!

That's all for now. Please write your comments, questions and, of course, criticism. We are always glad to receive them.

Also, on behalf of the entire Aerodisk company, I want to congratulate the entire Russian IT community on the upcoming New Year and Christmas and wish you 100% uptime, and may nobody need their backups in the new year))).

Materials used

An article with a general description of technologies, architectures and personalities:
https://www.ixbt.com/cpu/e2k-spec.html

A brief history of computers under the name "Elbrus":
https://topwar.ru/34409-istoriya-kompyuterov-elbrus.html

General article about e2k architecture:
https://ru.wikipedia.org/wiki/%D0%AD%D0%BB%D1%8C%D0%B1%D1%80%D1%83%D1%81_2000

Article about the 4th generation (Elbrus-8C) and the 5th generation (Elbrus-8SV, 2020):
https://ru.wikipedia.org/wiki/%D0%AD%D0%BB%D1%8C%D0%B1%D1%80%D1%83%D1%81-8%D0%A1

Specifications of the next, 6th generation of processors (Elbrus-16C, 2021):
https://ru.wikipedia.org/wiki/%D0%AD%D0%BB%D1%8C%D0%B1%D1%80%D1%83%D1%81-16%D0%A1

The official description of the architecture of Elbrus:
http://www.elbrus.ru/elbrus_arch

The plans of the developers of the hardware and software platform "Elbrus" to create a supercomputer with exascale performance:
http://www.mcst.ru/files/5a9eb2/a10cd8/501810/000003/kim_a._k._perekatov_v._i._feldman_v._m._na_puti_k_rossiyskoy_ekzasisteme_plany_razrabotchikov.pdf

Russian Elbrus technologies for personal computers, servers and supercomputers:
http://www.mcst.ru/files/5472ef/770cd8/50ea05/000001/rossiyskietehnologiielbrus-it-edu9-201410l.pdf

An old article by Boris Babayan, but still relevant:
http://www.mcst.ru/e2k_arch.shtml

Old article by Mikhail Kuzminsky:
https://www.osp.ru/os/1999/05-06/179819

MCST presentation, general information:
https://yadi.sk/i/HDj7d31jTDlDgA

Information about Alt OS for the Elbrus platform:
https://altlinux.org/эльбрус

https://sdelanounas.ru/blog/shigorin/

Source: habr.com
