The inevitability of FPGA penetration into data centers

You don't have to be a chip designer to program an FPGA, just as you don't have to be a C++ programmer to write Java code. In both cases, though, the extra expertise probably doesn't hurt.

The goal of commercializing both technologies, Java and FPGAs, is to refute that claim. The good news for FPGAs is that, with the right abstraction layers and toolsets, creating algorithms and data flows for FPGAs instead of CPUs, DSPs, GPUs, or any form of custom ASIC has become steadily easier over the 35 years since the programmable logic device was invented.

The timing is remarkable: just as the CPU, for a variety of reasons, could no longer remain the data center's only compute engine for many workloads, FPGAs reached the point of efficiency where they could offer speed, low latency, networking, and memory - the heterogeneous computing capabilities of modern FPGA SoCs, which are nearly complete computing systems in their own right. FPGAs are also successfully combined with other devices in hybrid systems and, in our view, are only beginning to find their rightful place in the computing hierarchy.

That is why we organized The Next FPGA Platform conference in San Jose on January 22nd. Xilinx is, naturally, one of the world's leading FPGA suppliers and a pioneer in this area. Ivo Bolsens, Senior Vice President and CTO of Xilinx, gave a keynote at the conference and shares his thoughts here on how Xilinx is helping to build scalable data center computing.

It has taken system architects and programmers a long time to arrive at the heterogeneous data center, in which different kinds of compute handle computing, storage, and networking tasks. The shift seems necessary now that Moore's Law is becoming ever harder to sustain with conventional CMOS. For now, our language is still CPU-centric: we still talk about "application acceleration," meaning making programs run better than they can on the CPU alone. Over time, data centers will turn into collections of compute, storage, and the protocols that bind everything together, and we will return to terms like "computing" and "applications." Hybrid computing will become as commonplace as today's "cloud" services running on physical or virtual machines, and at some point - an era FPGAs will likely do much to bring about - we will simply call all of it data processing again.

Implementing FPGAs in data centers will require a change in mindset. "Thinking about ways to speed up today's applications means getting to the bottom of how they run, what resources they use, and what takes the most time," Bolsens explains. "You need to study the overall problem you are trying to solve. Many of the applications running in data centers today scale out, consuming large amounts of resources. Take machine learning, for example, which uses a huge number of computing nodes. But when we talk about acceleration, we need to think not only about speeding up the computation but also about speeding up the infrastructure."

For example, in the machine learning workloads that Bolsens has studied in practice, roughly 50% of the time is spent shuttling data back and forth between distributed compute resources, and only the remaining half is spent on the computation itself.
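To make that split concrete, here is a minimal, hypothetical C++ sketch of the measurement idea itself: timing how long a toy workload spends copying data versus computing on it. The buffer size and the "compute" loop are arbitrary placeholders, not the workloads Bolsens measured.

```cpp
#include <chrono>
#include <cstring>
#include <iostream>
#include <vector>

// Illustrative only: compare wall-clock time spent moving data ("transfer")
// with time spent operating on it ("compute") for a toy in-memory workload.
int main() {
    const size_t n = 1 << 24;                         // ~16M floats (~64 MB), arbitrary
    std::vector<float> host(n, 1.0f), device(n);

    auto t0 = std::chrono::steady_clock::now();
    std::memcpy(device.data(), host.data(), n * sizeof(float));  // stand-in for data movement
    auto t1 = std::chrono::steady_clock::now();

    float acc = 0.0f;
    for (size_t i = 0; i < n; ++i) acc += device[i] * 0.5f;      // stand-in for the computation
    auto t2 = std::chrono::steady_clock::now();

    auto ms = [](auto a, auto b) {
        return std::chrono::duration<double, std::milli>(b - a).count();
    };
    std::cout << "transfer: " << ms(t0, t1) << " ms, compute: "
              << ms(t1, t2) << " ms (checksum " << acc << ")\n";
}
```

In real distributed training the "transfer" part is network and PCIe traffic rather than a memcpy, which is exactly the infrastructure cost Bolsens argues must be accelerated alongside the math.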

"This is where I think the FPGA can help, because we can make sure that both the computational and the communication sides of an application are optimized. And we can do this at the level of the overall infrastructure and at the level of the chip. That is one of the great advantages of FPGAs: you can build communication networks tailored to the needs of a specific application. Looking at the typical data-movement patterns in AI workloads, I don't see the need for a complex switch-based architecture. You can build a network with large data flows. The same applies to neural-network training: you can build a mesh network with packet sizes adapted to a specific task. With FPGAs you can very precisely scale and fine-tune the communication protocols and circuit topology for a particular application. And in the case of machine learning it is also clear that we don't need double-precision floating point, and we can fine-tune that as well."
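As a rough illustration of that last point - tailoring numerical precision to the task - the following sketch quantizes single-precision weights down to 8-bit integers with a single scale factor, the kind of narrow arithmetic FPGAs handle well. The symmetric scheme and the function names are assumptions for illustration, not a description of Xilinx's implementation.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <iostream>
#include <vector>

// Hypothetical example: map float weights in [-max_abs, max_abs] onto int8_t
// values in [-127, 127] using one shared scale factor.
std::vector<int8_t> quantize(const std::vector<float>& w, float& scale) {
    float max_abs = 0.0f;
    for (float v : w) max_abs = std::max(max_abs, std::fabs(v));
    scale = (max_abs > 0.0f) ? max_abs / 127.0f : 1.0f;   // guard against all-zero input
    std::vector<int8_t> q(w.size());
    for (size_t i = 0; i < w.size(); ++i)
        q[i] = static_cast<int8_t>(std::lround(w[i] / scale));
    return q;
}

int main() {
    std::vector<float> weights = {0.02f, -0.75f, 0.33f, 1.10f, -0.01f};
    float scale = 1.0f;
    auto q = quantize(weights, scale);
    for (size_t i = 0; i < q.size(); ++i)                  // print original vs. dequantized value
        std::cout << weights[i] << " -> " << int(q[i]) * scale << "\n";
}
```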

The difference between an FPGA and a CPU or a custom ASIC is that the latter are fixed at the factory: once manufactured, you cannot change your mind about the data types, the compute elements, or the nature of the data flow through the device. FPGAs let you change your mind when operating conditions change.

In the past, this advantage came at a cost, since FPGA programming was not for the faint of heart. FPGA compilers need to be opened up so that they integrate better with the tools programmers use to build parallel applications for CPUs in C, C++, or Python, and so that some of the work can be delegated to libraries that accelerate routines on FPGAs. That is what the Vitis stack does: it sits underneath ML frameworks such as Caffe and TensorFlow, with libraries for running standard AI models or adding FPGA acceleration to tasks such as video transcoding, video object recognition, data analytics, financial risk management, and third-party libraries.
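For a sense of what such a C/C++-level flow looks like on the kernel side, here is a small vector-add kernel in the style of Vitis HLS. The pragmas shown are standard HLS directives for mapping pointer arguments to memory interfaces and pipelining the loop; the kernel itself is an illustrative sketch, not code taken from the Vitis libraries.

```cpp
// Minimal HLS-style FPGA kernel sketch: element-wise addition of two arrays.
// As plain C++ it compiles and runs for testing; under an HLS tool such as
// Vitis HLS the pragmas guide how the hardware is generated.
extern "C" void vadd(const float* a, const float* b, float* out, int n) {
#pragma HLS INTERFACE m_axi port = a   bundle = gmem0   // stream a from device memory
#pragma HLS INTERFACE m_axi port = b   bundle = gmem1   // stream b from a second bank
#pragma HLS INTERFACE m_axi port = out bundle = gmem0   // write results back
    for (int i = 0; i < n; ++i) {
#pragma HLS PIPELINE II = 1                              // aim for one result per clock cycle
        out[i] = a[i] + b[i];
    }
}
```

The host side then loads the compiled kernel and moves buffers to and from the card through a runtime such as XRT, which is where the library-level integration with C++ or Python applications happens.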

This concept is not much different from Nvidia's CUDA project, which started ten years ago and shifted parallel computing onto GPU accelerators, from AMD's ROCm toolkit, or from Intel's OneAPI project, which promises to run across CPUs, GPUs, and FPGAs.

The only question is how all these tools will be tied together so that anyone can program a collection of compute resources as they see fit. This matters because FPGAs have become more complex - far more complex than any available CPU. They are manufactured on the most advanced process nodes and with the most modern chip-packaging technologies. And they will find their niche, because we can no longer afford to waste time, money, energy, or intelligence - all of these are too expensive.

"The FPGA offers technological advantages," says Bolsens. "And it's not just the usual advertising about adaptability and reconfigurability. In all the important applications - machine learning, graph analytics, high-speed trading, and so on - they can adapt to a specific task not only the data path but also the memory architecture, that is, how data moves within the chip. And there is far more memory built into an FPGA than into other devices. It should also be noted that if a task does not fit into one FPGA, you can scale it across multiple chips without running into the drawbacks that await you when you scale a task across multiple CPUs or GPUs."

Source: habr.com
