RoadRunner: PHP is not built to die, or Golang to the rescue

Hey Habr! At Badoo we actively work on PHP performance: we have a fairly large system written in this language, so performance is a matter of saving money. More than ten years ago we created PHP-FPM for this purpose. It started as a set of patches for PHP and later became part of the official distribution.

In recent years, PHP has made great progress: the garbage collector has improved and stability has increased, so today you can write daemons and long-lived scripts in PHP without any problems. This allowed Spiral Scout to go further: RoadRunner, unlike PHP-FPM, does not clean up memory between requests, which gives an additional performance gain (although this approach complicates development). We are currently experimenting with this tool, but we don't have any results to share yet. To make the wait more fun, we are publishing a translation of the RoadRunner announcement from Spiral Scout.

The approach described in the article is close to ours: when solving our problems, we also most often use a combination of PHP and Go, getting the benefits of both languages and not abandoning one in favor of the other.

Enjoy!

Over the last ten years, we have built applications for Fortune 500 companies and for businesses with an audience of no more than 500 users. All this time, our engineers have been developing the backend mainly in PHP. But two years ago, something had a big impact not only on the performance of our products, but also on their scalability - we introduced Golang (Go) into our technology stack.

Almost immediately, we discovered that Go allowed us to build larger applications with up to 40x performance improvements. With it, we were able to extend our existing PHP products, improving them by combining the benefits of both languages.

We will tell you how the combination of Go and PHP helps to solve real development problems, and how it turned into a tool that removes some of the problems associated with PHP's dying model.

Your daily PHP development environment

Before we talk about how you can use Go to revive PHP's dying model, let's take a look at your default PHP development environment.

In most cases, you run your application using a combination of the nginx web server and the PHP-FPM server. The former serves static files and redirects specific requests to PHP-FPM, while PHP-FPM itself executes PHP code. You may be using the less popular combination of Apache and mod_php. But although it works a little differently, the principles are the same.

Let's take a look at how PHP-FPM executes application code. When a request comes in, PHP-FPM initializes a PHP child process and passes the details of the request as part of its state ($_GET, $_POST, $_SERVER, etc.).

The state cannot change during the execution of a PHP script, so the only way to get a new set of input data is by clearing the process memory and initializing it again.

This execution model has many advantages. You don't have to worry too much about memory consumption, all processes are completely isolated, and if one of them "dies", it will be automatically recreated and it will not affect the rest of the processes. But this approach also has disadvantages that appear when trying to scale the application.

Disadvantages and Inefficiencies of a Regular PHP Environment

If you are a professional PHP developer, then you know where to start a new project - with the choice of a framework. It consists of dependency injection libraries, ORMs, translations and templates. And, of course, all user input can conveniently be put into one object (Symfony/HttpFoundation or PSR-7). Frameworks are cool!

But everything has its price. In any enterprise-level framework, handling a simple user request or a database query means loading at least dozens of files, instantiating numerous classes and parsing several configuration files. But the worst thing is that after completing each task you have to throw everything away and start over: all the code you just initialized becomes useless, because it cannot be reused to process another request. Tell this to any programmer who writes in another language, and you will see bewilderment on their face.

PHP engineers have been looking for ways to solve this problem for years, using clever lazy loading techniques, microframeworks, optimized libraries, caching, etc. But in the end, you still have to reset the entire application and start over, again and again. (Translator's note: this problem will be partially solved with the advent of preload in PHP 7.4)

Can PHP with Go survive more than one request?

It is possible to write PHP scripts that live longer than a few minutes (up to hours or days): for example, cron tasks, CSV parsers, queue workers. They all follow the same scenario: they retrieve a task, execute it, and wait for the next one. The code stays in memory all the time, saving the precious milliseconds that would otherwise be spent on the many steps required to load the framework and the application.
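The "retrieve a task, execute it, wait for the next one" scenario can be sketched as follows. This is only an illustration written in Go (the language used for the other sketches in this article), not PHP code from the original: the names bootstrap, runWorker and process are hypothetical, and the "expensive" initialization is simulated by a counter.

```go
package main

import "fmt"

// bootstrap stands in for the expensive start-up work (loading a framework,
// parsing configuration). The counter records how many times it has run.
func bootstrap(counter *int) map[string]string {
	*counter++
	return map[string]string{"greeting": "Hello"}
}

// runWorker bootstraps once and then serves tasks until the channel is closed.
// The loaded state stays in memory between tasks.
func runWorker(tasks <-chan string, results chan<- string, boots *int) {
	config := bootstrap(boots)
	for task := range tasks {
		results <- config["greeting"] + ", " + task
	}
	close(results)
}

// process feeds the inputs to a single long-lived worker and reports
// how many times the worker had to bootstrap.
func process(inputs []string) ([]string, int) {
	boots := 0
	tasks := make(chan string)
	results := make(chan string)
	go runWorker(tasks, results, &boots)

	var out []string
	for _, in := range inputs {
		tasks <- in
		out = append(out, <-results)
	}
	close(tasks)
	for range results {
		// Wait until the worker closes its results channel and exits.
	}
	return out, boots
}

func main() {
	out, boots := process([]string{"first task", "second task", "third task"})
	fmt.Println(out)
	fmt.Println("bootstraps:", boots)
}
```

However many tasks are processed, the bootstrap step runs a single time, which is exactly the saving described above.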

But developing long-lived scripts is not easy. Any error completely kills the process, diagnosing memory leaks is infuriating, and debugging by pressing F5 is no longer possible.

The situation has improved with the release of PHP 7: a reliable garbage collector has appeared, errors have become easier to handle, and core extensions are now leak-proof. True, engineers still need to be careful with memory and be aware of state issues in code (is there a language that can ignore these things?). Still, PHP 7 has fewer surprises in store for us.

Is it possible to take the model of working with long-lived PHP scripts, adapt it to more trivial tasks like processing HTTP requests, and thereby get rid of the need to load everything from scratch with each request?

To solve this problem, we first needed to implement a server application that could accept HTTP requests and forward them one by one to a PHP worker without killing it every time.

We knew that we could write a web server in pure PHP (PHP-PM) or using a C extension (Swoole). And although each method has its own merits, neither option suited us - we wanted something more. We needed more than just a web server - we expected a solution that could save us from the problems associated with a "hard start" in PHP, and that at the same time could be easily adapted and extended for specific applications. That is, we needed an application server.

Can Go help with this? We knew it could, because the language compiles applications into single binaries; it is cross-platform; it has its own, very elegant concurrency model and a library for working with HTTP; and finally, thousands of open-source libraries and integrations would be available to us.

The Difficulties of Combining Two Programming Languages

First of all, we had to decide how two or more applications would communicate with each other.

For example, using the excellent library by Alex Palaestras, it was possible to share memory between PHP and Go processes (similar to mod_php in Apache). But this library has features that limit its use for solving our problem.

We decided to use a different, more common approach: to build interaction between processes through sockets/pipes. This approach has proven to be reliable over the past decades and has been well optimized at the operating system level.

To begin with, we created a simple binary protocol for exchanging data between processes and handling transmission errors. In its simplest form, this type of protocol is similar to a netstring with a fixed-size packet header (17 bytes in our case), which contains the type of the packet, its size and a binary mask to check the integrity of the data.

On the PHP side we used the pack function, and on the Go side the encoding/binary library.

It seemed to us that one protocol was not enough, so we added the ability to call Go net/rpc services directly from PHP. Later, this helped us a lot in development, since we could easily integrate Go libraries into PHP applications. The result of this work can be seen, for example, in our other open-source product, Goridge.
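A minimal sketch of such a frame on the Go side, using encoding/binary. The article only says that the 17-byte header carries the packet type, the size and an integrity check; the exact layout below (1 flag byte, then the payload length written twice, little-endian and big-endian) and the flag values are assumptions made for illustration, not a specification of the real protocol.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// headerSize matches the 17-byte header mentioned in the article.
const headerSize = 17

// Hypothetical packet-type flags; the real values are not given in the text.
const (
	flagPayload byte = 0x01
	flagError   byte = 0x02
)

// encodeFrame prepends a 17-byte header to the payload (assumed layout):
//
//	byte 0      packet type flags
//	bytes 1-8   payload length, little-endian
//	bytes 9-16  payload length, big-endian (used as an integrity check)
func encodeFrame(flags byte, payload []byte) []byte {
	frame := make([]byte, headerSize+len(payload))
	frame[0] = flags
	binary.LittleEndian.PutUint64(frame[1:9], uint64(len(payload)))
	binary.BigEndian.PutUint64(frame[9:17], uint64(len(payload)))
	copy(frame[headerSize:], payload)
	return frame
}

// decodeFrame validates the header and returns the flags and the payload.
func decodeFrame(frame []byte) (byte, []byte, error) {
	if len(frame) < headerSize {
		return 0, nil, errors.New("frame is shorter than the header")
	}
	size := binary.LittleEndian.Uint64(frame[1:9])
	if size != binary.BigEndian.Uint64(frame[9:17]) {
		return 0, nil, errors.New("corrupted header: length fields disagree")
	}
	if uint64(len(frame)-headerSize) != size {
		return 0, nil, errors.New("payload length does not match the header")
	}
	return frame[0], frame[headerSize:], nil
}

func main() {
	frame := encodeFrame(flagPayload, []byte("hello"))
	flags, payload, err := decodeFrame(frame)
	fmt.Println(len(frame), flags, string(payload), err)
}
```

A fixed-size header lets the reader know exactly how many bytes to read before it learns the payload length, and the duplicated length field gives a cheap way to detect a desynchronized stream.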
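To show what a net/rpc service looks like, here is a small sketch using only the Go standard library. The Greeter service and the callHello helper are hypothetical, and the client here is a Go client connected through an in-memory pipe standing in for the PHP side; how PHP actually encodes its calls is not covered by this example.

```go
package main

import (
	"fmt"
	"net"
	"net/rpc"
)

// Greeter is a hypothetical service exposed over net/rpc.
type Greeter struct{}

// Hello follows the net/rpc method convention: an exported method with an
// argument, a pointer reply and an error return value.
func (g *Greeter) Hello(name string, reply *string) error {
	*reply = "Hello, " + name
	return nil
}

// callHello registers the service, connects a client through an in-memory
// pipe (standing in for the socket or pipe to the PHP process) and makes a call.
func callHello(name string) (string, error) {
	server := rpc.NewServer()
	if err := server.RegisterName("Greeter", new(Greeter)); err != nil {
		return "", err
	}

	serverConn, clientConn := net.Pipe()
	go server.ServeConn(serverConn)

	client := rpc.NewClient(clientConn)
	defer client.Close()

	var reply string
	err := client.Call("Greeter.Hello", name, &reply)
	return reply, err
}

func main() {
	reply, err := callHello("PHP")
	fmt.Println(reply, err)
}
```

Any Go library can be wrapped in a service like this, which is what makes it available to the other side without writing a PHP extension.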

Distributing tasks across multiple PHP workers

After implementing the interaction mechanism, we began to think about the most efficient way to transfer tasks to PHP processes. When a task arrives, the application server must choose a free worker to execute it. If a worker/process exits with an error or "dies", we get rid of it and create a new one to replace it. And if the worker/process has completed successfully, we return it to the pool of workers available to perform tasks.

To store the pool of active workers, we used a buffered channel. To remove unexpectedly "dead" workers from the pool, we added a mechanism for tracking the errors and states of workers.
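The idea of a pool kept in a buffered channel can be sketched like this. The worker type below is a hypothetical stand-in for a long-lived PHP process, and a payload of "crash" simulates a worker dying; the real implementation tracks process state, which this sketch does not attempt.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// worker stands in for a long-lived PHP process (hypothetical type).
type worker struct {
	id    int64
	alive bool
}

// exec simulates handling a task; the payload "crash" kills the worker.
func (w *worker) exec(payload string) (string, error) {
	if payload == "crash" {
		w.alive = false
		return "", errors.New("worker died")
	}
	return fmt.Sprintf("worker %d handled %q", w.id, payload), nil
}

// pool keeps the free workers in a buffered channel.
type pool struct {
	free   chan *worker
	nextID int64
}

func newPool(size int) *pool {
	p := &pool{free: make(chan *worker, size)}
	for i := 0; i < size; i++ {
		p.free <- p.spawn()
	}
	return p
}

// spawn creates a fresh worker with a unique id.
func (p *pool) spawn() *worker {
	return &worker{id: atomic.AddInt64(&p.nextID, 1), alive: true}
}

// exec takes a free worker (blocking until one is available), runs the task,
// replaces the worker if it failed, and returns a worker to the pool.
func (p *pool) exec(payload string) (string, error) {
	w := <-p.free
	out, err := w.exec(payload)
	if err != nil || !w.alive {
		w = p.spawn() // get rid of the dead worker and create a new one
	}
	p.free <- w
	return out, err
}

func main() {
	p := newPool(2)
	for _, payload := range []string{"a", "crash", "b"} {
		out, err := p.exec(payload)
		fmt.Println(out, err)
	}
}
```

Receiving from the channel doubles as "wait for a free worker", so no extra locking is needed to hand workers out to concurrent requests.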

As a result, we got a working PHP server capable of processing any requests presented in binary form.

For our application to work as a web server, we had to choose a reliable PHP standard to represent incoming HTTP requests. In our case, we simply convert the net/http request from Go into the PSR-7 format, so that it is compatible with most of the PHP frameworks available today.
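A rough sketch of that conversion step: the fields of a net/http request are copied into a plain structure that can be serialized and sent to a PHP worker. The psrRequest type and its JSON field names are assumptions chosen to loosely mirror what a PSR-7 request exposes; they are not the actual wire format.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// psrRequest is a hypothetical payload loosely mirroring the data a PSR-7
// request exposes on the PHP side.
type psrRequest struct {
	Method  string              `json:"method"`
	URI     string              `json:"uri"`
	Headers map[string][]string `json:"headers"`
	Query   map[string][]string `json:"query"`
	Body    string              `json:"body"`
}

// toPSR copies the relevant parts of a net/http request into the payload.
func toPSR(r *http.Request) (psrRequest, error) {
	body, err := io.ReadAll(r.Body)
	if err != nil {
		return psrRequest{}, err
	}
	return psrRequest{
		Method:  r.Method,
		URI:     r.URL.String(),
		Headers: r.Header,
		Query:   r.URL.Query(),
		Body:    string(body),
	}, nil
}

// sampleRequest builds an example request without starting a server.
func sampleRequest() *http.Request {
	req := httptest.NewRequest(http.MethodPost, "/users?page=2", strings.NewReader(`{"name":"Ann"}`))
	req.Header.Set("Content-Type", "application/json")
	return req
}

func main() {
	payload, err := toPSR(sampleRequest())
	if err != nil {
		fmt.Println("conversion failed:", err)
		return
	}
	encoded, err := json.Marshal(payload)
	if err != nil {
		fmt.Println("encoding failed:", err)
		return
	}
	fmt.Println(string(encoded))
}
```

Because the request travels as data rather than as process-wide state, the PHP worker can build a fresh request object for every call while staying alive between calls.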

Because PSR-7 is considered immutable (some would say technically it isn't), developers have to write applications that don't treat the request as a global entity in principle. This fits nicely with the concept of long-lived PHP processes. Our final implementation, which has yet to be named, looked like this:

[Figure: diagram of the final implementation]

Introducing RoadRunner - high performance PHP application server

Our first test case was an API backend that periodically received unpredictable bursts of requests (far more than usual). Although nginx was sufficient in most cases, we regularly encountered 502 errors because we could not rebalance the system quickly enough for the increase in load.

To replace this solution, we deployed our first PHP/Go application server in early 2018. And we immediately got an incredible effect! Not only did we get rid of the 502 errors completely, but we were also able to reduce the number of servers by two-thirds, saving a lot of money and headache pills for engineers and product managers.

By the middle of the year, we had improved our solution, published it on GitHub under the MIT license and named it RoadRunner, thus emphasizing its incredible speed and efficiency.

How RoadRunner can improve your development stack

Using RoadRunner allowed us to use net/http middleware on the Go side to perform JWT verification before the request reaches PHP, as well as to handle WebSockets and aggregate statistics globally in Prometheus.

Thanks to the built-in RPC, you can expose the API of any Go library to PHP without writing extension wrappers. More importantly, with RoadRunner you can deploy new non-HTTP servers. Examples include running PHP handlers in AWS Lambda, creating reliable queue workers, and even adding gRPC to our applications.
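The middleware idea can be sketched with the standard net/http types alone. The example below only checks that a Bearer token is present; real JWT verification (signature and claims) would go where the comment indicates. The names requireBearer, phpHandler and statusFor are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// requireBearer is a hypothetical middleware that rejects requests without a
// Bearer token. A real setup would verify the JWT signature and claims here.
func requireBearer(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		header := r.Header.Get("Authorization")
		token := strings.TrimPrefix(header, "Bearer ")
		if !strings.HasPrefix(header, "Bearer ") || token == "" {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// phpHandler stands in for the handler that forwards the request to a PHP worker.
var phpHandler = http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
	fmt.Fprint(w, "handled by PHP worker")
})

// statusFor sends a request with the given Authorization header through the
// middleware chain and returns the resulting HTTP status code.
func statusFor(authorization string) int {
	req := httptest.NewRequest(http.MethodGet, "/api", nil)
	if authorization != "" {
		req.Header.Set("Authorization", authorization)
	}
	rec := httptest.NewRecorder()
	requireBearer(phpHandler).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor(""), statusFor("Bearer some-token"))
}
```

Rejected requests never reach a PHP worker, so the workers spend their time only on requests that have already passed the check.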

With the help of the PHP and Go communities, we improved the stability of the solution, increased application performance up to 40 times in some tests, improved debugging tools, implemented integration with the Symfony framework, and added support for HTTPS, HTTP/2, plugins, and PSR-17.

Conclusion

Some people are still caught in the outmoded notion of PHP as a slow, unwieldy language only good for writing plugins for WordPress. These people might even say that PHP has a built-in limitation: when the application gets big enough, you have to choose a more "mature" language and rewrite the code base accumulated over many years.

To all this I want to answer: think again. We believe that the only limits on PHP are the ones you set yourself. You can spend your whole life moving from one language to another, trying to find the perfect match for your needs, or you can start to think of languages as tools. The supposed flaws of a language like PHP may actually be the reason for its success. And if you combine it with another language like Go, you will create much more powerful products than if you were limited to a single language.

Having worked with a combination of Go and PHP, we can say that we love them both. We do not plan to sacrifice one for the other - on the contrary, we will look for ways to get even more value from this dual stack.

UPD: we welcome the creator of RoadRunner and the co-author of the original article - Lachesis

Source: habr.com
