
The development of high-load projects in any language requires a special approach and special tools, but with PHP applications the situation can escalate so much that you have to develop something of your own. In this note, we will talk about the pain, familiar to everyone, of distributed session storage and data caching in memcached, and about how we solved these problems in one of the projects in our care.
The hero of the occasion is a PHP application based on the symfony 2.3 framework, which the business has no plans to upgrade at all. In addition to quite standard session storage, this project made full use of a "cache everything" policy in memcached: responses to database and API server requests, various flags, locks for synchronizing code execution, and much more. In such a situation, a memcached failure becomes fatal for the application. In addition, losing the cache has serious consequences: the DBMS starts bursting at the seams, API services start banning requests, and so on. It may take tens of minutes for the situation to stabilize, during which the service will be terribly slow or completely unavailable.
We needed to make it possible to scale the application horizontally with little bloodshed, i.e. with minimal changes to the source code and with full functionality preserved. We also needed to make the cache not only fault-tolerant, but also to minimize data loss from it.
What is wrong with memcached itself?
In general, the memcached extension for PHP supports distributed storage of data and sessions out of the box. Consistent hashing of keys lets you spread data evenly across many servers, unambiguously mapping each key to a specific server in the group, and the built-in failover tools ensure high availability of the caching service (but, unfortunately, not of the data).
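To make the idea of consistent hashing more concrete, here is a minimal Python sketch of a hash ring. It is an illustration only and not the algorithm the PHP extension actually uses: the `HashRing` class, the MD5-based `_hash` helper and the number of virtual nodes are all assumptions made for this example. It shows the property that matters here: when a server leaves the group, only the keys it owned are remapped, and the data for those keys is simply gone.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a point on the ring (first 8 hex digits of MD5)."""
    return int(hashlib.md5(value.encode()).hexdigest()[:8], 16)


class HashRing:
    """Toy consistent-hash ring with virtual nodes (hypothetical helper)."""

    def __init__(self, servers, vnodes=100):
        # Each server is placed on the ring at `vnodes` pseudo-random points.
        self._ring = sorted(
            (_hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def server_for(self, key: str) -> str:
        # A key belongs to the first server point clockwise from its hash.
        idx = bisect.bisect(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]


servers = ["mc-0.mc:11211", "mc-1.mc:11211", "mc-2.mc:11211"]
keys = [f"key{i}" for i in range(1000)]

ring = HashRing(servers)
before = {key: ring.server_for(key) for key in keys}

# Remove one server: only the keys it owned change their destination.
smaller = HashRing(servers[:2])
moved = [key for key in keys if before[key] != smaller.server_for(key)]
print(f"{len(moved)} of {len(keys)} keys moved")
```

The same property works in reverse: when the server returns empty, the keys it used to own are mapped back to it, and reads for them turn into misses.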
With session storage, things are a little better: you can configure memcached.sess_number_of_replicas, so that data is saved to several servers at once, and if one memcached instance fails, the data is served from the others. However, if the server comes back online without data (as usually happens after a restart), some of the keys will be redistributed in its favor. In effect, this means session data loss, since there is no way to "go" to another replica in case of a miss.
The standard tools of the library are aimed mainly at horizontal scaling: they let you grow the cache to gigantic sizes and provide access to it from code hosted on different servers. However, in our situation the amount of stored data does not exceed several gigabytes, and the performance of one or two nodes is quite enough. Accordingly, the only useful thing the standard tools could offer was keeping memcached available as long as at least one cache instance remained operational. However, we could not take advantage of even this... Here we should recall the age of the framework used in the project, because of which we could not get the application to work with a pool of servers. Let's also not forget about the loss of session data: the customer's eye twitched at the mass logout of users.
Ideally, we needed replication of writes in memcached and iteration over the replicas in case of a miss or an error. mcrouter helped us implement this strategy.
mcrouter
This is a memcached router developed by Facebook to solve its own problems. It supports the memcached text protocol and makes it possible to scale memcached installations to insane proportions. Among other things, it can do what we need:
- replicate writes;
- fall back to other servers in the group in case of an error.
Let's get to work!
mcrouter configuration
Let's go straight to the config:
{
  "pools": {
    "pool00": {
      "servers": [
        "mc-0.mc:11211",
        "mc-1.mc:11211",
        "mc-2.mc:11211"
      ]
    },
    "pool01": {
      "servers": [
        "mc-1.mc:11211",
        "mc-2.mc:11211",
        "mc-0.mc:11211"
      ]
    },
    "pool02": {
      "servers": [
        "mc-2.mc:11211",
        "mc-0.mc:11211",
        "mc-1.mc:11211"
      ]
    }
  },
  "route": {
    "type": "OperationSelectorRoute",
    "default_policy": "AllMajorityRoute|Pool|pool00",
    "operation_policies": {
      "get": {
        "type": "RandomRoute",
        "children": [
          "MissFailoverRoute|Pool|pool02",
          "MissFailoverRoute|Pool|pool00",
          "MissFailoverRoute|Pool|pool01"
        ]
      }
    }
  }
}
Why three pools? Why are the servers repeated? Let's see how it works.
- In this configuration, mcrouter chooses the path a request will take based on the request command. The OperationSelectorRoute type tells it to do so.
- GET requests go to the RandomRoute handler, which randomly selects a pool or route among the objects of the children array. Each element of this array is in turn a MissFailoverRoute handler, which goes through each server in the pool until it receives a response with data, which is returned to the client.
- If we used only MissFailoverRoute with a pool of three servers, all requests would first hit the first memcached instance, and the others would receive requests on a residual basis, when there is no data. Such an approach would put excessive load on the first server in the list, so we decided to generate three pools with the addresses in a different order and choose among them randomly.
- All other requests (and these are writes) are processed using AllMajorityRoute. This handler sends requests to all servers in the pool and waits for responses from at least N/2 + 1 of them. We had to give up AllSyncRoute for write operations, since that method requires a positive response from all servers in the group, otherwise it returns SERVER_ERROR. Although mcrouter will add the data to the available caches, the calling PHP function will return an error and generate a notice. AllMajorityRoute is not so strict and allows up to half of the nodes to be taken out of service without the problems described above.
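The routing logic described above can be sketched as a small in-memory simulation in Python. This is not mcrouter code: `FakeServer` and the three functions are hypothetical stand-ins written for this example, requests are sent sequentially rather than in parallel, and an unreachable server is simply treated like a miss on reads.

```python
import random


class FakeServer:
    """In-memory stand-in for one memcached instance (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.alive = True

    def get(self, key):
        if not self.alive:
            raise ConnectionError(self.name)
        return self.data.get(key)

    def set(self, key, value):
        if not self.alive:
            raise ConnectionError(self.name)
        self.data[key] = value
        return True


def miss_failover_get(pool, key):
    """Try the servers in pool order; return the first hit, else None."""
    for server in pool:
        try:
            value = server.get(key)
        except ConnectionError:
            continue  # simplification: an error is handled like a miss
        if value is not None:
            return value
    return None


def random_route_get(pools, key):
    """Pick one of the rotated pools at random, then do miss failover."""
    return miss_failover_get(random.choice(pools), key)


def all_majority_set(pool, key, value):
    """Write to every server; succeed if at least N/2 + 1 acknowledged."""
    acks = 0
    for server in pool:
        try:
            if server.set(key, value):
                acks += 1
        except ConnectionError:
            pass
    return acks >= len(pool) // 2 + 1


servers = [FakeServer(f"mc-{i}") for i in range(3)]
# Same servers in rotated order: [0 1 2], [1 2 0], [2 0 1].
pools = [servers[i:] + servers[:i] for i in range(3)]

all_majority_set(pools[0], "test", "value")

servers[0].alive = False                              # one instance is down
ok_write = all_majority_set(pools[0], "other", "x")   # 2 of 3 acks

servers[0].alive = True
servers[0].data.clear()                               # it came back empty
value = random_route_get(pools, "test")               # served by a replica
print(ok_write, value)
```

In this toy model a restarted, empty instance no longer causes data loss on reads, because the miss is retried on the remaining replicas, which is exactly the behavior the standard session replication lacked.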
The main drawback of this scheme is that if the data really is not in the cache, then each client request actually results in N requests to memcached, one to every server in the pool. You can reduce the number of servers in the pools, for example to two: sacrificing storage reliability, we get greater speed and less load from requests for missing keys.
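A rough back-of-the-envelope estimate of this overhead, as a Python sketch. The function name and the workload numbers (1000 GETs, 20% misses) are made up for illustration, and the model assumes that every hit is answered by the first server tried while a true miss walks the whole pool.

```python
def backend_requests(gets, miss_ratio, pool_size):
    """Estimate memcached requests caused by `gets` client GETs.

    Assumption for illustration: a hit costs one request,
    a true miss costs one request per server in the pool.
    """
    misses = round(gets * miss_ratio)
    hits = gets - misses
    return hits + misses * pool_size


for size in (3, 2):
    print(f"pool of {size}: {backend_requests(1000, 0.2, size)} requests")
```

Under these assumptions, shrinking the pools from three servers to two removes one backend request per missing key, at the cost of one fewer replica.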
NB: The documentation and the issues of the mcrouter project (including closed ones) can also be useful for learning it: they are a whole storehouse of various configurations.
Building and running mcrouter
The application (and memcached itself) runs in Kubernetes, so mcrouter lives there as well. The config for building the container image looks like this:
NB: The listings given in the article are published in a repository.
configVersion: 1
project: mcrouter
deploy:
  namespace: '[[ env ]]'
  helmRelease: '[[ project ]]-[[ env ]]'
---
image: mcrouter
from: ubuntu:16.04
mount:
- from: tmp_dir
  to: /var/lib/apt/lists
- from: build_dir
  to: /var/cache/apt
ansible:
  beforeInstall:
  - name: Install prerequisites
    apt:
      name: [ 'apt-transport-https', 'tzdata', 'locales' ]
      update_cache: yes
  - name: Add mcrouter APT key
    apt_key:
      url: https://facebook.github.io/mcrouter/debrepo/xenial/PUBLIC.KEY
  - name: Add mcrouter Repo
    apt_repository:
      repo: deb https://facebook.github.io/mcrouter/debrepo/xenial xenial contrib
      filename: mcrouter
      update_cache: yes
  - name: Set timezone
    timezone:
      name: "Europe/Moscow"
  - name: Ensure a locale exists
    locale_gen:
      name: en_US.UTF-8
      state: present
  install:
  - name: Install mcrouter
    apt:
      name: [ 'mcrouter' ]
... and add a Helm chart. The only interesting part of it is the config generator driven by the number of replicas (if someone has a more concise and elegant version, share it in the comments):
{{- $count := (pluck .Values.global.env .Values.memcached.replicas | first | default .Values.memcached.replicas._default | int) -}}
{{- $pools := dict -}}
{{- $servers := list -}}
{{- /* Fill the array with two copies of the servers: "0 1 2 0 1 2" */ -}}
{{- range until 2 -}}
{{- range $i, $_ := until $count -}}
{{- $servers = append $servers (printf "mc-%d.mc:11211" $i) -}}
{{- end -}}
{{- end -}}
{{- /* Shifting along the array, we get N slices: "[0 1 2] [1 2 0] [2 0 1]" */ -}}
{{- range $i, $_ := until $count -}}
{{- $pool := dict "servers" (slice $servers $i (add $i $count)) -}}
{{- $_ := set $pools (printf "MissFailoverRoute|Pool|pool%02d" $i) $pool -}}
{{- end -}}
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: mcrouter
data:
  config.json: |
    {
      "pools": {{- $pools | toJson | replace "MissFailoverRoute|Pool|" "" -}},
      "route": {
        "type": "OperationSelectorRoute",
        "default_policy": "AllMajorityRoute|Pool|pool00",
        "operation_policies": {
          "get": {
            "type": "RandomRoute",
            "children": {{- keys $pools | toJson }}
          }
        }
      }
    }
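For readers who find the template hard to follow, the same generation logic can be sketched in Python. This is an illustrative equivalent and not part of the chart: the `build_config` function and its `host_pattern` parameter are hypothetical names, while the pool names, route types and server addresses are taken from the config shown earlier.

```python
import json


def build_config(count, host_pattern="mc-{}.mc:11211"):
    """Build an mcrouter-style config with `count` rotated pools."""
    # Two copies of the server list: "0 1 2 0 1 2".
    servers = [host_pattern.format(i) for i in range(count)] * 2
    # Sliding a window of size `count` gives "[0 1 2] [1 2 0] [2 0 1]".
    pools = {
        f"pool{i:02d}": {"servers": servers[i:i + count]}
        for i in range(count)
    }
    return {
        "pools": pools,
        "route": {
            "type": "OperationSelectorRoute",
            "default_policy": "AllMajorityRoute|Pool|pool00",
            "operation_policies": {
                "get": {
                    "type": "RandomRoute",
                    "children": [
                        f"MissFailoverRoute|Pool|{name}" for name in pools
                    ],
                }
            },
        },
    }


config = build_config(3)
print(json.dumps(config, indent=2))
```

Doubling the list and slicing it is the same trick the template uses: it avoids modular arithmetic, which is awkward to express in Helm templates.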
We roll out to the test environment and check:
# php -a
Interactive mode enabled
php > # Check writing and reading
php > $m = new Memcached();
php > $m->addServer('mcrouter', 11211);
php > var_dump($m->set('test', 'value'));
bool(true)
php > var_dump($m->get('test'));
string(5) "value"
php > # It works! Now test sessions:
php > ini_set('session.save_handler', 'memcached');
php > ini_set('session.save_path', 'mcrouter:11211');
php > var_dump(session_start());
PHP Warning: Uncaught Error: Failed to create session ID: memcached (path: mcrouter:11211) in php shell code:1
Stack trace:
#0 php shell code(1): session_start()
#1 {main}
thrown in php shell code on line 1
php > # It won't start... Let's try setting session_id:
php > session_id("zzz");
php > var_dump(session_start());
PHP Warning: session_start(): Cannot send session cookie - headers already sent by (output started at php shell code:1) in php shell code on line 1
PHP Warning: session_start(): Failed to write session lock: UNKNOWN READ FAILURE in php shell code on line 1
PHP Warning: session_start(): Failed to write session lock: UNKNOWN READ FAILURE in php shell code on line 1
PHP Warning: session_start(): Failed to write session lock: UNKNOWN READ FAILURE in php shell code on line 1
PHP Warning: session_start(): Failed to write session lock: UNKNOWN READ FAILURE in php shell code on line 1
PHP Warning: session_start(): Failed to write session lock: UNKNOWN READ FAILURE in php shell code on line 1
PHP Warning: session_start(): Failed to write session lock: UNKNOWN READ FAILURE in php shell code on line 1
PHP Warning: session_start(): Unable to clear session lock record in php shell code on line 1
PHP Warning: session_start(): Failed to read session data: memcached (path: mcrouter:11211) in php shell code on line 1
bool(false)
php >
Searching for the error text gave no results, but a different search query brought up, at the very top, the project's oldest unresolved issue: the lack of support for the binary memcached protocol.
NB: The ASCII protocol in memcached is slower than the binary one, and the standard consistent key hashing works only with the binary protocol. But this does not cause problems in our particular case.
It's in the bag: all that remains is to switch to the ASCII protocol and everything will work... However, in this case the habit of looking for answers in the documentation played a cruel joke. You will not find the correct answer there... unless, of course, you scroll to the end, where the "User Contributed Notes" section contains the correct one.
Yes, the correct option name is memcached.sess_binary_protocol. It must be disabled, after which sessions will start working. All that remains is to put the mcrouter container into the pod with PHP!
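For reference, the ASCII protocol that the PHP client has to speak to mcrouter once the binary protocol is disabled is line-based and easy to read. The Python sketch below only builds and parses the messages for a single-key set and get. The helper names are hypothetical, and nothing is sent over the network.

```python
def build_set(key: str, value: bytes, flags: int = 0, exptime: int = 0) -> bytes:
    """Frame a text-protocol `set`: header line, data block, CRLF."""
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def build_get(key: str) -> bytes:
    """Frame a text-protocol `get` for a single key."""
    return f"get {key}\r\n".encode("ascii")


def parse_get_reply(reply: bytes):
    """Return the value from a single-key get reply, or None on a miss."""
    if reply == b"END\r\n":
        return None  # a miss is just the terminator line
    header, rest = reply.split(b"\r\n", 1)
    parts = header.split()
    if len(parts) < 4 or parts[0] != b"VALUE":
        raise ValueError(f"unexpected reply: {header!r}")
    size = int(parts[3])
    if rest[size:] != b"\r\nEND\r\n":
        raise ValueError("malformed reply")
    return rest[:size]


print(build_set("test", b"value"))
print(build_get("test"))
print(parse_get_reply(b"VALUE test 0 5\r\nvalue\r\nEND\r\n"))
```

A successful set is acknowledged with a `STORED` line, and a failure inside the router surfaces as the `SERVER_ERROR` reply mentioned earlier.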
Conclusion
Thus, with infrastructure changes alone, we managed to solve the task: the memcached fault-tolerance issue is resolved and the reliability of cache storage has increased. In addition to the obvious benefits for the application, this gave us room for maneuver when working on the platform: when all components have redundancy, the administrator's life is greatly simplified. Yes, this method has its drawbacks and may look like a "crutch", but if it saves money, buries the problem and does not cause new ones, why not?
Source: habr.com
