Analysis of tasks from the Hydra conference - load balancing and in-memory storage

The Hydra conference took place a few days ago. The folks from the JUG.ru Group invited dream speakers (Leslie Lamport! Cliff Click! Martin Kleppmann!) and devoted two days to distributed systems and computing. Kontur was one of the three partners of the conference. We chatted at our booth, talked about our distributed storage, played bingo, and solved puzzles.

This post is an analysis of the tasks from the Kontur booth, written by their author. If you were at Hydra, it's a reason to recall a pleasant experience; if you weren't, it's a chance to stretch your brain on some big-O notation.

There were even participants who took the flipchart apart sheet by sheet to write down their solution. I'm not joking - they handed over this stack of paper for checking:

[photo: the stack of flipchart sheets]

There were three tasks in total:

  • selecting replicas by weight for load balancing
  • sorting query results in an in-memory database
  • transferring state in a distributed system with a ring topology

Task 1. ClusterClient

It was necessary to propose an algorithm for the efficient selection of K out of N weighted replicas of a distributed system:

Your team is tasked with developing a client library for a massively distributed cluster of N nodes. The library would keep track of various metadata associated with nodes (e.g., their latencies, 4xx/5xx response rates, etc.) and assign floating point weights W1..WN to them. In order to support the concurrent execution strategy, the library should be able to pick K of N nodes randomly: a chance of being selected should be proportional to a node's weight.

Propose an algorithm to select nodes efficiently. Estimate its computational complexity using big O notation.

Why is everything in English?

Because that is the form in which conference participants tackled them, and because English was the official language of Hydra. The tasks looked like this:

[photo: the task statements]

Take paper and a pencil, think it over, and don't rush to open the spoilers right away 🙂

Analysis of the solution (video)

Starting at 5:53, only 4 minutes:

And here is how the guys with the flipchart pitched their solution:


Analysis of the solution (text)

The following solution lies on the surface: sum the weights of all replicas, generate a random number between 0 and that sum, then choose the i-th replica such that the sum of replica weights from the 0th to the (i-1)-th is less than the random number, and the sum from the 0th to the i-th is greater than it. This selects one replica; to select the next one, the whole procedure is repeated with the already-selected replica excluded. With this algorithm, choosing one replica costs O(N), and choosing K replicas costs O(N K) ~ O(N²).
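
A minimal Python sketch of this naive scheme (the function name and the plain list of floating point weights are assumptions made for the example):

```python
import random

def pick_k_naive(weights, k):
    """Pick k distinct replica indices, each with probability proportional
    to its weight. Every pick rescans the remaining replicas, so the total
    cost is O(N * K) ~ O(N^2). Assumes k <= len(weights), positive weights."""
    remaining = list(range(len(weights)))    # indices of not-yet-picked replicas
    picked = []
    for _ in range(k):
        total = sum(weights[i] for i in remaining)
        r = random.uniform(0.0, total)
        acc = 0.0
        pos = len(remaining) - 1             # fallback for float rounding at the end
        for p, i in enumerate(remaining):
            acc += weights[i]
            if r <= acc:                     # the running prefix sum just passed r
                pos = p
                break
        picked.append(remaining.pop(pos))
    return picked

# Example: pick_k_naive([2.0, 1.0, 0.5, 3.5], k=2) -> two distinct indices
```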

Quadratic complexity is bad, but it can be improved. To do this, we build a segment tree over the weight sums: a tree of depth lg N whose leaves hold the replica weights and whose internal nodes hold partial sums, up to the sum of all weights at the root. Next, we generate a random number from 0 to the sum of all weights, descend the tree to find the i-th replica, remove it from the tree, and repeat the procedure to find the remaining replicas. With this algorithm, building the tree costs O(N), finding the i-th replica and removing it from the tree costs O(lg N), and choosing K replicas costs O(N + K lg N) ~ O(N lg N).
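
A compact Python sketch of this idea (an illustration rather than the library's real implementation; the class and function names are invented for the example, and positive weights with K ≤ N are assumed):

```python
import random

class WeightedSampler:
    """Weighted sampling without replacement over a segment tree of partial
    weight sums. Building costs O(N); each pick with removal costs O(lg N)."""

    def __init__(self, weights):
        self.n = len(weights)
        size = 1
        while size < self.n:                  # round up to a power of two
            size *= 2
        self.size = size
        self.tree = [0.0] * (2 * size)
        self.tree[size:size + self.n] = list(weights)    # leaves hold weights
        for v in range(size - 1, 0, -1):                 # inner nodes hold sums
            self.tree[v] = self.tree[2 * v] + self.tree[2 * v + 1]

    def pick_and_remove(self):
        r = random.uniform(0.0, self.tree[1])   # tree[1] is the total weight
        v = 1
        while v < self.size:                    # descend from the root to a leaf
            left = self.tree[2 * v]
            if r < left:
                v = 2 * v
            else:
                r -= left
                v = 2 * v + 1
        index = v - self.size
        self.tree[v] = 0.0                      # this replica can't be picked again
        v //= 2
        while v >= 1:                           # fix partial sums on the path up
            self.tree[v] = self.tree[2 * v] + self.tree[2 * v + 1]
            v //= 2
        return index


def pick_k(weights, k):
    sampler = WeightedSampler(weights)                       # O(N) build
    return [sampler.pick_and_remove() for _ in range(k)]     # O(K lg N) picks
```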

Linear-log complexity is nicer than quadratic complexity, especially for large K.

It is exactly this algorithm that is implemented in the ClusterClient library from the Vostok project. (There, the tree is built in O(N lg N), but this does not affect the overall complexity of the algorithm.)

Task 2. Zebra

It was necessary to propose an algorithm for efficient sorting of documents in memory by an arbitrary non-indexed field:

Your team is tasked with developing a sharded in-memory document database. A common workload would be to select top N documents sorted by an arbitrary (non-indexed) numeric field from a collection of size M (usually N < 100 << M). A slightly less common workload would be to select top N after skipping top S documents (S ~ N).

Propose an algorithm to execute such queries efficiently. Estimate its computational complexity using big O notation in the average case and the worst case scenarios.

Analysis of the solution (video)

Starting at 34:50, only 6 minutes:


Analysis of the solution (text)

Surface solution: sort all the documents (with quicksort, for example) and take the first N+S of them. Sorting costs O(M lg M) on average and O(M²) in the worst case.
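
For reference, this baseline fits in one line of Python; `docs`, the `"score"` field, and the values of N and S are placeholders for the example:

```python
# Sort all M documents by the non-indexed field, then slice out the page:
# O(M lg M) comparisons on average just to return N documents.
page = sorted(docs, key=lambda d: d["score"], reverse=True)[S:S + N]
```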

Obviously, sorting all M documents only to take a small fraction of them is wasteful. To avoid sorting everything, the quickselect algorithm fits: it picks out the N+S required documents, which can then be sorted by any algorithm. The average complexity drops to O(M), while the worst case stays the same.
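
Here is one possible quickselect-style sketch in Python (an illustration, not Zebra's actual code; the function name and the `"score"` key are invented for the example). It partially partitions the list in place so that the N+S documents with the largest keys end up at the front; only those then need to be sorted:

```python
import random

def move_top_to_front(docs, key, count):
    """Rearrange docs in place so that the `count` documents with the largest
    key values occupy docs[:count] (in arbitrary order). Average O(M),
    worst case O(M^2). Assumes 1 <= count <= len(docs)."""
    lo, hi = 0, len(docs) - 1
    while lo < hi:
        # Randomized pivot: move a random element to the front and use its key.
        r = random.randint(lo, hi)
        docs[lo], docs[r] = docs[r], docs[lo]
        pivot = key(docs[lo])

        i, j = lo - 1, hi + 1                 # Hoare partition, descending order
        while True:
            i += 1
            while key(docs[i]) > pivot:
                i += 1
            j -= 1
            while key(docs[j]) < pivot:
                j -= 1
            if i >= j:
                break
            docs[i], docs[j] = docs[j], docs[i]

        # Now every key in docs[lo..j] >= every key in docs[j+1..hi].
        if count - 1 <= j:
            hi = j                            # the boundary lies in the left part
        else:
            lo = j + 1                        # ... or in the right part

# Usage: move_top_to_front(docs, key=lambda d: d["score"], count=N + S)
#        page = sorted(docs[:N + S], key=lambda d: d["score"], reverse=True)[S:]
```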

However, it can be done even more efficiently by streaming the documents through a binary heap. The first N+S documents are added to a min-heap or max-heap (depending on the sort direction); each subsequent document is compared with the root of the heap, which holds the current minimum or maximum, and is inserted into the heap only if necessary. In the worst case, when the heap has to be rebuilt constantly, the complexity is O(M lg M); on average it is O(M), just as with quickselect.
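
A sketch of heap streaming on top of Python's `heapq` (again an illustration, not Zebra's implementation; the function name and the field are invented, and a descending sort is assumed):

```python
import heapq

def top_n_after_s(docs, key, n, s):
    """Top n documents by key (descending) after skipping the top s,
    selected by streaming everything through a min-heap of size n + s."""
    k = n + s
    heap = []                                # keeps the k largest documents seen so far
    for idx, doc in enumerate(docs):
        item = (key(doc), idx, doc)          # idx breaks ties between equal keys
        if len(heap) < k:
            heapq.heappush(heap, item)       # the first k documents fill the heap
        elif item > heap[0]:
            heapq.heapreplace(heap, item)    # beats the current minimum: O(lg(n+s))
        # otherwise the document is discarded after one comparison with the root
    ordered = sorted(heap, reverse=True)     # only k items left to sort, k << M
    return [doc for _, _, doc in ordered[s:s + n]]

# Usage: page = top_n_after_s(collection, key=lambda d: d["score"], n=N, s=S)
```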

However, heap streaming turns out to be more efficient in practice, because most documents can be discarded after a single comparison with the root, without rebuilding the heap. This kind of selection is implemented in Zebra, the in-memory document database developed and used at Kontur.

Task 3. State swaps

It was necessary to propose the most efficient algorithm for shifting states:

Your team is tasked with developing a fancy state exchange mechanism for a distributed cluster of N nodes. The i-th node's state should be transferred to the (i+1)-th node, the N-th node's state should be transferred to the first node. The only supported operation is the state swap when two nodes exchange their states atomically. It is known that a state swap takes M milliseconds. Every node is able to participate in a single state swap at any given moment.

How long does it take to transfer the states of all nodes in a cluster?

Analysis of the solution (text)

Surface solution: swap the states of the first and second nodes, then the first and third, then the first and fourth, and so on. After each swap, one more state ends up in its target position. This requires O(N) swaps and O(N M) time.
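
A simulation of this scheme on a plain Python list standing in for the cluster (each tuple assignment stands for one atomic state swap; since the first node takes part in every swap, the swaps can only run one after another):

```python
def shift_sequential(states):
    """Cyclic shift by one position using N - 1 swaps that all involve the
    first node, so they run strictly sequentially: O(N * M) time."""
    for i in range(1, len(states)):
        states[0], states[i] = states[i], states[0]
    return states

# shift_sequential(list("abcd")) -> ['d', 'a', 'b', 'c']
```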

Linear time is too long, so the states can be exchanged in pairs: the first with the second, the third with the fourth, and so on, all at the same time. After each such round of swaps, half of the remaining states end up in the right position. This requires O(lg N) rounds and O(M lg N) time.
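
A sketch of the pairwise scheme, again simulated on a list (the swaps within each round touch disjoint pairs of nodes, so on a real cluster a round costs M milliseconds no matter how many pairs it contains; the function name is invented):

```python
def shift_in_log_rounds(states):
    """Cyclic shift by one using ceil(lg N) rounds of disjoint swaps."""
    # Positions whose states are not yet in place: the state at pos[i]
    # has to travel to pos[i + 1], and the state at pos[-1] to pos[0].
    pos = list(range(len(states)))
    rounds = 0
    while len(pos) > 1:
        nxt = []
        for i in range(0, len(pos) - 1, 2):           # disjoint pairs -> one round
            a, b = pos[i], pos[i + 1]
            states[a], states[b] = states[b], states[a]
            nxt.append(a)            # the state now at `a` still has to move on
        if len(pos) % 2 == 1:
            nxt.append(pos[-1])      # an unpaired position waits for the next round
        pos, rounds = nxt, rounds + 1
    return rounds                    # equals ceil(lg N) for N > 1

# states = list("abcde"); shift_in_log_rounds(states)  # -> 3 rounds
# states is now ['e', 'a', 'b', 'c', 'd']
```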

However, the shift can be made even more efficient and done in a constant number of rounds. In the first round, swap the state of the first node with the last one, the second with the next-to-last, and so on. After this round, the state of the last node is in the correct position. In the second round, swap the state of the second node with the last one, the third with the next-to-last, and so on. After this round, the states of all nodes are in the right positions. In total there are only two rounds of swaps, so the whole shift takes 2M milliseconds regardless of N, i.e. O(1) rounds.
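
And a simulation of the constant-round scheme: two reversals, each of which is a round of disjoint swaps that would run concurrently on the cluster (the function name is invented for the example):

```python
def shift_in_two_rounds(states):
    """Cyclic shift by one as a composition of two reversals, i.e. two rounds
    of disjoint swaps: 2 * M milliseconds regardless of N."""
    n = len(states)
    # Round 1: reverse the whole ring: swap (1st, Nth), (2nd, (N-1)th), ...
    for i in range(n // 2):
        states[i], states[n - 1 - i] = states[n - 1 - i], states[i]
    # Round 2: reverse everything except the first node: swap (2nd, Nth), ...
    for i in range(1, (n + 1) // 2):
        states[i], states[n - i] = states[n - i], states[i]
    return states

# shift_in_two_rounds(list("abcde")) -> ['e', 'a', 'b', 'c', 'd']
```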

Such a solution will not surprise a mathematician who remembers that a rotation is a composition of two axial symmetries. By the way, it generalizes trivially to a shift not by one but by K < N positions. (Write in the comments how exactly.)

Did you like the puzzles? Do you know other solutions? Share them in the comments.

Source: habr.com
