19 Hydra heads. A big overview of the program

On July 11-12, the Hydra conference, dedicated to the development of parallel and distributed systems, will be held in St. Petersburg. The thing about Hydra is that it brings together, in one big program at the intersection of science and practice, top scientists (who are usually found only at international academic conferences) and well-known practicing engineers.

Hydra is one of our most important conferences in recent years. It was preceded by very serious preparation and careful selection of speakers and talks. Last week, an interview about it with Alexey Fedorov (23derevo), director of the JUG.ru Group, was published on Habr.

We have already written about three key participants, founders of the theory of distributed systems: Leslie Lamport, Maurice Herlihy, and Michael Scott. It's time to talk about the whole program in more detail!


Motivation

If you program, you deal with multithreading and distributed computing one way or another. Specialists in the relevant fields work with them directly, but for everyone else distribution shows up implicitly: in any multi-core computer or distributed service, something is performing computations in parallel.

There are many conferences that cover particular aspects of applied programming. At the other end of the spectrum are specialized scientific schools, lecture courses unpacking large amounts of complex theory. For example, the SPTDC school runs in St. Petersburg in parallel with Hydra. At the Hydra conference, we tried to bring together harsh practice, science, and everything at their intersection.

Think about it: we live in an amazing time when you can meet, in person, the founders of the field of science and engineering we work in. Physicists will never meet Newton or Einstein - that ship has sailed. But the people who created the foundations of the theory of distributed systems, invented popular programming languages, and first turned all of this into working prototypes still live next to us. They have not quit halfway: they are working on real-world problems at world-renowned universities and companies today, and they are the greatest sources of knowledge and experience we have.

On the one hand, the opportunity to meet them usually remains purely theoretical: few of us can keep an eye on public events at, say, the University of Rochester and then dash to the USA and back for a lecture by Michael Scott. Visiting all of Hydra's speakers individually would cost a small fortune, not to mention the abyss of time spent (although it does sound like an interesting quest).

On the other hand, there are plenty of top engineers working on real problems of distributed systems right now, and they definitely have something to tell. But here's the catch: they are working, and their time is precious. Yes, if you are an employee of Microsoft, Google, or JetBrains, the probability of running into one of the well-known speakers at an internal event rises dramatically, but in general - no, it does not happen every day.

So the Hydra conference does an important job that most of us cannot do on our own: it gathers, in one place and at one time, people whose ideas - or simply a conversation with whom - can change your life. I admit that not everyone needs distributed systems or complicated fundamental things; you can write CRUD apps in PHP for the rest of your life and be perfectly happy. But if you do need them, this is your chance.

Quite a lot of time has passed since the first announcement of the Hydra conference on Habr. A lot of work has been done since then, and now we have a list of almost all the talks. No sluggish single-threaded algorithms, only pure distributed hardcore! Enough general words - let's see what we have on hand.

Keynotes

Keynotes open and close each day of the conference. The opening keynote usually sets the general spirit and direction of the conference; the closing keynote draws a line and explains how to live on with the knowledge and skills acquired during the conference. The beginning and the end are what is remembered best, so they carry special weight.

Cliff Click - The H2O distributed K/V algorithm

Cliff is a legend in the Java world. In the late 90s, for his PhD thesis, he wrote a work called "Combining Analyses, Combining Optimizations", which some time later became the basis of the HotSpot Server Compiler in the JVM. Two years after that, he was already working at Sun Microsystems on the JVM and showing the whole world that JIT compilation has a right to exist. The whole story of Java being one of the fastest modern runtimes, with the smartest and fastest optimizations, started with Cliff Click. In the very beginning, the belief was that whatever a static compiler could handle, there was no point even trying to JIT. Thanks to the work of Cliff and his team, all new languages began to be designed with JIT compilation in mind by default. Of course, this was not the work of one person, but Cliff played a very important role in it.

In the opening keynote, Cliff will talk about another of his undertakings: H2O, an in-memory platform for distributed and scalable machine learning for industrial applications. More precisely, about the distributed key-value store inside it. It is a very fast store with many interesting properties (the exact list is in the talk description) that allow similar solutions to be used in the mathematics of big data streaming.

Another talk Cliff will give is "The Azul Hardware Transactional Memory experience". It covers another part of his biography: ten years of work at Azul, where he updated and improved much of the Azul hardware and technology stack: JIT compilers, the runtime, the thread model, error handling, stack manipulation, hardware interrupts, class loading, and so on and so forth - you get the idea.

The most interesting part began when they built hardware for big business: a supercomputer for running Java. It was quite an innovative machine, tailored specifically for Java, which has special requirements - read barriers for low-pause garbage collection, arrays with bounds checking, virtual calls... One of the coolest technologies was hardware transactional memory. The entire L1 cache of any of the 864 cores could take part in a transactional write, which is especially important for working with locks in Java (synchronized blocks can run in parallel as long as there is no real memory conflict). But the beautiful idea crashed against harsh reality, and in this talk Cliff will explain why HTM and STM are not a great fit for the practical needs of multithreaded computing.
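To make the "parallel synchronized blocks" idea concrete, here is a minimal plain-Java sketch (no Azul-specific APIs, and the class and field names are mine): both threads contend for the same lock, but their write sets never overlap, which is exactly the case where hardware lock elision could, in principle, let the critical sections commit in parallel instead of serializing.

```java
// A minimal sketch of the situation hardware lock elision targets: both threads
// take the SAME lock, but they touch DIFFERENT memory, so their critical
// sections have no real conflict and could commit as parallel hardware
// transactions instead of taking turns.
public class ElisionFriendlyCounters {
    private final Object lock = new Object();
    private long hits;    // updated only by thread A
    private long misses;  // updated only by thread B

    void recordHit()  { synchronized (lock) { hits++; } }
    void recordMiss() { synchronized (lock) { misses++; } }

    public static void main(String[] args) throws InterruptedException {
        ElisionFriendlyCounters c = new ElisionFriendlyCounters();
        Thread a = new Thread(() -> { for (int i = 0; i < 1_000_000; i++) c.recordHit(); });
        Thread b = new Thread(() -> { for (int i = 0; i < 1_000_000; i++) c.recordMiss(); });
        a.start(); b.start();
        a.join();  b.join();
        System.out.println(c.hits + " hits, " + c.misses + " misses");
    }
}
```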

Michael Scott - Dual data structures

Michael Scott is a Professor of Computer Science at the University of Rochester, where fate has kept him for 34 years now, and at his home University of Wisconsin–Madison he spent five years as dean. He does research in parallel and distributed programming and programming language design, and teaches these subjects to students.

The whole world knows Michael thanks to the textbook "Programming Language Pragmatics", whose latest edition came out relatively recently, in 2015. His paper "Algorithms for scalable synchronization on shared-memory multiprocessors" received the Dijkstra Prize as one of the best-known works in distributed computing and is freely available in the University of Rochester online library. You may also know him as the author of that very Michael-Scott algorithm from "Simple, Fast, and Practical Non-Blocking and Blocking Concurrent Queue Algorithms".

As for the Java world, there is a special case: together with Doug Lea, he developed the non-blocking algorithms and synchronous queues that the Java libraries run on. This is exactly what the "Dual data structures" keynote will be about: the introduction of these structures in Java SE 6 improved the performance of java.util.concurrent.ThreadPoolExecutor by a factor of 10. If you want to know in advance what these "dual data structures" are, there is a related paper.
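If you want to poke at these structures before the keynote, both live in the JDK: ConcurrentLinkedQueue is built on the Michael-Scott queue, and SynchronousQueue has been a dual data structure since Java SE 6 (it is what Executors.newCachedThreadPool() hands tasks through). A minimal sketch:

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.SynchronousQueue;

public class ScottStructuresDemo {
    public static void main(String[] args) throws InterruptedException {
        // ConcurrentLinkedQueue is based on the Michael-Scott non-blocking queue.
        Queue<String> mpmc = new ConcurrentLinkedQueue<>();
        mpmc.offer("task-1");
        System.out.println(mpmc.poll());

        // SynchronousQueue (reimplemented as a dual data structure in Java SE 6)
        // has no capacity: a take() parks as a "reservation" until a matching
        // put() arrives, so producer and consumer meet hand to hand.
        SynchronousQueue<String> handoff = new SynchronousQueue<>();
        Thread consumer = new Thread(() -> {
            try {
                System.out.println("got " + handoff.take());
            } catch (InterruptedException ignored) { }
        });
        consumer.start();
        handoff.put("task-2");   // blocks until the consumer's take() matches
        consumer.join();
    }
}
```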

Maurice Herlihy - Blockchains and the future of distributed computing

Maurice Herlihy is a winner of two Dijkstra Prizes: the first for "Wait-Free Synchronization" (Brown University), the second, more recent, for "Transactional Memory: Architectural Support for Lock-Free Data Structures" (Virginia Tech). The Dijkstra Prize is awarded for works whose significance and influence have been visible for at least ten years, so Maurice is clearly one of the best-known specialists in the field. He is currently a professor at Brown University, and the list of his accomplishments runs a paragraph long.

In the closing keynote, Maurice will talk about the theory and practice of blockchain distributed systems from the point of view of classical distributed computing, and how that perspective simplifies many related problems. This talk is strictly on the topic of the conference: not about the mining hype at all, but about how our knowledge can be applied amazingly effectively and appropriately to a wide variety of tasks.

In July 2017, Maurice already visited Russia for the SPTDC school and took part in a JUG.ru meetup; the recording is available on YouTube:

Main program

Next comes a short review of the talks included in the program. Some talks are described in detail, others more briefly. The long descriptions went mainly to the English-language talks, which need links to scientific papers, Wikipedia terms, and so on. The complete list is available on the conference website, and it will be updated and supplemented.

Leslie Lamport - Q&A

Leslie Lamport is a pioneering author in distributed computing. LaTeX stands for "Lamport TeX". It was he who, back in 1979, introduced the concept of sequential consistency, and his article "How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs" received the Dijkstra Prize.
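If the term is new to you, here is the classic two-thread litmus test for sequential consistency as a small Java sketch (my illustration, not Lamport's code): when every access belongs to one global interleaving, at least one thread must see the other's write, so the outcome (0, 0) is impossible. Declaring the fields volatile is Java's closest approximation; drop volatile and the memory model does allow (0, 0).

```java
// Sequential consistency in one classic example: thread 1 does x=1; r1=y,
// thread 2 does y=1; r2=x. Under sequential consistency (volatile fields in
// Java), r1 == 0 && r2 == 0 can never happen.
public class SeqConsistencyDemo {
    static volatile int x = 0, y = 0;
    static int r1, r2;

    public static void main(String[] args) throws InterruptedException {
        for (int run = 0; run < 10_000; run++) {
            x = 0; y = 0;
            Thread t1 = new Thread(() -> { x = 1; r1 = y; });
            Thread t2 = new Thread(() -> { y = 1; r2 = x; });
            t1.start(); t2.start();
            t1.join();  t2.join();
            if (r1 == 0 && r2 == 0) {
                System.out.println("not sequentially consistent on run " + run);
                return;
            }
        }
        System.out.println("(0, 0) was never observed, as sequential consistency demands");
    }
}
```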

This is the most unusual part of the program format-wise, because it is not even a talk but a question-and-answer session. When a significant part of the audience is already familiar (or can become familiar) with the works built on Lamport's theory and with his own articles and talks, it is more valuable to spend the available time on direct communication.

The idea is simple: you watch two talks on YouTube, "Programming Should Be More Than Coding" and "If You're Not Writing a Program, Don't Use a Programming Language", prepare at least one question, and Leslie answers.

We have already turned the first of these two videos into a Habr article. If you don't have an hour to watch the video, you can read the whole thing quickly in text form.

Note: there are many more videos with Leslie Lamport on YouTube. For example, there is an excellent TLA+ course. An offline version of the entire course is available on the author's home page, and he uploaded it to YouTube for more convenient viewing on mobile devices.

Martin Kleppmann - Syncing data across user devices for distributed collaboration

Martin Kleppmann is a researcher at the University of Cambridge working on CRDTs and formal verification of algorithms. Martin's book "Designing Data-Intensive Applications", published in 2017, proved very successful and hit the bestseller lists in the field of data storage and processing. Kevin Scott, CTO at Microsoft, once said: "This book should be a must for design engineers. This is a rare resource that bridges theory and practice to help developers design and implement data infrastructure and systems smarter." Something similar was said by Jay Kreps, the creator of Kafka and CTO of Confluent.

Before moving into academic research, Martin worked in the industry and co-founded two successful startups:

  • Rapportive, dedicated to displaying the social profile of your email contacts, which LinkedIn bought in 2012;
  • Go Test It, a multi-browser automated website checker that RedGate bought in 2009.

In general, although Martin is less well known than our keynote speakers, he has already made a real contribution both to distributed computing research and to the industry.

In this talk, Martin will cover a topic closer to his academic research. In Google Docs and similar software for collaborative document editing, "co-editing" means a replication task: each user has their own replica of the shared document, which they modify, and all changes are sent over the network to the other participants. Offline edits make a document temporarily inconsistent with other participants' copies, and re-synchronization requires conflict handling. This is exactly what Conflict-free Replicated Data Types (CRDTs) are for - a fairly new idea whose essence was formulated only in 2011. The talk discusses what has happened in the CRDT world since then, the latest developments, the approach to building local-first applications in general, and the use of the open-source library Automerge in particular.
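Automerge itself is a JavaScript library, so as a taste of the idea here is a toy state-based CRDT in Java, a grow-only counter (my sketch, not Automerge code): each replica increments only its own slot, merging is an element-wise maximum, and replicas converge no matter how many times or in what order they sync.

```java
import java.util.HashMap;
import java.util.Map;

// A toy state-based CRDT: a grow-only counter. Every replica updates only its
// own entry, and merging two states takes the element-wise max, so merges are
// commutative, associative, and idempotent - replicas converge without conflicts.
public class GCounter {
    private final String replicaId;
    private final Map<String, Long> counts = new HashMap<>();

    GCounter(String replicaId) { this.replicaId = replicaId; }

    void increment() { counts.merge(replicaId, 1L, Long::sum); }

    long value() { return counts.values().stream().mapToLong(Long::longValue).sum(); }

    void merge(GCounter other) {
        other.counts.forEach((id, n) -> counts.merge(id, n, Math::max));
    }

    public static void main(String[] args) {
        GCounter laptop = new GCounter("laptop");
        GCounter phone  = new GCounter("phone");
        laptop.increment(); laptop.increment();    // offline edits on one device
        phone.increment();                         // offline edits on another
        laptop.merge(phone); phone.merge(laptop);  // sync in both directions
        System.out.println(laptop.value() + " == " + phone.value());  // 3 == 3
    }
}
```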

Next week we will publish a big interview with Martin on Habr - it will be interesting.

Pedro Ramalhete - Wait-free data structures and wait-free transactions

Pedro works at Cisco and has been developing parallel algorithms for the last ten years or so, including synchronization mechanisms, lock-free and wait-free data structures, and everything else you can think of on the topic. His current research and engineering interests focus on universal constructions, software transactional memory, persistent memory, and similar technologies for building correct, scalable, and fault-tolerant applications. He is also the author of Concurrency Freaks, a blog widely known in narrow circles.

Most multithreaded applications today run on concurrent data structures, from message queues between actors to indexed data structures in key-value stores. They have been working successfully in the Java JDK for many years, and they are slowly making their way into C++.

The simplest way to implement a concurrent data structure is to take a sequential (single-threaded) implementation and protect its methods with a mutex. Any junior can do that, but it has obvious scaling and performance problems. Lock-free and wait-free data structures, on the other hand, not only handle failures better but also have a better performance profile - yet developing them requires deep expertise and adaptation to the specific use case. One wrong line of code is enough to break everything.
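For contrast, here is roughly what the two ends of that spectrum look like in Java (a sketch with made-up names, not Pedro's code): the coarse-locked wrapper that any junior can write, and the JDK's lock-free queue that took expert work to get right.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.ConcurrentLinkedQueue;

// The "any junior can do it" approach: take a sequential structure and guard
// every method with the same mutex. Correct, but every operation serializes on
// one lock, so it does not scale with the number of threads.
class CoarseLockedQueue<T> {
    private final Deque<T> items = new ArrayDeque<>();

    public synchronized void enqueue(T item) { items.addLast(item); }

    public synchronized T dequeue() { return items.pollFirst(); }
}

class Demo {
    public static void main(String[] args) {
        CoarseLockedQueue<Integer> locked = new CoarseLockedQueue<>();
        locked.enqueue(1);
        System.out.println(locked.dequeue());

        // The lock-free alternative from the JDK needs no lock at all, but writing
        // something like it by hand is exactly the expert-level, one-wrong-line-
        // breaks-everything work the talk is about.
        ConcurrentLinkedQueue<Integer> lockFree = new ConcurrentLinkedQueue<>();
        lockFree.offer(1);
        System.out.println(lockFree.poll());
    }
}
```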

How can we make sure that even a non-expert can design and implement such data structures? It is known that any sequential algorithm can be made thread-safe with either a universal construction or transactional memory. If nothing else, they lower the entry threshold for solving the problem. However, both approaches tend to lead to inefficient implementations. Pedro will talk about how his team managed to make these constructions efficient and how they can be used for your own algorithms.

Heidi Howard - Liberating distributed consensus

Heidi Howard is, like Martin, a distributed systems researcher at the University of Cambridge. Her specializations are consistency, fault tolerance, performance, and distributed consensus. She is best known for her generalization of the Paxos algorithm, called Flexible Paxos.

Recall that Paxos is a family of protocols for reaching consensus in a network of unreliable computers, based on the work of Leslie Lamport. So some of our speakers are working on problems originally posed by our other speakers - which is wonderful.

The ability to reach consensus among multiple hosts - for addressing, leader election, locking, or coordination - is a fundamental problem in today's distributed systems. Paxos is now the main way of solving consensus problems, and a lot of research goes on around it to extend and optimize the algorithm for various practical needs.

In this talk, we will revisit the theoretical foundations of Paxos, relax the original requirements, and generalize the algorithm. We will see that Paxos is, in essence, just one option in a huge range of approaches to consensus, and that other points on the spectrum are also quite useful for building good distributed systems.
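The core observation behind Flexible Paxos fits in a few lines. The sketch below (simple counting quorums only, my code rather than Heidi's) checks the one condition that actually matters: every phase-1 quorum must intersect every phase-2 quorum, i.e. q1 + q2 > n; majorities in both phases are just one point on that spectrum.

```java
// A minimal sketch of the Flexible Paxos observation: classic Paxos uses
// majorities for both phases, but correctness only needs phase-1 quorums to
// overlap phase-2 quorums. With counting quorums over n nodes, that is q1 + q2 > n.
public class FlexibleQuorums {
    static boolean valid(int n, int q1, int q2) {
        return q1 + q2 > n;   // phase-1 and phase-2 quorums must intersect
    }

    public static void main(String[] args) {
        int n = 5;
        System.out.println("classic majorities (3,3): " + valid(n, 3, 3)); // true
        // Cheap steady state: a tiny phase-2 quorum, paid for by a large
        // phase-1 quorum on (rare) leader changes.
        System.out.println("flexible (4,2):           " + valid(n, 4, 2)); // true
        System.out.println("broken (2,2):             " + valid(n, 2, 2)); // false
    }
}
```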

Alex Petrov - Reduce your storage costs with Transient Replication and Cheap Quorums

Alex is a database and storage specialist and, more importantly for us, a committer on Apache Cassandra. He is currently working with O'Reilly on the book Database Internals.

For eventually consistent systems, after a node failure or a network partition, the following dilemma has to be resolved: either keep serving requests, sacrificing consistency, or refuse to serve them and sacrifice availability. In such a system, quorums - overlapping subsets of nodes that guarantee at least one node holds the most recent value - can be a good middle-ground solution: the system can survive failures and the loss of connectivity to some nodes while still responding with the most recent values.
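As a back-of-the-envelope illustration (a toy sketch, not Cassandra code): with n replicas, writes acknowledged by w of them and reads asking r of them, the condition r + w > n guarantees that every read quorum overlaps every write quorum in at least one replica, so returning the highest-timestamped answer is enough.

```java
import java.util.Arrays;
import java.util.Comparator;

// Why overlapping quorums return the latest value: the write reaches only w
// replicas, the read asks a different set of r replicas, but since r + w > n
// the two sets share at least one replica, and the freshest timestamp wins.
public class QuorumSketch {
    record Versioned(long timestamp, String value) { }

    static final int N = 5, W = 3, R = 3;             // r + w > n holds
    static final Versioned[] replicas = new Versioned[N];

    static void write(long ts, String value) {
        for (int i = 0; i < W; i++) replicas[i] = new Versioned(ts, value);
    }

    static String read() {
        return Arrays.stream(replicas, N - R, N)       // ask r replicas from the other end
                .filter(v -> v != null)
                .max(Comparator.comparingLong(Versioned::timestamp))
                .map(Versioned::value)
                .orElse(null);
    }

    public static void main(String[] args) {
        write(1, "old");
        write(2, "new");
        System.out.println(read());                     // prints "new" thanks to the overlap
    }
}
```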

However, everything has its price. A quorum replication scheme means higher storage costs: you have to store redundant data on several nodes at once to make sure enough copies are available when something goes wrong. It turns out, though, that you don't have to store all the data on all replicas: you can reduce the storage load by keeping data on only a subset of the nodes and using special nodes (transient replicas) for failure-handling scenarios.

In the talk, we will look at witness replicas, the replication scheme used in Spanner and Megastore, and the implementation of this idea in Apache Cassandra under the names Transient Replication and Cheap Quorums.

Dmitry Vyukov - Goroutines exposed

Dmitry is a developer at Google working on dynamic testing for C/C++ and Go - Address/Memory/ThreadSanitizer and similar tools for the Linux kernel. He has contributed a scalable goroutine scheduler, a network poller, and a concurrent garbage collector to Go. He is an expert in multithreading, the author of a dozen new non-blocking algorithms, and an Intel Black Belt holder.

Now a little about the talk itself. The Go language has native support for multithreading in the form of goroutines (lightweight threads) and channels (FIFO queues). Thanks to these mechanisms, writing modern multithreaded applications is easy and pleasant, and it looks like magic. As we understand, there is no magic here. In this talk, Dmitry will dig into the intricacies of the Go scheduler and reveal the secrets behind this "magic". First, he will give an overview of the scheduler's main components and how it works. Then we will take a closer look at individual aspects, such as the parking/unparking strategy and the handling of blocking system calls. Finally, Dmitry will talk a little about possible improvements to the scheduler.

Dmitry Bugaichenko - Accelerating Distributed Graph Analysis with Probabilistic Sketches and More

Dmitry worked in outsourcing for almost 9 years without losing touch with the university and the scientific community. Big data analysis at Odnoklassniki became his unique chance to combine theoretical training and a scientific foundation with the development of real, in-demand products.

Distributed graph analysis has been and remains a difficult task: when you need information about the connections of a neighboring vertex, the data often has to be shipped between machines, which increases execution time and loads the network infrastructure. In this talk, we will see how processing can be significantly accelerated using probabilistic data structures, or facts such as the symmetry of the friendship graph in a social network - all illustrated with Apache Spark code examples.
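The talk's own examples are in Apache Spark, but the underlying trade-off is easy to show in a few lines of plain Java. Below is one of the simplest probabilistic structures, a Bloom filter (my illustration, not Dmitry's code): a vertex's neighbor set squeezed into a few bits per element can be shipped between machines cheaply, at the cost of occasional false positives and no exact answers.

```java
import java.util.BitSet;

// A toy Bloom filter for approximate "is this user a neighbor?" checks.
// Misses are exact ("definitely absent"), hits are probabilistic.
public class NeighborBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashes;

    NeighborBloomFilter(int size, int hashes) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashes = hashes;
    }

    private int index(long userId, int i) {
        long h = userId * 0x9E3779B97F4A7C15L + i * 0xC2B2AE3D27D4EB4FL;
        h ^= (h >>> 31);
        return (int) Math.floorMod(h, (long) size);
    }

    void add(long userId) {
        for (int i = 0; i < hashes; i++) bits.set(index(userId, i));
    }

    boolean mightContain(long userId) {
        for (int i = 0; i < hashes; i++) {
            if (!bits.get(index(userId, i))) return false;   // definitely absent
        }
        return true;                                          // probably present
    }

    public static void main(String[] args) {
        NeighborBloomFilter friendsOf42 = new NeighborBloomFilter(1 << 16, 3);
        friendsOf42.add(7L);
        friendsOf42.add(1001L);
        System.out.println(friendsOf42.mightContain(7L));     // true
        System.out.println(friendsOf42.mightContain(123L));   // almost certainly false
    }
}
```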

Denis Rystsov - Reduce your storage costs with Transient Replication and Cheap Quorums

Denis is a Cosmos DB developer and an expert in verifying consistency models, consensus algorithms, and distributed transactions. He now works at Microsoft; before that, he worked on distributed systems at Amazon and Yandex.

In this talk, we will get acquainted with the distributed transaction protocols invented over the past few years that can be implemented on the client side, on top of any data store that supports conditional updates (compare-and-set). The bottom line is that life does not end with two-phase commit: transactions can be layered on top of any database at the application level, but different protocols (2PC, Percolator, RAMP) have different trade-offs, and none of them come for free.
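To make "conditional update" concrete, here is a toy Java sketch (ConcurrentHashMap.replace stands in for the store's compare-and-set call, and the account name is made up): the client turns read-modify-write into a retry loop, and protocols like Percolator and RAMP then build multi-key transactions on top of this same primitive - none of that extra machinery is shown here.

```java
import java.util.concurrent.ConcurrentHashMap;

// The primitive client-side transaction protocols build on: the store offers
// only a conditional update (compare-and-set), and the client retries until
// its read-modify-write commits without interference.
public class CasUpdateSketch {
    // replace(key, expected, next) plays the role of a data store's
    // conditional update (an etag/version check in a real system).
    static final ConcurrentHashMap<String, Long> store = new ConcurrentHashMap<>();

    static void deposit(String account, long amount) {
        while (true) {
            Long current = store.get(account);
            Long next = current + amount;
            if (store.replace(account, current, next)) return;  // nobody changed it: commit
            // someone else won the race: reread and retry
        }
    }

    public static void main(String[] args) throws InterruptedException {
        store.put("acct-1", 0L);
        Thread t1 = new Thread(() -> { for (int i = 0; i < 10_000; i++) deposit("acct-1", 1); });
        Thread t2 = new Thread(() -> { for (int i = 0; i < 10_000; i++) deposit("acct-1", 1); });
        t1.start(); t2.start();
        t1.join();  t2.join();
        System.out.println(store.get("acct-1"));   // always 20000
    }
}
```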

Alexey Zinoviev - Not all ML algorithms end up in a distributed paradise

Alexey (zaleslaw) is our long-time speaker and a member of program committees at other conferences. He is a practicing trainer at EPAM Systems and has been friends with Hadoop, Spark, and other big data tools since 2012.

In this talk, Alexey will discuss the problems of adapting classical machine learning algorithms for distributed execution, based on his experience with Apache Spark ML, Apache Mahout, and Apache Flink ML, and on his experience creating Apache Ignite ML. He will also talk about how distributed ML algorithms are implemented in these frameworks.

And to conclude, two talks from Yandex about Yandex Database.

Vladislav Kuznetsov - Yandex Database: how we provide fault tolerance

Vladislav is a developer at Yandex in the distributed platform group. Yandex Database is a horizontally scalable, geo-distributed, fault-tolerant DBMS that can survive failures of disks, servers, racks, and data centers without compromising consistency. Fault tolerance is provided by a proprietary distributed consensus algorithm together with a number of engineering solutions, which are discussed in detail in the talk. The talk may be of interest both to DBMS developers and to developers of applied solutions based on a DBMS.

Semyon Checherinda - Distributed transactions in YDB

Semyon is a developer in the distributed platform group at Yandex, working on enabling multi-tenant use of a YDB installation.

Yandex Database is designed for OLTP workloads and meets the ACID requirements for a transactional system. In this talk, we will look at the transaction scheduling algorithm at the core of the YDB transactional system: which entities participate in transactions, who assigns a global order to transactions, and how atomicity, reliability, and a strict isolation level are achieved. Using a common task as an example, we will look at implementing transactions with two-phase commit and with deterministic transactions, and discuss their differences.
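For readers who have not met two-phase commit before, here is a toy coordinator in Java (a conceptual sketch, in no way YDB's implementation): phase one collects prepare votes from every participant, and phase two delivers a single commit-or-abort decision to all of them.

```java
import java.util.List;

// A toy two-phase commit coordinator: ask everyone to prepare, then commit
// only if all voted yes, otherwise abort everywhere. Real systems also have
// to deal with durable logging, timeouts, and coordinator failure.
public class TwoPhaseCommitSketch {
    interface Participant {
        boolean prepare(String txId);   // vote: can you make this change durable?
        void commit(String txId);
        void abort(String txId);
    }

    static boolean run(String txId, List<Participant> participants) {
        boolean allPrepared = true;
        for (Participant p : participants) {          // phase 1: collect votes
            allPrepared &= p.prepare(txId);
        }
        for (Participant p : participants) {          // phase 2: one decision for everyone
            if (allPrepared) p.commit(txId); else p.abort(txId);
        }
        return allPrepared;
    }

    public static void main(String[] args) {
        Participant ok = new Participant() {
            public boolean prepare(String txId) { return true; }
            public void commit(String txId) { System.out.println("commit " + txId); }
            public void abort(String txId)  { System.out.println("abort "  + txId); }
        };
        System.out.println("tx-1 committed: " + run("tx-1", List.of(ok, ok)));
    }
}
```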

What's next?

The conference program is still being filled with new talks. In particular, we expect talks from Nikita Koval (ndkoval) of JetBrains and Oleg Anastasiev (m0nstermind) of Odnoklassniki. Nikita works on coroutine algorithms in the Kotlin team, and Oleg builds architecture and solutions for high-load systems on the Odnoklassniki platform. In addition, there is one more tentatively open slot, for which the program committee is choosing candidates right now.

The Hydra conference will take place on July 11-12 in St. Petersburg. Tickets can be bought on the official website. Note the Online tickets, in case you cannot make it to St. Petersburg in person on those days.

See you at Hydra!

Source: habr.com
