“It’s easier to answer than to remain silent” - a great interview with the father of transactional memory, Maurice Herlihy

Maurice Herlihy is the holder of two Dijkstra Prizes: the first for the paper "Wait-Free Synchronization" (Brown University), the second, more recent, for "Transactional Memory: Architectural Support for Lock-Free Data Structures" (Virginia Tech). The Dijkstra Prize is awarded for work whose significance and influence have been felt for at least ten years, so Maurice is obviously one of the best-known specialists in the field. He is currently a professor at Brown University, with a list of accomplishments a full paragraph long. These days he is researching blockchain in the context of classical distributed computing.

Maurice has visited Russia before, for SPTCC (video recording available), and gave an excellent talk at a meetup of the JUG.ru Java developer community in St. Petersburg (video recording available).

This habrapost is a great interview with Maurice Herlihy. It discusses the following topics:

  • Interaction between academia and industry;
  • Foundation for blockchain research;
  • Where do breakthrough ideas come from? Influence of popularity;
  • PhD under the guidance of Barbara Liskov;
  • The world is waiting for multi-core;
  • New world, new problems. NVM, NUMA and architecture hacking;
  • Compilers vs CPUs, RISC vs CISC, shared memory vs message passing;
  • The art of writing fragile multi-threaded code;
  • How to teach students how to write complex multi-threaded code;
  • New edition of the book "The Art of Multiprocessor Programming";
  • How was transactional memory invented?   
  • Why it is worth doing research in the field of distributed computing;
  • Has the development of algorithms stalled, and where do we go from here;
  • Work at Brown University;
  • The difference between university and corporate research;
  • Hydra and SPTDC.

Interviews are conducted by:

Vitaly Aksenov is currently a postdoc at IST Austria and a member of the Computer Technologies Department at ITMO University. His research focuses on the theory and practice of concurrent data structures. Before joining IST, he received his PhD from Paris Diderot University and ITMO University under Prof. Petr Kuznetsov.

Alexey Fedorov is a producer at JUG Ru Group, a Russian company that organizes conferences for developers. Alexey has helped prepare more than 50 conferences, and his resume spans everything from development engineer at Oracle (JCK, Java Platform Group) to developer at Odnoklassniki.

Vladimir Sitnikov is an engineer at Netcracker. For ten years he has worked on the performance and scalability of NetCracker OS, software used by telecom operators to automate network and network-equipment management. He is interested in Java and Oracle Database performance issues, and is the author of over a dozen performance improvements in the official PostgreSQL JDBC driver.

Interaction between academia and industry

Alexey: Maurice, you have been working in academia for a very long time, so the first question is about the interaction between academia and industry. Could you tell us how that interaction has changed lately? What was it like 20-30 years ago, and what is happening now?

Maurice: I have always tried to work closely with commercial companies, because they have interesting problems. As a rule, they are not very interested in publishing their results or in explaining their problems in detail to the world community; they are only interested in solving them. I worked for some of these companies for a while. I spent five years working full-time in a research lab at Digital Equipment Corporation, which used to be a major computer company. I worked one day a week at Sun, at Microsoft, at Oracle, and a little at Facebook. Now I am about to go on sabbatical (a professor at an American university can take such a leave for a year roughly once every six years) and work at Algorand, a cryptocurrency company in Boston. Working closely with companies has always been a pleasure, because that is how you learn about new and interesting things. You can be the first or second person to publish an article on a chosen topic, instead of gradually improving solutions to problems everyone else is already working on.

Alexey: Can you tell us more about how this happens?

Maurice: Of course. You know, when I was at Digital Equipment Corporation, Eliot Moss and I invented transactional memory. It was a very fruitful period, when everyone was becoming interested in information technology, concurrency included, although multi-core systems did not yet exist. In my Sun and Oracle days I did a lot of work on parallel data structures. At Facebook I was involved in their blockchain project, which I can't talk about but which I hope will go public soon. Next year, at Algorand, I will be working in a research group studying smart contracts.

Alexey: In the last few years, blockchain has become a very popular topic. Will it help your research? Perhaps it will make it easier to obtain grants or give access to the resources of companies operating in the industry?

Maurice: I have already received a small grant from the Ethereum Foundation. The popularity of blockchain is very useful for inspiring students to work in the field. They are very interested and happy to get involved, but sometimes they don't realize that research which sounds tempting from the outside turns out to involve genuinely hard work. Still, I am very happy to use all this mystique around blockchain; it helps attract students.

But that is not all. I am on the advisory board of several blockchain startups. Some of them may succeed, some of them may not, but it's always very interesting to see their ideas, study them and advise people. The most exciting thing is when you warn people not to do something. A lot of things seem like a good idea at first, but are they really?

Foundation for blockchain research

Vitaly: Some people think that blockchain and its algorithms are the future. And other people say it's just another bubble. Can you share your opinion on this matter?

Maurice: A lot of what is going on in the blockchain world doesn't work properly, some of it is outright scams, and a lot of things are overrated. However, I think there is a solid scientific basis for this research. The fact that the blockchain world is full of ideological divisions shows the level of excitement and dedication. On the other hand, that is not particularly good for scientific research. If you publish an article today about the shortcomings of a particular algorithm, the reaction you get is not always fully scientific; often people simply vent their emotions. I think the hype in this area may look attractive to some, but in the end there are real scientific and engineering problems that still have to be addressed. There is a lot of Computer Science here.

Vitaly: So you are trying to lay the foundation for blockchain research, right?

Maurice: I am trying to lay the foundation for a solid, scientifically and mathematically sound discipline. And part of the problem is that sometimes you have to push back against overly harsh positions of other people, or to ignore them. Sometimes people ask why I work in a field that only terrorists and drug dealers are interested in. Such a reaction is as meaningless as the behavior of followers blindly repeating your words. I think the truth is somewhere in the middle. Blockchain has yet to have a profound impact on society and the global economy. But it probably won't happen with today's technology. The technology will evolve, and whatever is called "blockchain" in the future will become very important. It may not even look like today's blockchains; that is an open question.

Even as people invent new technologies, they will keep calling them blockchain. I mean, just as today's Fortran has nothing to do with the Fortran of the 1960s, yet everyone keeps calling it Fortran. The same goes for UNIX. What is called "blockchain" has yet to make its revolution. But I doubt this new blockchain will look like what everyone loves to use today.

Where do breakthrough ideas come from? Influence of popularity

Alexey: Has the popularity of blockchain led to new results from a scientific point of view? More interaction, more students, more companies in the area. Are there already any results of this growth in popularity?

Maurice: I became interested in this when someone handed me an official flyer from a company that had just raised quite a lot of money. The flyer wrote about the Byzantine generals problem, with which I am more than familiar. What it said was clearly technically incorrect. The people who wrote it didn't really understand the model behind the problem... and yet the company raised a lot of money. It later quietly replaced the flyer with a much more correct version, and I won't say what its name was. The company still exists and is doing very well. That case convinced me that, first, blockchain is just a form of distributed computing. Second, the entry threshold (at that time, four years ago) was quite low. People working in the area were very energetic and smart, but they didn't read scientific papers. They tried to reinvent known things and got them wrong. Today the drama has subsided.

Alexey: That's very interesting, because a few years ago we saw a different trend. It's a bit like front-end development, where browser-interface developers reinvented entire technologies that were already popular on the back end by that time: build systems, continuous integration, and the like.

Maurice: I agree. But this is not surprising, because truly breakthrough ideas almost always come from outside the established community. Established researchers, especially the authorities of academia, are unlikely to do anything truly groundbreaking. It's easy to write a paper for the next conference about how you slightly improved the results of your past work. Go to the conference, get together with friends, talk about the same things. And the people who burst in with breakthrough ideas almost always come from outside. They don't know the rules, they don't know the language, but still... If you are inside an established community, I advise you to pay attention to new things, to whatever doesn't fit the big picture. In a sense, you can try to combine the more fluid outside developments with techniques we already understand: as a first step, build a scientific base, then modify it so it can be applied to the new breakthrough ideas. I think blockchain fits the role of a fresh breakthrough idea perfectly.

Alexey: Why do you think this happens? Because people "outside" don't carry the particular barriers built into the community?

Maurice: There is a pattern here. If you read the history of the Impressionists, and of painting in general, the famous artists of the day rejected impressionism. They called it childish. A generation later, this previously rejected art form became the standard. What I see in my own field: the inventors of blockchain were not interested in status, in piling up publications and citation counts; they just wanted to do something good. So they sat down and started doing it. They lacked a certain technical depth, but that is fixable. Coming up with new creative ideas is much harder than correcting and strengthening insufficiently mature ones. Thanks to these inventors, I now have something to do!

Alexey: This is similar to the difference between startups and legacy projects. We inherit a lot of thinking constraints, barriers, special requirements, and so on.

Maurice: A good analogy is distributed computing. Think of blockchain as if it were a startup and distributed computing as a large established company. Distributed computing is in the process of being bought and merged with blockchain.

PhD under Barbara Liskov

Vitaly: We still have a lot of questions! We were digging through your bio and came across an interesting fact about your PhD. Yes, it was a long time ago, but the topic seems important. You received your PhD under the supervision of Barbara Liskov! Barbara is very famous in the programming-language community, and a very famous person in general. It would be logical for your research to be in the field of programming languages. How did you switch to parallel computing? Why did you decide to change the subject?

Maurice: At the time, Barbara and her group were just starting to look at distributed computing, which was a very new idea. There were also those who said distributed computing was nonsense, that communication between computers was pointless. One of the issues considered in distributed computing, which distinguishes it from centralized computing, is fault tolerance. After a lot of research we decided that a programming language for distributed computing needs something like atomic transactions, because you can never be sure a remote call will succeed. Once you have transactions, you have the problem of concurrency control. Then came a lot of work on highly parallel transactional data structures. After I graduated, I went to Carnegie Mellon and began looking for a research topic. It occurred to me that computing had moved from individual computers to networks of computers, and that the natural continuation of this progress would be multiprocessors - the word "multi-core" did not exist yet. I thought: what is the equivalent of atomic transactions for a multiprocessor system? Definitely not ordinary transactions, because they are too large and heavy. That is how I arrived at the idea of linearizability, and that is how the whole line of work on wait-free synchronization came about. It was an attempt to answer the question: what is the analogue of atomic transactions for a shared-memory multiprocessor? At first glance the work may look quite different, but in fact it is a continuation of the same theme.
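
(Editor's note: a minimal sketch, ours rather than Maurice's, of what linearizability means in practice. Each increment of this lock-free counter appears to take effect instantaneously at the moment its compare-and-set succeeds; that instant is the operation's linearization point.)

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CasCounter {
    private final AtomicInteger value = new AtomicInteger();

    // The increment takes effect atomically at the instant the
    // compareAndSet succeeds: that is the operation's linearization point.
    public int increment() {
        while (true) {
            int current = value.get();
            if (value.compareAndSet(current, current + 1)) {
                return current + 1;
            }
            // Another thread won the race; retry against the fresh value.
        }
    }

    public static void main(String[] args) throws InterruptedException {
        CasCounter c = new CasCounter();
        Runnable task = () -> { for (int i = 0; i < 100_000; i++) c.increment(); };
        Thread t1 = new Thread(task), t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(c.value.get()); // always 200000: no lost updates
    }
}
```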

The world is waiting for multi-core

Vitaly: You mentioned that there were very few multi-core computers at that time, right?

Maurice: They simply didn't exist. There were several so-called symmetric multiprocessors, basically several processors hanging off the same bus. It didn't work very well: every time a new company built something like that, Intel would release a single processor that outperformed the multiprocessor.

Alexey: Doesn't that mean that in those ancient times it was more of a theoretical study?

Maurice: Not so much theoretical as speculative. It was not about proving lots of theorems; rather, we put forward hypotheses about an architecture that did not exist at the time. That is what research is for! No company would have done it; it was all something from the distant future. And so it remained until 2004, when real multi-core processors appeared. Because processors overheat, you can make a processor smaller, but you cannot make it faster. That forced the transition to multi-core architectures. And suddenly there was a use for all the concepts we had developed in the past.

Alexey: Why do you think multi-core processors appeared only in the 2000s? And why so late?

Maurice: It is due to hardware limitations. Intel, AMD, and other companies are very good at boosting processor speed. But at some point processors got small enough that the clock rate could not be raised any further, because the chips would start to burn out. You can make them smaller, but not faster. What the manufacturers could do instead was fit eight, sixteen, or thirty-two processors into the same package volume where only one used to fit. Now you have multiple threads and fast communication between them, because they share caches. But you cannot make them run faster: there is a very specific speed limit. They keep improving little by little, but not by much. The laws of physics got in the way.

New world, new problems. NUMA, NVM and architecture hacking

Alexey: Sounds very reasonable. With the new multi-core processors came new problems. Did you and your colleagues expect these problems? Perhaps you had studied them in advance? In theoretical work it is often not easy to predict such things. When the problems did arise, how closely did they match your expectations? Or were they brand new, forcing you and your colleagues to spend a lot of time solving problems as they appeared?

Vitaly: I'll add to Alexey's question: did you correctly predict processor architectures while you were studying the theory?

Maurice: Not 100%. But I think my colleagues and I did a good job of predicting shared-memory multi-core. I think we correctly predicted the difficulty of designing parallel data structures that work without locks. Such data structures have been important for many applications, though not for all; often what you really need is a lock-free data structure. When we invented them, many argued that it was nonsense, that everything worked fine with locks. We foresaw quite well that ready-made solutions would appear for many programming and data-structure problems. There were also more complex problems, such as NUMA (non-uniform memory access). They were not even considered until multi-core processors appeared, because they were too specific. The research community worked on questions that were broadly predictable. Some hardware problems tied to specific architectures had to wait in the wings - in fact, for the appearance of those architectures. For example, nobody really worked on GPU-specific data structures, because GPUs did not exist yet. Although much work was done on SIMD, those algorithms were ready to use as soon as the right hardware appeared. Still, it is impossible to predict everything.

Alexey: If I understand correctly, NUMA is a kind of compromise between cost, performance, and other factors. Any idea why NUMA arrived so late?

Maurice: I think NUMA exists because of a problem with the hardware used to make memory: the farther away the components are, the slower it is to reach them. On the other hand, much of the value of the abstraction lies in the uniformity of memory. So one characteristic of parallel computing is that all the abstractions are slightly broken. If access were perfectly uniform, all memory would be equidistant, but that is economically, and perhaps even physically, impossible. Hence the conflict. If you write your program as if memory were uniform, it will most likely be correct, in the sense that it will not give wrong answers. But its performance will not be stellar. Likewise, if you write spinlocks without understanding the cache hierarchy, the lock itself will be correct, but you can forget about performance. In a sense, you have to write programs that live on top of a very simple abstraction, yet outsmart the people who gave you that abstraction: you have to know that underneath it there is a memory hierarchy, that there is a bus between you and that memory, and so on. So there is a conflict between abstractions that are individually useful, and it drags us into very concrete, pragmatic problems.

Vitaly: What about the future? Can you predict how processors will develop next? There is an idea that one of the answers is transactional memory. You probably have something else up your sleeve as well.

Maurice: There are a couple of major challenges ahead. One is that coherent memory is a wonderful abstraction, but it starts to break down in special cases. NUMA, for example, is a living instance of something where you can keep pretending that uniform memory exists. In reality it doesn't, and the performance will make you cry. At some point architects will have to abandon the idea of a single unified memory architecture; you cannot pretend forever. New programming models will be needed that are easy enough to use and powerful enough to make the underlying hardware efficient. This is a very difficult compromise, because if you show programmers the architecture actually used in the hardware, they will go crazy. It is too complicated and not portable. If you present an interface that is too simple, performance will be poor. So many very hard trade-offs will have to be made to provide useful programming models applicable to really large multi-core processors. I am not sure anyone but a narrow specialist is capable of programming a 2000-core computer. And unless you are doing very specialized or scientific computing, cryptography, or the like, it is still not at all clear how to do it right.

Another, related direction is specialized architectures. Graphics accelerators have been around for a long time and have become a classic example of taking a specialized type of computation and running it on a dedicated chip. This adds its own challenges: how you communicate with such a device, and how you program it. I recently worked on problems in near-memory computing: you take a small processor and glue it to a huge chunk of memory, so that the memory runs at L1-cache speed, and then it talks to a device like a TPU whose processor is busy loading new tasks into your memory core. Designing data structures and communication protocols for this sort of thing is another interesting example. So specialized processors and hardware will keep seeing improvements for quite some time.

Alexey: What about non-volatile memory (NVM)?

Maurice: Oh, that's another great example! NVM will greatly change how we look at things like data structures. Non-volatile memory, in a sense, promises to really speed things up. But it won't make life easier, because most processors, caches, and registers remain volatile. When you restart after a crash, your state and your memory's state will not be exactly what they were before the crash. I am very grateful to the people working on NVM; researchers will have things to do for a long time, figuring out the correctness conditions. Computations are correct if they can survive a crash in which the contents of the caches and registers are lost but main memory remains intact.
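
(Editor's note: a hedged sketch, ours and not Maurice's, of the ordering discipline those correctness conditions revolve around: make the data durable before the flag that declares it valid, so that a crash in between leaves a recoverable state. A memory-mapped file stands in for persistent memory here, and the file name is illustrative.)

```java
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class DurableWrite {
    public static void main(String[] args) throws Exception {
        try (RandomAccessFile f = new RandomAccessFile("log.dat", "rw");
             FileChannel ch = f.getChannel()) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, 4096);
            buf.putLong(8, 42L);   // write the payload...
            buf.force();           // ...and make it durable first
            buf.put(0, (byte) 1);  // only then set the "record is valid" flag
            buf.force();           // persist the flag; recovery trusts it
        }
    }
}
```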

Compilers vs CPUs, RISC vs CISC, shared memory vs message passing

Vladimir: What do you think about the compilers-versus-processors dilemma with respect to the instruction set? To explain for those not in the know: if we move to non-uniform memory or something similar, we could use a very simple instruction set and ask the compiler to generate sophisticated code that exploits its advantages. Or we could go the other way: implement complex instructions and ask the processor to reorder them and perform other manipulations. What do you think about that?

Maurice: I don't really have an answer to that question. This debate has been going on for four decades. There was a time when civil wars were waged between the reduced instruction set and the complex instruction set camps. For a while the RISC people won, but then Intel rebuilt its engines so that a reduced instruction set was used internally while the full set was exposed externally. It is probably a topic in which each new generation has to find its own compromises and make its own decisions. It is very hard to predict which of these will turn out better. So any prediction I make will hold true for a while, then become false for a while, and then become true again.

Alexey: How common is it in the industry for an idea to win for several decades and then lose in the following ones? Are there other examples of such periodic swings?

Maurice: In distributed computing there are people who believe in shared memory and people who believe in message passing. Originally, distributed and parallel computing meant message passing. Then someone discovered that shared memory made programming much easier. The other side said that shared memory is too complicated, because it needs locks and the like, so it's worth moving to languages where nothing but message passing even exists. Then someone looked at what came of it and said: "wow, this message-passing implementation looks a lot like shared memory, because you create many, many little modules, they send messages to each other, and they all deadlock; let's build a shared-memory database instead!" All of this repeats over and over, and it is impossible to say that one side is definitively right. One side will always dominate, because as soon as one of them almost wins, people invent, again and again, ways to improve the other.

The art of writing fragile multi-threaded code

Alexey: This is very interesting. For example, when we write code, in whatever programming language, we usually create abstractions like memory cells that can be read and written. But in fact, at some physical level, this may look like sending messages over a hardware bus between different computers and other devices. It turns out the work happens at both levels of abstraction at once.

Maurice: It is absolutely true that shared memory is built on message passing: buses, caches, and so on. But it is hard to write programs using message passing, so the hardware deliberately lies, pretending you have some kind of uniform memory. This makes it easier for you to write simple, correct programs before performance starts to drop. Then you say: it looks like it is time to make friends with the cache. And that is when you start worrying about cache locality, and off we go. In a sense, you are breaking the abstraction: you know it is not just flat, uniform memory, and you are going to use that knowledge to write cache-friendly programs. That is what you have to do in real tasks. This conflict between the nice, simple abstraction you were given and the terribly complex implementation in the underlying hardware is where everyone strikes their own compromise. I have a book on multiprocessors and synchronization, and at one point I was going to write a chapter on the data structures in java.util.concurrent. If you look at them, things like skip lists are amazing works of art. (Editor's note: those familiar with Java should at least take a look at the implementation of ConcurrentSkipListMap; see the links for the API and the source code.) But from my point of view it would be irresponsible to show them to students, because such a data structure is like a circus performer running along a tightrope over a bear pit. Change even one small detail and the whole structure collapses. This code is very fast and elegant only because it is perfectly written, but the slightest change will lead to complete failure. If I gave this code to students as an example, they would immediately say: I can do that too! And then some plane will crash or a nuclear reactor will explode, and it will be my fault for handing them too much information at the wrong time.
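
(Editor's note: for readers who want to see the API from the outside before diving into the source, here is a tiny usage sketch of the skip-list map just mentioned; the demo keys and values are ours. Unlike the concurrent hash maps, it keeps keys sorted, so range queries stay cheap and safe under concurrent updates.)

```java
import java.util.Map;
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

public class SkipListDemo {
    public static void main(String[] args) {
        ConcurrentNavigableMap<Integer, String> map = new ConcurrentSkipListMap<>();
        map.put(3, "three");
        map.put(1, "one");
        map.put(2, "two");
        System.out.println(map.firstKey()); // 1 - the smallest key
        System.out.println(map.headMap(3)); // {1=one, 2=two} - a range view
        for (Map.Entry<Integer, String> e : map.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue()); // sorted order
        }
    }
}
```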

Alexey: When I was a bit younger, I tried many times to study Doug Lea's source code, for example java.util.concurrent: it is open source, so it is very easy to find it and try to understand what is going on there. It did not turn out very well: often it is completely unclear why Doug decided to do something one way when everyone else does it differently. How do you explain these things to your students? Is there a correct way to describe the specific details of a hardcore algorithm, for example? How do you do it?

Maurice: Drawing teachers have a cliché they recall first: if you want to draw like Picasso, you must first learn to draw simple realistic pictures, and only once you know the rules can you start breaking them. If you start by breaking the rules right away, you get a mess. First I teach students how to write simple, correct code without worrying about performance. I tell them there are complicated timing issues lurking here, so don't worry about caches, don't worry about memory models, just make sure everything works correctly. That is hard enough already: modern programming is not easy in itself, especially for new students. And once they have an intuition for writing correct programs, I say: look at these two spinlock implementations. One is very slow; the second is also not great, but better. Mathematically, though, the two algorithms are identical. In fact, one of them exploits cache locality: it spins on locally cached data, while the other repeatedly performs operations that go over the bus. You cannot write efficient code unless you understand this, unless you know how to break the abstraction and look at the underlying structure. But you cannot start doing that right away, either. There are people who start doing it immediately and believe in their own genius; it usually ends badly, because they do not understand the principles. Nobody draws like Picasso or writes programs like Doug Lea fresh out of university, in their first week. It takes years to reach that level of knowledge.
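
(Editor's note: a sketch of the two spinlocks Maurice contrasts, essentially the test-and-set / test-and-test-and-set pair from his book, transcribed by us into plain Java. Mathematically the two locks are the same; they differ only in cache behavior.)

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class SpinLocks {
    // Test-and-set: every getAndSet is a write that invalidates the cache
    // line on other cores, hammering the bus the whole time we spin.
    static class TasLock {
        private final AtomicBoolean locked = new AtomicBoolean(false);
        void lock()   { while (locked.getAndSet(true)) { } }
        void unlock() { locked.set(false); }
    }

    // Test-and-test-and-set: spin reading the locally cached value and
    // attempt the expensive atomic write only when the lock looks free.
    static class TtasLock {
        private final AtomicBoolean locked = new AtomicBoolean(false);
        void lock() {
            while (true) {
                while (locked.get()) { }             // spin in the local cache
                if (!locked.getAndSet(true)) return; // one bus transaction
            }
        }
        void unlock() { locked.set(false); }
    }
}
```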

Alexey: So you divide the problem into two parts: the first is correctness, the second is performance?

Maurice: Exactly, and in that order. Part of the problem is that new students do not appreciate how hard correctness is to achieve. At first glance they say: this is obviously correct, all that is left is to speed it up. So sometimes I present an inherently incorrect algorithm to them as if it were correct.

How to teach students how to write complex multi-threaded code

Alexey: Just to check whether they can sense the trick?

Maurice: I always warn them in advance that I will sometimes present incorrect algorithms. You should not deceive people. I encourage them to be skeptical of the information. If I present something and say, "look, this is obviously correct," that is a signal that someone is trying to trick them, and they should start asking questions. Then I try to encourage them to keep asking questions, and I prompt: "what happens if we leave everything as it is?" And they immediately see the error. But convincing students that they need to worry about correctness is harder than it seems at first glance. Many of these students arrive with programming experience from high school, some have already had jobs and programmed there, and they are all full of self-confidence. It is a bit like the army: first you have to change their mindset in order to convince them to approach emerging problems patiently. Or maybe it is like Buddhist monks: first they learn to reason about correctness, and once they have mastered the ways of reasoning about correctness, they are allowed to move to the next level and start worrying about performance.

Alexey: So you sometimes show students non-working examples, and the feedback shows whether they understand the essence of the problem, whether they can find the wrong code and the wrong result. So, do students usually please you or upset you?

Maurice: Almost always the students eventually find the mistake. If they search too slowly, I ask leading questions. It is important to understand that if you never deceive them, they will start thoughtlessly taking your words as the ultimate truth. Then they get bored and fall asleep reading Facebook on their laptops during class. But when they know in advance that they will be tricked, and that they will look silly if they don't smell the trick, they become much more vigilant. That is good in many ways. I want students not only to question their own understanding of a topic, but also to question the authority of the teacher. The idea is that a student can raise a hand at any moment and say: I think what you just said is wrong. That is an important learning tool. I don't want any student to sit there thinking silently: this all seems like complete nonsense, but it's too scary to raise a hand, and anyway, he's a professor, so everything he says must be true. If they are warned in advance that not everything said is necessarily true, they have an incentive to pay more attention to the material. I make it explicit that raising your hand and asking a question is fine. Your question may sound silly or naive, but that is often how the best questions arise.

Alexey: Very interesting. Usually people have some psychological barrier that keeps them from asking the professor a question, especially when there are many people in the room and everyone is afraid that discussing your silly question will take up all these people's time. Are there any tricks for dealing with this?

Maurice: I often stop and ask classic questions: would such-and-such a statement be correct, or how would they solve the problem under discussion. This is a key step, especially at the start of a session, when people are embarrassed to say even the smallest thing. You ask the students a question and then say nothing more. There is silence; everyone tenses up a little, the tension grows, and then suddenly someone cracks and blurts out the answer. That is how you turn the situation around: remaining silent becomes harder and more uncomfortable than answering! This is a standard pedagogical trick that every teacher in the world should know.

Alexey: Now we have a great title for this interview: "It's easier to answer than to remain silent."

Vitaly: Let me ask one more thing. You work on topological proofs. How did you even get into this? Distributed computing and topology seem like completely different things!

Maurice: There is a hidden connection. When I was a student studying mathematics, it was pure mathematics. I had no real interest in computers until the end of my studies, when I found myself urgently needing to look for a job. As a student I studied algebraic topology. Many years later, while working on the k-set agreement problem, I used graphs to model the problem and, as it seemed at the time, found a solution: you just had to sit down and traverse the graph, trying to find a suitable answer on it. But my algorithm did not work: it would always run in circles. Unfortunately, none of this could be explained in the formal language of graph theory, the language that every computer scientist knows. And then I remembered that many years earlier, back in topology classes, we had used the concept of a simplicial complex, which generalizes graphs to higher dimensions. I asked myself: what happens if we reformulate the problem in terms of simplicial complexes? That turned out to be the key. With a more powerful formalism, the problem suddenly becomes much simpler. People had struggled with it for a long time using graphs and had gotten nowhere. Even now they cannot: the correct answer turned out to be not an algorithm but a proof that the problem is unsolvable. Such an algorithm simply does not exist. But every impossibility proof rests either on simplicial complexes or on things that people pretend are not simplicial complexes. Calling something by a new name does not change its essence.

Vitaly: So you just got lucky?

Maurice: Besides luck, there is also readiness. Meaning, you shouldn't forget the "useless" things you learned long ago. The more useless things you learn, the more insights you can extract when you face a new problem. This kind of intuitive pattern matching is important, because... let's just say it is a chain: at first I found that graphs didn't quite work, or didn't work at all, and that reminded me of something from eight years earlier, from my student days when we studied all those simplicial complexes. That, in turn, let me dig out my old topology textbook and load it back into my head. Without that old knowledge, I would never have made any headway on the original problem.

New edition of The Art of Multiprocessor Programming

Alexey: You said a few words about your book. It is probably no secret that you wrote the world's most famous book on multithreading, "The Art of Multiprocessor Programming". It is already about 11 years old, and since then only a revised reprint has come out. Will there be a second edition?

Maurice: It is good that you asked! It will come very soon, in three months or so. There are two more authors now; we added a lot more material, improved the section on fork/join parallelism, wrote a section on MapReduce, added many new things and threw out obsolete ones - material that was very interesting when the first edition was written but is no longer relevant today. The result is a very seriously revised book.

Alexey: Is everything already done, with only the release remaining?

Maurice: A couple of chapters still need work. Our publisher (who I think already hates us) keeps trying to get across to us that we should work faster. We are way behind schedule. In theory, we could have finished this book a couple of years earlier.

Alexey: Is there any chance to get a new version of the book before Christmas?

Maurice: That is our goal! But I have predicted victory so many times that nobody believes me anymore. You probably shouldn't trust me too much on this either.

Alexey: In any case, this is fantastic news. I really liked the first edition of the book. You could say I'm a fan.

Maurice: I hope the new edition will be worthy of your fervent enthusiasm, thank you!

How transactional memory was invented

Vitaly: The next question is about transactional memory. As far as I understand, you are a pioneer in this field: you invented it at a time when nobody was thinking about such things. Why did you decide to move into this area? Why did transactions seem important to you? Did you think they would someday be implemented in hardware?

Maurice: I've known about transactions since my graduate studies.

Vitaly: Yes, but those are different transactions!

Maurice: I worked with Eliot Moss on non-blocking garbage collection. Our problem was that we wanted to change a few words in memory atomically; then the algorithms would become very simple, and at least some of them would become more efficient. Using compare-and-swap or load-link/store-conditional, provided by the parallel architecture, you can do something, but it is very inefficient and ugly, because you have to deal with levels of indirection: you want to change several memory words, but you can only change one pointer, so the words have to sit behind some kind of directory-like structure. We talked about how great it would be if we could change the hardware so it could write several words at once. Eliot seems to have been the one who noticed it: if you look at cache coherence protocols, they already provide most of the required functionality. In an optimistic transaction, the cache coherence protocol will notice a timing conflict and the cache will be invalidated. What happens if you speculatively run a transaction on your cache and use the coherence protocol's mechanisms to detect conflicts? The speculative hardware architecture was easy to design. So we wrote that very first publication on transactional memory. At the same time, the company I worked for, Digital Equipment Corporation, was building a new 64-bit processor called Alpha. So I went and gave a presentation to the Alpha development team about our wonderful transactional memory, and they asked: how much additional revenue would our company get if we put all this directly into the processor? And I had absolutely no answer, because I am a technologist, not a marketing specialist. I really had nothing to say. They were not very impressed that I did not know anything.

Vitaly: Billions! Just say "billions"!

Maurice: Yes, that is what I should have said. Now, in the era of startups and all that, I know how to write a business plan - how to lie a little about the size of the potential profit. But in those days it seemed naive, so I just said, "I don't know." If you look at the history of the publication on transactional memory, you will notice that after a year there were a few references to it, and then for about ten years nobody cited the article at all. The citations appeared around 2004, when true multi-core arrived. When people discovered that writing parallel code could make money, new research began. Ravi Rajwar wrote an article that, in a way, introduced the mainstream to the concept of transactional memory. (Editor's note: the article has a second version, released in 2010, freely available as a PDF.) All of a sudden, people realized exactly how all this could be used, how traditional lock-based algorithms could be sped up. A good example of something that in the past seemed to be merely an interesting academic problem. And yes, if you had asked me back then whether I thought all this would matter in the future, I would have said: of course, but when exactly is unclear. Maybe in 50 years? In practice, it turned out to be only a decade. It is very nice when you do something and people notice it within just ten years.
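
(Editor's note: the indirection Maurice describes above can be sketched in Java; the example is ours. Hardware CAS swaps only one word, so to change several words atomically you publish an immutable snapshot and swing a single pointer to it.)

```java
import java.util.concurrent.atomic.AtomicReference;

public class MultiWordUpdate {
    // An immutable snapshot of the two words we want to change together.
    static final class Pair {
        final long x, y;
        Pair(long x, long y) { this.x = x; this.y = y; }
    }

    private final AtomicReference<Pair> state =
            new AtomicReference<>(new Pair(0, 0));

    // Atomically update both fields: build a new snapshot, then CAS the
    // single pointer. Readers never observe a half-updated pair, but every
    // update pays an allocation and an extra level of indirection.
    public void addToBoth(long dx, long dy) {
        while (true) {
            Pair old = state.get();
            Pair updated = new Pair(old.x + dx, old.y + dy);
            if (state.compareAndSet(old, updated)) return;
        }
    }

    public Pair read() { return state.get(); } // a single atomic load
}
```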

Why it is worth doing research in the field of distributed computing

Vitaly: Speaking of new research, what would you advise readers: distributed computing or multi-core, and why?

Maurice: It is easy to get a multi-core processor these days, but harder to set up a true distributed system. I started working on them because I wanted to do something different from my PhD. That is the advice I always give to beginners: don't write a follow-up dissertation; try to go in a new direction. Besides, multithreading is easy. I can experiment on my own fork running on a laptop without getting out of bed. But if I suddenly wanted to build a real distributed system, I would have to do a lot of work, recruit students, and so on. I am a lazy person and would rather work on multi-core. Experimenting with multi-core systems is also easier than with distributed ones, because even in a trivial distributed system there are too many factors to control.

Vitaly: What are you working on now? Blockchain research? Which articles should one read first?

Maurice: A very good article appeared recently, written with my student Vikram Saraph specifically for the Tokenomics conference in Paris three weeks ago. It is an article about practical distributed systems in which we propose making Ethereum multi-threaded. Currently, smart contracts (code that runs on the blockchain) are executed sequentially. We wrote an article earlier that described a way to use speculative transactions to speed up the process. We took many ideas from software transactional memory and said that if you make those ideas part of the Ethereum virtual machine, everything will run faster. But for that, the contracts must have no data conflicts. We assumed that in real life there really are no such conflicts, but we had no way to find out. Then it occurred to us that we had almost ten years of real contract history on our hands, so we unloaded the Ethereum blockchain and asked: what would happen if these historical records were run in parallel? We found a significant speedup. In the early days of Ethereum the speedup was very large; today things are somewhat more complicated, because the likelihood of conflicts over data requiring serialization has grown. But all of this is experimental work with real historical data. The nice thing about blockchain is that it remembers everything forever, so you can go back in time and study what would have happened if we had used different algorithms to run the code - how people in the past would have liked our new idea. Such research is much easier and more pleasant to do, because there is a thing that tracks everything and records everything. It is already something more like sociology than algorithm design.
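
(Editor's note: a hedged sketch, ours and not the paper's code, of the optimistic scheme described above: run contracts in parallel, record what each touches, and re-execute in order any whose footprint collides with an earlier writer's. State rollback and on-the-fly footprint recording are elided; the read/write sets are given up front for brevity. Requires Java 16+ for records.)

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SpeculativeExec {
    // A transaction with pre-declared read and write sets; the real scheme
    // records these during the speculative run itself.
    record Tx(int id, Set<String> reads, Set<String> writes, Runnable body) {}

    static void execute(List<Tx> txs) throws InterruptedException {
        // Phase 1: speculate - run every contract concurrently.
        ExecutorService pool = Executors.newFixedThreadPool(8);
        for (Tx tx : txs) pool.execute(tx.body());
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);

        // Phase 2: validate in block order - a speculative run is invalid if
        // it read or wrote a key that an earlier transaction wrote; invalid
        // transactions are replayed sequentially.
        Set<String> written = new HashSet<>();
        for (Tx tx : txs) {
            boolean conflict = tx.reads().stream().anyMatch(written::contains)
                    || tx.writes().stream().anyMatch(written::contains);
            if (conflict) tx.body().run();
            written.addAll(tx.writes());
        }
    }
}
```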

Has the development of algorithms stalled, and where do we go from here

Vitaly: Time for the last theoretical question! Does it feel like progress in concurrent data structures shrinks every year? Have we reached a plateau in our understanding of data structures, or will there be major improvements? Maybe there are clever ideas that can change everything completely?

Maurice: We may have reached a plateau in data structures for traditional architectures. But data structures for new architectures are still a very promising area. If you want to build data structures for, say, hardware accelerators, then GPU data structures are very different from CPU data structures. When you design data structures for blockchains, you need to hash pieces of data and then put them into something like a Merkle tree to prevent tampering. There has been a surge of activity in this area lately, and many people are doing very good work. I think what will happen is that new architectures and new applications will lead to new data structures. Older applications and traditional architectures may not have much room left for research. But if you step off the beaten track and look over the edge, you will see crazy things that the mainstream does not take seriously, and that is where all the exciting stuff actually happens.
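
(Editor's note: a minimal Java sketch, ours, of the Merkle-root computation mentioned above: hash the leaves, then repeatedly hash pairs of child hashes until a single root remains. Tampering with any leaf changes the root, which is what makes the structure tamper-evident.)

```java
import java.math.BigInteger;
import java.security.MessageDigest;
import java.util.ArrayList;
import java.util.List;

public class MerkleRoot {
    static byte[] sha256(byte[] data) throws Exception {
        return MessageDigest.getInstance("SHA-256").digest(data);
    }

    // Assumes a non-empty list of leaves.
    static byte[] root(List<byte[]> leaves) throws Exception {
        List<byte[]> level = new ArrayList<>();
        for (byte[] leaf : leaves) level.add(sha256(leaf));
        while (level.size() > 1) {
            List<byte[]> next = new ArrayList<>();
            for (int i = 0; i < level.size(); i += 2) {
                byte[] left = level.get(i);
                // Duplicate the last hash when a level has an odd size.
                byte[] right = (i + 1 < level.size()) ? level.get(i + 1) : left;
                byte[] combined = new byte[left.length + right.length];
                System.arraycopy(left, 0, combined, 0, left.length);
                System.arraycopy(right, 0, combined, left.length, right.length);
                next.add(sha256(combined));
            }
            level = next;
        }
        return level.get(0);
    }

    public static void main(String[] args) throws Exception {
        List<byte[]> txs = List.of("tx1".getBytes(), "tx2".getBytes(), "tx3".getBytes());
        System.out.println(new BigInteger(1, root(txs)).toString(16));
    }
}
```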

Vitaly: So, to become a very famous researcher, one needs to invent one's own architecture 🙂

Maurice: You can "steal" someone else's new architecture - it seems much easier!

Work at Brown University

Vitaly: Could you tell us more about Brown University, where you work? Not much is known about it in the context of information technology, less than about MIT, for example.

Maurice: Brown University is one of the oldest universities in the United States; I think only Harvard is a little older. Brown is part of the so-called Ivy League, the group of eight oldest universities: Harvard, Brown, Cornell, Yale, Columbia, Dartmouth, Pennsylvania, Princeton. It is a kind of old, small, slightly aristocratic university, with a focus on liberal arts education. It does not try to be like MIT; MIT is very specialized and technical. Brown is a great place to study Russian literature or classical Greek and, of course, Computer Science. It focuses on a comprehensive education. Most of our students go to Facebook, Apple, or Google, so I think our students have no problem finding work in industry. I went to work at Brown because before that I worked at Digital Equipment Corporation in Boston. It was a company that invented many interesting things but denied the importance of personal computers. A company with a difficult fate, whose founders were once young revolutionaries; they learned nothing and forgot nothing, and so within about a decade they turned from revolutionaries into reactionaries. They liked to joke that personal computers belonged in a garage - an abandoned garage, of course. It is quite obvious that they were destroyed by more flexible companies. When it became clear the company was in trouble, I called a friend at Brown, which is about an hour from Boston. I did not want to leave Boston at the time, and other universities did not have many openings. It was a time when there were far fewer vacancies in Computer Science than there are now. Brown had an opening; I did not have to move out of my house or uproot my family, and I really enjoy living in Boston! So I decided to go to Brown. I like it. The students are wonderful, so I never even tried to go anywhere else. On sabbaticals I worked at Microsoft for a year, went to the Technion in Haifa for a year, and now I will be at Algorand. I have many colleagues everywhere, so the physical location of our classrooms is not that important. The most important thing is the students, and they are the best here. I have never tried to go anywhere else, because I am quite happy here.

Yet despite Brown's fame in the United States, it is surprisingly little known abroad. As you can see, I am now doing my best to correct that state of affairs.

The difference between university and corporate research

Vitaly: Okay, the next question is about Digital Equipment. You were a researcher there. What is the difference between working in the R&D department of a large company and working at a university? What are the advantages and disadvantages?

Maurice: For twenty years I have worked closely with people at Microsoft, Sun Microsystems, Oracle, Facebook, and now Algorand. Based on all of that, I want to say that first-class research can be done both in companies and at universities. The important difference is that in a company you work with colleagues. If I suddenly have an idea for a project that does not yet exist, I have to convince my peers that it is a good idea. If I am at Brown, I can say to my students: let's work on antigravity! They will either go to someone else or take on the project. Yes, I will need to find funding, write a grant application, and so on. In any case, there will always be many students, and you can make decisions unilaterally. But at the university you will most likely not be working with people at your own level. In the world of industrial research, you first have to convince everyone that your project is worth taking on; I cannot order anything from anyone. Both ways of working are valuable, because if you are working on something really crazy and your colleagues are hard to convince, it is easier to convince graduate students - especially if you pay them. If you are working on something that requires a lot of experience and deep expertise, you need colleagues who can say, "no, it just so happens that I understand this area, and your idea is bad; nothing will come of it." That is very useful for not wasting time. And if in industrial labs you spend a lot of time writing reports, at the university you spend that time finding money. If I want students to be able to travel somewhere, I have to find the money for it somewhere else. And the more senior your position at the university, the more time you have to spend raising money. So now you know what I work as - a professional beggar! Like one of those monks who walk around with a donation plate. In general, these two activities complement each other. That is why I try to live and stand firmly in both worlds.

Vitaly: It seems that convincing a company is more difficult than convincing other scientists.

Maurice: Harder, much more so. Moreover, it differs across areas: some do full-scale research, while others are focused on their own topic. If I went to Microsoft or Facebook and said, "let's do antigravity," they would hardly appreciate it. But if I said exactly the same thing to my graduate students, they would most likely get to work instantly, although now I would be the one with problems, because I would need to find money for it. But as long as you want to do something aligned with the company's goals, that company can be a very good place to do research.

Hydra and SPTDC

Vitaly: My questions are running out, so let's talk a little about your upcoming trip to Russia.

Maurice: Yes, I am looking forward to returning to St. Petersburg.

Alexey: It is a great honor for me that you are with us this year. This is your second time in St. Petersburg, right?

Maurice: Already the third!

Alexey: I see, but at SPTDC it is exactly your second time. Last time the school was called SPTCC; now we have changed one letter (C to D, Concurrent to Distributed) to emphasize that this year there are more areas related to distributed computing. Can you say a few words about your talks at the School and at the Hydra conference?

Maurice: At the School I want to talk about the basics of blockchain and what you can do with it. I would like to show that blockchains are very similar to the multi-threaded programming we are familiar with, but with their own nuances, and it is important to understand those differences. If you make a mistake in an ordinary web application, it is just annoying. If you write buggy code in a financial application, someone will definitely steal all your money. That is an entirely different level of responsibility and consequences. I will talk a little about proof-of-work, smart contracts, and transactions between different blockchains.

Other speakers who also have something to say about blockchain will be working alongside me, and we have agreed to coordinate so that our talks fit well together. As for the engineering talk, I want to give a clear explanation to a broad audience of why you should not believe everything you hear about blockchains, why blockchain is a great field, how it fits with other well-known ideas, and why we should look boldly to the future.

Alexey: I should add that this will not be in the format of a meetup or a user group, as it was two years ago. We decided to hold a small conference alongside the school. The reason is that, after talking with Petr Kuznetsov, we realized the school is limited to just a hundred, maybe 120 people. At the same time, there are many engineers who want to talk with you, attend talks, and are generally interested in the topic. For them we created a new conference called Hydra. By the way, any idea why Hydra?

Maurice: Because it will have seven speakers? And when their heads are cut off, new speakers will grow in their place?

Alexey: A great idea for growing new speakers. But actually, there is a story here. Remember the legend of Odysseus, where he had to sail between Scylla and Charybdis? Hydra is something like Charybdis. The story is that I once spoke at a conference and talked about multithreading. There were only two tracks at that conference. At the beginning of the talk I told the audience that they now had a choice between Scylla and Charybdis. My spirit animal is Charybdis, because Charybdis has many heads, and my topic is multithreading. That is how conference names come about.

In any case, we have run out of both questions and time. Thank you, friends, for a great interview, and see you at SPTDC and Hydra 2019!

You will be able to continue the conversation with Maurice at the Hydra 2019 conference, which will be held on July 11-12, 2019 in St. Petersburg. He is coming with the talk "Blockchains and the future of distributed computing". Tickets can be purchased on the official website.

Source: habr.com
