Theodore Ts'o's Notes on the Linux Kernel, the Code of Conduct, ext4, btrfs and ZFS

A translation of thoughts by Theodore Ts'o, maintainer of the ext4 file system, on ext4 development, the bcachefs file system, the Linux kernel, ZFS, the Code of Conduct and file systems in general:

About ext4 development

More than half a dozen people contribute to each ext4 kernel release. Currently, most of my time is spent reviewing code, running tests, and improving the {kvm,gce,qemu,android}-xfstests test appliance. I also rely heavily on two or three other developers at SUSE and IBM who help me with code reviews.

About BcacheFS

To be fair, bcachefs isn't a completely solo project; for example, Kent was the author of 72% of the patches between the 6.11 and 6.12 kernel releases, while of the 103 ext4 patches in the same period, I was the author of exactly 0%. This is because I'm a firm believer that programming is a team sport, and my job as tech lead is to enable ext4 contributors to do their best to improve the file system. We have weekly conference calls, and Darrick Wong, the senior XFS developer and former XFS maintainer, attends them. I've been known to help him with XFS testing issues, and Darrick has been known to help me with various ext4 testing issues and has even reviewed a couple of ext4 patches. We collaborate with each other, and that's a good thing.

I'll leave it up to other people to decide whether they want to trust their data to a lone hotshot programmer who may well be more talented than me, but I'll give you a hint: you can "cheat" by involving a team in solving the problem. You don't have to do it alone. Of course, to do this you need to know how to bring out the best in others, and you need to work together. And being polite to each other on mailing lists doesn't hurt.

About the kernel, the CoC, and the features and future of ext4

Ext4 does get some new features, but they are the ones that companies are willing to fund because developing them makes sense from a cost-benefit perspective. For example, fscrypt and case-insensitive directories were useful for Android and Chrome OS, and were funded, at least in part, by those development teams (Steam was also concerned about case folding and supported one of the engineers). We want to add support for untorn writes because it will improve database performance on cloud-based emulated block devices, where 16k atomic writes can be guaranteed, eliminating double-write buffering in MySQL and PostgreSQL.

(Actually, Amazon and Google can do this in their own database products by making assumptions about how Amazon EBS and Google Persistent Disk work, but we want to do it in a more general way that will be more maintainable in the long run.) It's less sexy than things like reflinks, but the ROI is much easier to justify, both because the costs are lower (less development, testing, and qualification work for enterprise deployment) and because the benefits are much easier to quantify. Arguments like "I can save the salary cost of XX full-time software engineers over five years" are much easier to make for these kinds of performance-enhancing features.

By contrast, reflinks are fun, but I couldn't find a customer willing to pay for the development costs, or a company that thought its customers would buy more of its product if reflinks were added to ext4. This might sound terribly corporate, but there's a story about how the ZFS engineers started the project from scratch without asking management for permission or getting input from sales, and presented Sun with what was effectively a fait accompli.

Sounds great, but remember that Sun kept losing money until it was forced to sell itself to another company, and the engineering organization that supported ZFS no longer exists. Around the time ZFS was announced, I was involved in a company-wide study to determine whether it made sense to invest in file system features for AIX and Linux, and we came to the conclusion that no, the return on investment was small, and new file system features would not lead to more customers buying IBM hardware, software, or systems. IBM may have fallen on hard times, but it's still around, and Sun isn't.

Around the same time, representatives of several Linux companies got together to figure out how Linux would compete with ZFS. It was at this meeting that the idea was put forward that btrfs would be the long-term answer and ext4 the short-term solution, providing support for things like live resizing, 64-bit block numbers, and other features found in traditional legacy Unix OSes that ext3 lacked.

At that meeting, I was asked to estimate what it would take to build a completely new file system. I did some research, looking at how much effort it took to build file systems like IBM's GPFS and JFS and Digital's AdvFS, and I estimated how much it took Sun to build ZFS and get it to a production-ready state. The answer I got was about 100 person-years, with a low estimate of 50 person-years and a high estimate of 200 person-years (but that was for GPFS, which is a clustered file system and therefore much more complex).

I mentioned this at the meeting, and a senior engineer at Intel said, "No, don't tell the executives that because they'll never approve the project! Tell them btrfs will be ready in 18 months." I'll let people decide for themselves when btrfs reached "enterprise ready" status, especially for those sexy new advanced features that were supposed to compete with ZFS, but I don't think it's debatable that it didn't happen in 18 months.
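To put the two figures side by side, here is a back-of-the-envelope sketch in Python. It assumes, unrealistically, that effort divides evenly across engineers with no coordination overhead, and the function and variable names are purely illustrative.

```python
def engineers_needed(person_years: float, months: float) -> float:
    """Average full-time headcount needed to spend `person_years`
    of effort within `months` (naive: assumes perfect parallelism)."""
    return person_years / (months / 12.0)


# Effort estimates quoted in the text, in person-years.
ESTIMATES = {"low": 50, "typical": 100, "high (GPFS)": 200}

for label, effort in ESTIMATES.items():
    headcount = engineers_needed(effort, months=18)
    print(f"{label}: about {headcount:.0f} full-time engineers for 18 months")
```

Even under that generous assumption, the 100 person-year figure implies dozens of full-time engineers working for the whole 18 months.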

Even before Sun disbanded, many of the companies that had sent representatives to the meeting declined to have their engineers work on btrfs, which, of course, didn't help. But this was probably because companies are rational organizations that make their own decisions about return on investment, and funding a new file system didn't make as much sense as telling people that Linux would have an answer to ZFS.

In retrospect, while ZFS had these really cool features, they weren't enough to make most users choose Solaris over buying much cheaper x86 platforms and installing Linux. And by the time Sun decided to try the OpenSolaris and Solaris x86 strategy, it was too late. The network effects were enormous, and the x86 strategy didn't answer the question of how one company, Sun, could pay the salaries of all the super-talented engineers working on Solaris. Selling an x86 server for $5,000 doesn't bring in anything like the return of a $100,000 SunFire E10k SPARC server, which Sun called the "dot" in "dot com."

The point is that engineering in the real world is a trade-off, and business realities are part of that trade-off. I make no apologies for the fact that I prefer to eat food, and that I want to make enough money to retire someday. And that in turn means I have to be very clear about how I'm providing value to my employer that's at least 10 times my salary. If I can do that while still contributing to open source and helping other companies make money so they'll be willing to contribute to ext4, well, that's part of the challenge, and part of why I love working in open source.

And, returning to the Code of Conduct, I'll say that almost all the maintainers of the major file systems supported the Code, and not out of some weak liberal considerations. It's because we need every engineer willing to contribute to our project, and most of us have seen people who refused to work on Linux and switched to other operating systems (I know one person who had been a valuable Linux kernel developer at the IBM Linux Technology Center and switched to Windows), or who worked only on internal projects and on nothing that required interaction with LKML, because of the toxic environment created by a few people on the mailing list.

In some cases, the fears were unfounded; for example, in most cases Linus was yelling at a senior developer who really should have known better, someone he had met in person and had an established relationship with. The problem is that new engineers didn't know this, and were afraid of "what if Linus humiliates me in public like he did to Steve", not realizing that in practice, that's not going to happen. That's why we have the CoC; it's not for us senior engineers, it's to support the younger engineers on our teams whom we want to train so that they can eventually replace us when it's time for us to retire, or when we get hit by a bus or otherwise depart this mortal coil.

Don't forget about the 50-100 person-years of work it takes to build a file system that's enterprise-ready. We need all the engineers we can get, and many of us do extra work in our spare time because we care. Building a high-quality file system is a team effort, and we need every talented engineer we can get. Even if one engineer is a super 10x programmer, if that engineer ends up scaring off a bunch of other engineers who could be working on testing, performance tuning, etc., it's just not worth letting someone be a jerk.

Source: opennet.ru
