Memoirs: Red Hat
October 26, 2020

After the collapse of SiCortex, I got to experience modern unemployment. Among other things, this meant using COBRA for health insurance, and seeing how much that cost. I'm doing the same now BTW, even though this time the separation was voluntary. Another thing I did was interview at lots of places. I remember one of them was Iron Mountain, where a friend (from Dolphin and EMC) was a VP and they were trying to get into digital archiving as well as physical. But I took one look at the enterprise-Java thing they were building and politely ran away; it collapsed and IM got out of that business about a year later.
The one interview that really "clicked" was at Red Hat. They wanted someone to develop a "cloud filesystem" but were very carefully not saying what that was. The official line was that part of my role was to define it. One of the rules I was told about early in my time there was that nobody could ever spare time to help you when you were asking questions, but they'd sure tell you how you'd gotten it wrong after you'd come up with your own answers and done all the work. Too true, and not just at Red Hat. I guess it's a variation on the common adage that the fastest way to get a right answer on the internet is to post a wrong one. This was in 2009, when cloud stuff in general and cloud storage in particular were still pretty new. For example, OpenStack Swift wouldn't exist for another year. The project seemed interesting, so I joined.
After a couple of months, my ideas of what a cloud filesystem should be converged on what we would now recognize as multi-tenancy. Each user could have a separate namespace, a separate UID space, separate capacity and I/O quotas. In other words, the illusion of a private filesystem but with only one actual filesystem for administrators to deal with, plus all the efficiency and "elasticity" of not fragmenting resources across users. The initial name for the project was CloudFS, but we couldn't get a trademark for it. Apparently (I didn't find this out until later) that was because VMware had also called something CloudFS that wasn't even within artillery distance of an actual filesystem. People tacking "FS" onto things that aren't actually filesystems has always pissed me off. If you can't or won't do the hard bits, don't use the name. We went searching for a new name and came up with HekaFS because "heka" is an Egyptian word for magic. I never liked the name all that much, but I hated dealing with that stuff so I just rolled with it.
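To make that multi-tenancy idea a bit more concrete, here's a minimal Python sketch - purely illustrative, with made-up names, and not anything from HekaFS itself - of how one shared filesystem can present each tenant with its own namespace, its own UID range, and its own capacity quota.

```python
# Illustrative sketch only -- not HekaFS code. Each tenant gets the illusion of
# a private filesystem (namespace, UID space, quota) on top of one shared tree.

import os

class TenantView:
    def __init__(self, shared_root, tenant, uid_base, quota_bytes):
        self.root = os.path.join(shared_root, tenant)  # per-tenant namespace
        self.uid_base = uid_base                       # start of tenant's UID range
        self.quota_bytes = quota_bytes                 # capacity quota
        os.makedirs(self.root, exist_ok=True)

    def real_path(self, tenant_path):
        # Confine the tenant to its own subtree of the shared filesystem.
        path = os.path.normpath(os.path.join(self.root, tenant_path.lstrip("/")))
        if not path.startswith(self.root):
            raise PermissionError("path escapes tenant namespace")
        return path

    def map_uid(self, tenant_uid):
        # Translate the tenant's local UID into a globally unique one.
        return self.uid_base + tenant_uid

    def check_quota(self, new_bytes):
        # Enforce the capacity quota before allowing a write of new_bytes.
        used = sum(os.path.getsize(os.path.join(d, f))
                   for d, _, files in os.walk(self.root) for f in files)
        if used + new_bytes > self.quota_bytes:
            raise OSError("tenant over capacity quota")
```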
Having worked on creating two distributed filesystems before (Mango's Medley and EMC's MPFS), and having been exposed to a couple more that were already complete (Lustre and PVFS2), I wasn't actually all that interested in starting from scratch this time. I wanted to use an existing distributed filesystem as a base, and that's what led me to Gluster. Of all the options, it was by far the most modular. This was important not only from a technical perspective, but also because it meant we could license and release and support HekaFS as separate modules. The choice wasn't about technical quality or vision; if it had been, I might have chosen Ceph instead. Or not. Unlike Ceph at the time, Gluster already was an actual filesystem. It also didn't embody the metadata-server mistake made by Lustre (and mostly made again later by HDFS), and didn't just throw away data like MooseFS. That was pretty much the landscape at the time, so Gluster it was.
At first, I was the outsider on Gluster. It had originally been developed by a team in and/or from Bangalore (some had moved to Sunnyvale) and I was the only person outside of Gluster Inc. or Gluster Ltd. doing serious development work on it. This meant a lot of interaction via email and IRC, which was fine. What was less fine was how long it sometimes took to get answers, or how those answers would turn out to be out of date a week later because somebody at Gluster hadn't seen fit to share their plans for a rewrite of the part I was asking about. This tendency to hold things too closely continued to be a problem even much later when we were all (supposedly) part of the same team. I chalk it up to a culture/personality thing.
A lot of people still seem to think I was a significant player in Red Hat acquiring Gluster, or later Ceph. That's really not the case. I was briefly involved at the very beginning, enough to tell people everything I knew and to come up with the name for the M&A project ("matte" because it's an antonym for "lustre"). Then, by mutual agreement, I was completely cut out. They didn't need an advocate; I didn't need the hassle. That's the way these things actually work. I heard occasional hints that let me know things were still moving along, but the first time I heard about the actual acquisition it was on Twitter. There had apparently been an email, but I didn't see that until a couple of hours later.
With everyone now on the same team, the reasons for maintaining HekaFS separately from GlusterFS no longer applied. Accordingly, I was absorbed into the Gluster team, where I continued in a senior role and eventually became a project maintainer. I had no idea when I started at Red Hat that I'd be there twice as long as I'd been anywhere else (EMC had previously been the longest, at about four years). Ultimately, other priorities absorbed so much of my time that everything that had been part of HekaFS kind of died. So did a lot of my other initiatives, mostly around improving scalability. For example, twice I tried to replace Gluster's inefficient and inconsistency-prone replication system ("AFR") with something better. Both times, the code got repeatedly stalled by hostile code reviews until I and everyone else working on the effort got pulled away by other fires. There were a lot of political games like that. In between fixing hundreds of bugs, adding some minor features, and for a long time being the most active code reviewer on the project, my most significant contribution was probably "brick multiplexing", right at the end of my Red Hat tenure. This was a feature to let multiple "bricks" (storage servers) coexist within one process instead of each incurring its own (large) per-process memory overhead, consuming its own port, and contending with the others for CPU timeslices. By this time Red Hat was utterly consumed with container-mania, and this feature allowed us to support container workloads much more efficiently.
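The core resource argument behind brick multiplexing is easy to sketch. Here's a toy Python illustration - not Gluster's actual implementation, which is in C and far more involved - contrasting one-process-per-brick with many bricks sharing a single process.

```python
# Toy illustration of the brick multiplexing idea -- not Gluster code.
# Without multiplexing, every brick (exported storage directory) runs as its
# own daemon, paying its own per-process memory overhead and its own port.
# With multiplexing, many bricks live as threads inside one process and share
# those costs.

import threading

class Brick:
    """One exported storage directory; serve() stands in for the real RPC loop."""
    def __init__(self, export_path):
        self.export_path = export_path

    def serve(self):
        # A real brick would translate client RPCs into filesystem operations
        # on export_path; this placeholder just parks the thread.
        threading.Event().wait()

def run_multiplexed(export_paths):
    # One process hosts every brick: one memory image and (in real Gluster)
    # one listening port, instead of one of each per brick.
    threads = [threading.Thread(target=Brick(p).serve, daemon=True)
               for p in export_paths]
    for t in threads:
        t.start()
    return threads

# e.g. run_multiplexed(["/bricks/b1", "/bricks/b2", "/bricks/b3"])
```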
The other thing about my time on Gluster was the opportunities it gave me for travel. I went to Bangalore several times, which is a life experience I'll always treasure. I also got to attend "summits" in Barcelona, Berlin, and Prague - all of which were great in their own ways. There were also multiple instances of FUDcon and FAST and others, which weren't in quite such exciting places but were still very satisfying. For all of its reputation for frugality, Red Hat was actually pretty good to me when it came to travel.
But all good things must come to an end. I was frustrated with the politics within the Gluster team. I was even more frustrated about Gluster constantly being minimized and disrespected in favor of Ceph, despite the fact that we were the ones actually bringing in revenue. But mostly I was frustrated with the direction Red Hat was trying to take Gluster. There was a huge focus on containers, and on NFS/SMB service. Also, a frighteningly high percentage of Gluster installations - at least the ones people were paying Red Hat to support - were a mere handful of nodes. Those weren't the problems or environments I was interested in. My interests were mostly around greater scale and performance, but my efforts there were routinely thwarted. That wasn't a fight I wanted to keep having.
As it turns out, there was one company that was already running Gluster at far higher scale than anyone else, and was interested in pushing that even further. I'd met some members of the team at those aforementioned summits, and been very impressed by them. That company, of course, was Facebook.