Memoirs: From Clam to Dolphin
October 13, 2020
Word must have gotten around about the "success" of the 7135/REACT project, because we started getting requests to do similar things for other storage vendors. The first of these was Clariion, before it got slurped up by EMC. They wanted essentially the same functionality for their disk arrays, under the name ATF (Automatic Transparent Failover), so they contracted with us to do the AIX version and someone else to do the Solaris version. This ended up being pretty similar to REACT, minus some of the low-level cable issues and the fact that everyone was local. I had to drive out to Westborough (or maybe it was technically Southborough) pretty regularly, but that was only a bit further than driving in to Cambridge.
There was also nothing quite like the "pen in the fan" problem, but there were plenty of garden-variety things to debug, and one bug so weird that it's worth a paragraph or two. Clariion had some fancy management tools that ran on a host and used special SCSI commands to communicate with the disk array. During testing, we discovered that if we were in the middle of certain operations and someone logged in to the host with ssh then things would hang. WTF? These are two things that should never be related - one in the storage world and one in the network world. Fortunately, during my earlier lock-manager project I'd learned a thing or two about locking in the AIX kernel. In this case, the culprit was a global lock that was taken by many ioctl calls in both domains. Working around this had been pretty easy in the lock-manager case, because I was already in the kernel and could just call the unlocked ioctl function. (This is one of the things the AIX team had objected to, but had to accept because it was their goof and they had no cleaner alternatives to offer.) In this case, I wrote a little AIX kernel extension with the sole purpose of doing a similar lock bypass. Problem solved, we moved on.
The funniest part is that, a couple of months after I'd left Clam, I got a slightly panicked call from the person who had inherited that project. That tiny little kernel extension had become an essential part of their AIX products, and somehow they'd lost the source code to it. Did I by chance remember anything at all about what it did or how it worked? Fortunately, even though it had been at least a year since I had given it a single thought, I did remember and she was able to reconstruct it from the information I provided.
After ATF, we got even more requests for the same thing. I think Hitachi was one, or maybe it was Fujitsu. I know for sure that EMC was another. The thing that was interesting about these is that they weren't the same active/passive dual-controller architecture as the arrays for which REACT and ATF had been written. On these newer arrays all controllers could be active on the same LUN simultaneously, which made things much simpler. There was a lot of money to be made, which some of my Clam friends went on to do at another company (more about that later), but personally I didn't see a lot of learning to be done. I was still hungry for new challenges, and that's how I ended up at Dolphin in 1996.
Dolphin Interconnect Solutions was a pretty interesting place. One part of the company was the remnants of KSR (see my earlier post for more about my initial encounter with them), including about a dozen people in the US and a pile of intellectual property. Having been among the most senior people at Clam, I was the most junior on my new team by far. The other part of the company had itself been spun off from Norsk Data (which had very briefly been in discussions with Encore about something or other while I was there) in Oslo, Norway. Their specialty was Scalable Coherent Interface, a.k.a. IEEE 1596, which was a standard for a fully cache-coherent bus/interconnect. These two disparate parts were brought together by a VC company that specialized in combining American and Scandinavian tech assets in this way. Weird niche, but OK. The Oslo team had built a non-coherent PCI-SCI bridge (PSB) and was working on a cache-coherency chip (SCC) on top of it. Our team was responsible for using these building blocks to enable ever more tightly coupled kinds of clusters. Here's a quick overview of the system that I was working on.
- At the lowest level were communication links that ran at 1.6Gb/s initially, 4.0Gb/s by the time I left. This was way out in front of practically everyone else in 1996. Neither Ethernet nor Fibre Channel had hit even 1Gb/s yet. InfiniBand didn't exist yet, and probably would never have existed without SCI having gone before. I still have a Dolphin mug with the slogan "products for the gigabit generation" somewhere. It made sense at the time, but now it's more of a self-slam.
- Yes, I had a lot of trouble typing "SCI" after having typed "SCSI" so often before. Later, when I went back to working on SCSI and FC, I had the exact opposite problem. In fact, I just screwed up typing "SCI" in the last paragraph and that reminded me.
- Above the link layer was a topology layer. SCI was based on multiple rings (or sometimes "ringlets") combined to create more contention-free links than a star topology would have. This is where the "scalable" part came from.
- At the top level was supposed to be a cache coherency layer, which in our case would be handled by a separate chip. This was an early and extreme form of NUMA (Non-Uniform Memory Access) or even COMA (Cache-Only Memory Architecture) from the KSR side.
- Finally, there would be software to create a new kind of cluster in which some memory was shared but some wasn't, with a single system image across the whole thing.
That was the long-term vision, on which most of my colleagues worked. Meanwhile, even though the SCC wasn't ready we still had to sell something to stay afloat. What we already had could work as a fast alternative to Ethernet, if only it had the right drivers, and that became my job. I set about writing a DLPI driver for UnixWare (no idea why we chose that dog first) and an NDIS driver for Windows NT.
As part of all this, I got to work in Oslo for a couple of months, split across two visits. I really enjoyed my time there, walking and/or taking the excellent public transportation everywhere. I visited the Viking ship museum and the Fram museum and the Munch museum and Vigeland park in my spare time. I developed an unhealthy obsession with Bislett kebab and Troika candy bars. I called home to my wife, who asked why I was speaking with a Norwegian accent. I'm one of those people who picks up accents very easily, and as far as I was concerned most of the people around me were speaking English with an accent because speaking Norwegian to me would have been pointless, so I couldn't really help but start speaking the same way.
As fun as all that was, my friend Mike (from Clam) was telling me about something that sounded much more interesting than DLPI and NDIS drivers. Coming up next: Mango.