Navigating a Large Codebase
September 20, 2018
The topic of how to learn a large codebase came up today on Twitter. Since I have a lot of experience with that, and am doing it right now for my new (to me) project at Facebook, I figure maybe I can provide some useful hints. Here goes.
First and foremost, whatever you learn as you're getting up to speed, write it down. Time after time, I've found that the most useful tips do not come from the people who originally wrote the code, or even from the most senior current developers. More often than not, they're from the very last person to walk the beginner's path before me. They have the same perspective and priorities as I do, all based on current code and still fresh in their minds. Once you're up to speed and start to specialize in certain things - and specialization is almost the definition of what makes a codebase "large" in the first place - you'll start to lose that too. Capture what newbies need to know while you still can.
Second, master your tools. On a small project, you can get away with just paging up and down in a plain text editor. That's not going to work on a large project. You're going to need some sort of faster navigation. At the very least, you need something like cscope that lets you launch a search for a highlighted string across a defined set of files, jump to one of those occurrences quickly, and then jump back just as quickly to resume your previous train of thought. The more keystrokes that takes, the worse it is for your concentration. It's even better if you can use a full IDE with a proper language-specific cross-reference tool to search on function- or class-specific symbols instead of mere strings, but those have their own costs, especially with very large repositories that are either slow to download and update locally or slow to access remotely. But don't just try to get by with basic vim and grep. Learn something better.
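To make the "symbols instead of mere strings" distinction concrete, here's a minimal sketch in Python. It is not how cscope or any IDE actually works; it only approximates a symbol search by matching whole identifiers, so that looking for read doesn't also turn up read_block or pread. The function name find_symbol and the default list of file suffixes are made up for illustration.

```python
import re
from pathlib import Path


def find_symbol(root, symbol, suffixes=(".py", ".c", ".h")):
    """Return (path, line_number, line_text) for whole-word matches of symbol.

    A crude stand-in for a real cross-reference tool: it knows nothing about
    scopes or types, it only refuses to match inside longer identifiers.
    """
    pattern = re.compile(r"\b" + re.escape(symbol) + r"\b")
    hits = []
    for path in sorted(Path(root).rglob("*")):
        # Only look at regular files with one of the chosen suffixes.
        if not path.is_file() or path.suffix not in suffixes:
            continue
        try:
            text = path.read_text(errors="replace")
        except OSError:
            continue  # unreadable file: skip it rather than abort the search
        for number, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((str(path), number, line.strip()))
    return hits
```

Calling find_symbol(".", "read") from the top of a checkout returns a list you can walk through one hit at a time. A real tool adds the part that matters most: jumping to a hit and back again in a keystroke or two.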
I often find that the best way to start learning code is to look at everything but the code. Start with some very basic boxes and lines to show the basic component breakdown, but don't get too hung up on the details. Then start looking at the other information that's not code. For years, I've started with things like RPC definitions and on-disk data structures. Learn the basics of what information the system keeps, which code produces it, which code consumes it, and how it's exchanged. This usually does mean some exploration of code, but for now the focus should be on the data. Monitoring information can also be extremely valuable at this point. What information do the people actually running and debugging the system consider most important? Are there dashboards you can look at? One lesson more developers need to learn is that operators often understand the code better than you do in some ways, even if you wrote it. You only know how it was supposed to work; they know how it does work. So find some operators, or become one yourself, to get a feel for things.
Now, at last, it's time to dive into code. I like to start by picking some common kind of operation, like a read or write in a storage system, and just follow where it goes. Start with the conceptual level, from box to box in your top-level diagram. Then you can dive into the function calls or messages that move the control from one box to another, and finally through the various layers within a single box. A debugger can be a great exploratory tool here. Step through a few functions, then pause and look around at what data structures it's using, then step through a few more, etc. Then pick a different flow, moving from the common ones to the more exotic failure-recovery or background-processing ones.
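When a debugger is awkward to attach, a call trace can give you a similar "where does this operation go" picture. Here's a small sketch using Python's sys.setprofile hook to record which functions a single operation passes through and how deeply nested each one is. The toy write, append_journal, and checksum functions are hypothetical stand-ins for a real storage path, and trace_calls is a name I made up.

```python
import sys


def trace_calls(func, *args, **kwargs):
    """Run func and return (result, calls); calls is a list of (depth, name)."""
    calls = []
    depth = 0

    def profiler(frame, event, arg):
        nonlocal depth
        if event == "call":  # a Python-level function was entered
            calls.append((depth, frame.f_code.co_name))
            depth += 1
        elif event == "return":  # a Python-level function is exiting
            depth -= 1
        # Calls into C builtins arrive as "c_call" events and are ignored here.

    sys.setprofile(profiler)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.setprofile(None)  # always remove the hook, even on an exception
    return result, calls


# Toy "write path" standing in for a real system.
def checksum(data):
    return sum(data) % 256


def append_journal(journal, data):
    journal.append((checksum(data), data))


def write(journal, data):
    append_journal(journal, data)
    return len(data)


if __name__ == "__main__":
    _, calls = trace_calls(write, [], b"hello")
    for depth, name in calls:
        print("  " * depth + name)
```

The indented output is a rough map of one flow. Run it for a read, then for a failure-recovery path, and compare the shapes before you go stepping through any of them line by line.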
Once you've figured out how something generally works, break it. Yes, I'm totally serious. Go make a change that seems at least semi-sensible. Maybe skip a step in a case where it seems unnecessary. Reorder steps. Find some decision point and make it use a slightly different calculation, or one based on slightly different information. Figure out what you need to look at to know whether the change was a good one or a bad one. Maybe even write a test for it. Then let 'er rip. Worst case, maybe you accidentally improve the code. Far more often, you get to learn how to debug code and figure out why it caused things to fail. What bad thing happened, or what good thing didn't happen? What data changed in which wrong way? You can learn a lot by answering such questions, and just as importantly get past the psychological "omigod I might break something" stage. Of course you might break something. Of course you will break something, and that's OK. The important thing is to get even better at unbreaking things, and the only way to do that is to practice.
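Here's what "skip a step, then write a test for it" might look like in miniature. Everything below is hypothetical: a toy write path that journals each record before applying it, a skip_journal flag that plays the part of your semi-sensible change, and a check that tells you whether the change was a good one.

```python
def write_record(store, journal, key, value, skip_journal=False):
    """Toy write path: journal the record first, then apply it to the store."""
    if not skip_journal:  # the "seemingly unnecessary" step we try skipping
        journal.append((key, value))
    store[key] = value


def recover(journal):
    """Rebuild the store from the journal alone, as if after a crash."""
    store = {}
    for key, value in journal:
        store[key] = value
    return store


def survives_crash(skip_journal):
    """Write one record, then check that recovery reproduces the live store."""
    store, journal = {}, []
    write_record(store, journal, "a", 1, skip_journal=skip_journal)
    return recover(journal) == store


if __name__ == "__main__":
    print("with journal:   ", survives_crash(skip_journal=False))
    print("without journal:", survives_crash(skip_journal=True))
```

Nothing fails on the normal path when the journal write is skipped. The damage only shows up when you look at what recovery can see, which is exactly the "what good thing didn't happen?" question.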
At this point, you can probably start taking on some real challenges, making even more ambitious changes that you expect to be more permanent. Congratulations, you're now a competent developer on whatever it was. Maybe in a few more years you'll even understand it well enough to consider designing its replacement. ;)