I'm working on ways to better convey the global structure of programs. The goal: use an open-source tool, get an idea for a simple tweak, fork the repo, orient yourself, and make the change you visualized -- all in a single afternoon. Understanding a strange codebase is hard; I can't change that, but I think I can make it easier for people to persevere. I want to carve steps into the wall of the learning process. I want to replace quantum leaps of understanding after weeks of effort with an hour of achievement for an honest hour (or three) of effort.
This focus on helping outsiders comprehend a project is unconventional. I'm less concerned about the readability of a codebase; the usual rhetoric around ‘readability’ tends to focus on helping authors merge contributions rather than helping others understand and appreciate their efforts. If you've ever seen an open-source project whose CONTRIBUTING document consists of a nit-picky list of formatting rules and procedures for submitting patches, you know what I mean. There's a paucity of guidance earlier in the pipeline, when newcomers aren't yet thinking about sending a patch, just trying to keep their heads above water in a sea of abstractions. I think improving this guidance might greatly increase citizen involvement in open source -- the number of eyeballs reviewing code rather than simply using projects and treating their internals as externalities until the next serious security vulnerability. Our society is more antifragile when there's greater grassroots oversight of the software that is eating our world.
Not everyone has to understand every line of code that helps manage their lives, but all software should reward curiosity.
Changing society seems hard. Where to begin? One hint is the observation that early versions of most software are often surprisingly easy to understand. There's a pervasive tendency for software to accumulate accidental complexity over time, making it harder to understand, more brittle, and harder to change. If we could resist this ossification we'd make strides in keeping the global structure of a program accessible. Such creeping accidental complexity has at least three causes:
A. Backwards compatibility considerations. Early mistakes in the design of an interface are often perpetuated indefinitely. Supporting them takes code. Projects that add many new features also accumulate many missteps. Over time the weight of these past adaptations starts to prevent future adaptation.
B. Churn in personnel. If a project lasts long enough, early contributors eventually leave and are replaced by new ones. The new ones have holes in their knowledge of the codebase: all the facilities it provides, the reasons design decisions were made just so. Peter Naur pointed out back in 1985 the odd fact that no matter how much documentation we write, we can't seem to help newcomers understand our programs without talking to the original authors. In-person interactive conversations tend to be a precious resource; there are only so many of them newcomers can have before they need to start contributing to a project, and there's only so much bandwidth the old hands have to review changes for unnecessary complexity or over-engineering. Personnel churn is a lossy process; every generation of programmers on a project tends to know less about it, and to be less in control of it.
C. Vestigial features. Even after accounting for compatibility considerations, projects past a certain age often have features that can be removed. However, such simplification rarely happens because of the risk of regressions. We forget precisely why we did what we did, and that forces us to choose between reintroducing regressions or continuing to cargo-cult old solutions long after they've become unnecessary.
There may be other phenomena I haven't considered, but these three suffice to illuminate a crucial point: they're independent of most technology choices we tend to think about. A new language or tool will at best have a short-term effect unless we're able to keep the big picture of a codebase comprehensible over time as members join and leave it.
Constraints on a solution
In direct correspondence to the above list, I've nailed down the following primary design invariants:
A. Minimize compatibility constraints. We can't avoid users creating habits and muscle memory in the tools they use, but things are different when we're building libraries for other programmers. In that situation the user of our creation is another programmer with the ability to empathize with our situation if given the right context, and without our API deeply embedded in muscle memory.
B. Be friendly to outsiders, because they will be the insiders of tomorrow. Many best practices we teach programmers today help insiders manage a project but hinder understanding in newcomers. In a strange new project, straight-line code is usually easier to follow than lots of indirection and abstraction. Comments are of limited value because most explain local features but fail to put them in a global context. Build systems that automate a lot of work in our specialized industrial-strength setups turn out to be brittle when run for the first time on someone's laptop.
C. Be rewrite-friendly. Rewrites are traditionally considered a bad idea, but the question shouldn't be whether they're a good or bad idea. Software is the most malleable medium known to man; we should do all we can to embrace the essential malleability that is its primary characteristic and boon. If rewriting is stressful and error-prone, perhaps we're doing software all wrong in the first place.
These can seem like difficult constraints to satisfy independently, let alone simultaneously, but it seems likely that conveying global structure lies on the road to them all. If a project is to remain easy to understand over the long term, it can't afford historical baggage; it is by definition friendly to outsiders; and it must constantly be getting rewritten.
Having carved out these constraints, everything else is open to question. I'm willing to try global variables, large functions, even goto statements if I can trade comprehension in the large for readability in the small. Our instincts on the right way to manage complexity have failed to yield dividends, so it's time to go against our instincts.
My research within these constraints happens in the Mu project. Mu's current goal is a computer that allows anyone to audit its inner workings. The hypothesis is that being able to easily answer questions about a program will encourage us to ask more questions, in much the way usage of an encyclopaedia breeds more usage. Rewarding curiosity will stimulate curiosity, leading to a virtuous cycle where an order of magnitude more people grow to understand how their computer works as they use it. Contrast such a virtuous cycle with the vicious cycle we have today, where compatibility guarantees breed complexity, which requires new abstractions to manage, which in turn lead to proliferating compatibility guarantees.
Mu is designed around two core concepts: traces and fake hardware.
- Traces answer the question, "what did my computer just do?" They present answers in terms of facts deduced, which in turn require an audit trail built up out of finer-grained facts. Any operation should be decomposable into a parsimonious combination of strictly simpler operations, terminating at some ground level without cyclic dependencies or circular reasoning.
- Fake hardware allows Mu computers to test I/O, which permits the encoding of arbitrary desires, which in turn permits answers to the question, "what happens if I make this change?" Just make the change and rerun all tests. Since tests use fake hardware, you can be confident no change will have destructive consequences until you've had a chance to explore its implications.
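To make the fake-hardware idea concrete, here's a minimal sketch in Python rather than Mu's actual syntax; `FakeScreen`, `print_line`, and `greet` are hypothetical names invented for illustration, not part of Mu:

```python
# Hypothetical sketch of testing against fake hardware.
# A FakeScreen stands in for a real display and simply
# records everything written to it.

class FakeScreen:
    def __init__(self):
        self.lines = []

    def print_line(self, text):
        self.lines.append(text)

def greet(screen, name):
    # Code under test writes to whatever screen it is handed,
    # so tests can pass in a FakeScreen and perform no real I/O.
    screen.print_line("hello, " + name)

def test_greet():
    screen = FakeScreen()
    greet(screen, "world")
    assert screen.lines == ["hello, world"]

test_greet()
```

Because every test exercises fakes, rerunning the whole suite after a change can never scribble on a real display or disk; that containment is what makes fearless experimentation possible.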
Building traces and fake hardware deep into the stack allows them to be used together with compounding benefits. Beyond avoiding regressions, tests become a store of stories about a computer, with the trace for each test setting a scene and exploring a set of implications. Drilling down from facts into a trace to the subsidiary facts that led to their deduction helps debug issues. Tests can inspect the trace as well (white-box tests), to make more precise and more comprehensive assertions than merely inspecting outputs (black-box tests) would allow.
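As a sketch of the black-box/white-box distinction (again in Python rather than Mu, with hypothetical names), imagine every operation appending facts to a trace, and a white-box test asserting on the trace in addition to the output:

```python
# Hypothetical sketch of white-box testing against a trace.
# The trace records facts about *how* a result was computed,
# not just what the result was.

trace = []

def log(fact):
    trace.append(fact)

def lookup(table, key):
    # Record the facts this operation deduces as it runs.
    log(f"lookup: key {key!r}")
    if key in table:
        log(f"lookup: hit {key!r}")
        return table[key]
    log(f"lookup: miss {key!r}")
    return None

def test_lookup():
    trace.clear()
    result = lookup({"x": 42}, "x")
    assert result == 42                # black-box: check the output
    assert "lookup: hit 'x'" in trace  # white-box: check how we got it

test_lookup()
```

The second assertion pins down behavior a black-box test can't see: two implementations could return the same output while deducing very different facts along the way, and only the trace distinguishes them.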
I also hypothesize that building these mechanisms deep into the stack from day 1 will radically impact the culture of an eco-system in a way that no bolted-on tool or service at higher levels can replicate. It would be easier to be confident that an app is free from regression if all automated tests pass. This would make the stack easy to rewrite and simplify by dropping features, without fear that a subset of targeted apps might break. As a result people might fork projects more easily, adding and particularly deleting unused features, exchanging code between disparate forks (copy the tests over, then try copying code over and making tests pass, rewriting and polishing where necessary). The community would have in effect a diversified portfolio of forks, a “wavefront” of possible combinations of features and alternative implementations of features instead of the single trunk with monotonically growing complexity that we get today. Application writers who wrote thorough tests for their apps (something they just can’t do today) would be able to bounce around between forks more easily without getting locked in to a single one as currently happens.
If these techniques are so great, why haven't they been tried before? I think it's because of a ‘drawback’ they all share: they're too powerful, and it's easy to shoot yourself in the foot. It takes taste to trace just domain-independent facts and not implementation details, to make white-box tests robust to radical changes. But I think requiring taste is a good thing. A prime reason we forget crucial details about codebases is that we create rules and processes around them, and those who follow us have no reason to remember the original reasons for the rules and processes. If every generation were to be allowed to make its own mistakes, the reasons would stay ‘in the air’ and not get lost. Our software would better reward curiosity.
Last updated May 21, 2021