Sep 6, 2015
Teaching taste

You can know the name of a bird in all the languages of the world, but when you're finished, you'll know absolutely nothing whatever about the bird."Richard Feynman

I've been teaching programming using my new UI for just about six weeks now, and it's been miraculous and endlessly fascinating to watch a young mind up close. I'm still processing many of my observations and lessons, but I want to describe how one of my students learned something I never meant to teach: consistent indentation.

The only thing I did to help him learn indentation was this: I never ever brought it up. Mostly because I just don't consider it very important, especially at such an early stage. When I wrote programs to demonstrate features I indented as I normally would, then I'd hand over the keyboard, and ignore the fact that my student was abutting each line of code with the left margin.

Read more →


* *
Jul 21, 2015
An experimental UI for teaching programming

Six months ago I fell into a little gig to teach two students programming, and it's been an eye-opening experience. Where I was earlier focused on conveying codebases to programmers, I've lately been thinking a lot harder about conveying programming to non-programmers. I think the two might be special-cases of a grand unifying theory of software representation, but that's another story. For now I don't have a grand unifying theory. What I have is a screenshot:

Let me describe the tool and the problems that it tries to address. When I started out teaching I picked an existing platform that I'd always liked. But quickly I started noticing many limitations. Today's platforms are great for experienced programmers and one can do great things with them, but they are very alien to noobs, who suddenly have to learn many different things at once.

Read more →


* *
Oct 3, 2014
Literate programming: Knuth is doing it wrong

Literate programming advocates this: Order your code for others to read, not for the compiler. Beautifully typeset your code so one can curl up in bed to read it like a novel. Keep documentation in sync with code. What's not to like about this vision? I have two beefs with it: the ends are insufficiently ambitious by focusing on a passive representation; and the means were insufficiently polished, by over-emphasizing typesetting at the cost of prose quality. Elaboration, in reverse order:

Read more →


* *
Sep 9, 2014
Consensual hells: Geopolitics for individuals

My third guest post at


* *
Apr 9, 2014
Consensual Hells: The legibility tradeoff

My second guest post has just been published over at, on the subject of building capture-resistant organizations. Tl;dr - constrain individual discretion at the top, and empower it on the fringes. But the direct ways to do that won't work. What will? Read on →


* *
Feb 12, 2014
Consensual Hells: From cognitive biases to institutional decay

I've just published the first of a series of guests posts over at Check it out.


* *
Oct 7, 2013
A new way to organize programs

If you're a programmer this has happened to you. You've built or known a project that starts out with a well-defined purpose and a straightforward mapping from purpose to structure in a few sub-systems.
But it can't stand still; changes continue to roll in.
Some of the changes touch multiple sub-systems.
Each change is coherent in the mind of its maker. But once it's made, it diffuses into the system as a whole.
Soon the once-elegant design has turned into a patchwork-quilt of changes. Requirements shift, parts get repurposed. Connections proliferate.
Veterans who watched this evolution can keep it mostly straight in their heads. But newcomers see only the patchwork quilt. They can't tell where one feature begins and another ends.

What if features could stay separate as they're added, so that newcomers could see a cleaned-up history of the process?

Solution Read more →


* *
Oct 6, 2013
How I describe my hobby to programmers

(I still haven't found a way to describe it to non-programmers.)

Wart is an experimental, dynamic, batshit-liberal language designed for small teams of hobbyists writing potentially long-lived software. The primary goal is to be easy for anyone to understand and modify.

$ git clone
$ git checkout c73dcd8d6  # state when I wrote this article
$ cd wart
$ ./wart      # you'll need gcc and unix
ready! type in an expression, then hit enter twice. ctrl-d exits.
⇒ 2

Read more →


* *
Aug 13, 2013
The trouble with 'readability'

We programmers love to talk about the value of readability in software. But all our rhetoric, even if it were practiced with diligence, suffers from a giant blind spot.

Exhibit A

Here's Douglas Crockford on programming style. For the first half he explains why readability is important: because our brains do far more subconsciously than we tend to realize. The anecdotes are interesting and the presentation is engaging, but the premise is basically preaching to the choir. Of course I want my code to be readable. Sensei, show me how!

But when he gets to ‘how’, here is what we get: good names, comments and consistent indentation. Wait, what?! After all that discussion about how complex programs are, and how hard to understand, do we really expect to make a dent on global complexity with a few blunt, local rules? Does something not seem off?

Exhibit B

Here's a paean to the software quality of Doom 3. It starts out with this utterly promising ideal:

Local code should explain, or at least hint at, the overall system design.

Unfortunately we never hear about the 'overall system design' ever again. Instead we get.. good names, comments and indentation, culminating in the author's ideal of beauty:

The two biggest things, for me at least, are stylistic indenting and maximum const-ness.

I think the fundamental reasons for the quality of Doom 3 have been missed. Observing superficial small-scale features will only take you so far in appreciating the large-scale beauty of a program.

Exhibit C

Kernighan and Pike's classic Practice of Programming takes mostly the code writer's part. For reading you're left again with guidelines in the small: names, comments and indentation.


I could go on and on. Everytime the discussion turns to readability we skip almost unconsciously to style guides and whatnot. Local rules for a fundamentally global problem.

This blind spot is baked into the very phrase ‘readable code’. ‘Code’ isn't an amorphous thing that you manage by the pound. You can't make software clean simply by making all the ‘code’ in it more clean. What we really ought to be thinking about is readable programs. Functions aren't readable in isolation, at least not in the most important way. The biggest aid to a function's readability is to convey where it fits in the larger program.

Nowhere is this more apparent than with names. All the above articles and more emphasize the value of names. But they all focus on naming conventions and rules of thumb to evaluate the quality of a single name in isolation. In practice, a series of locally well-chosen names gradually end up in overall cacophony. A program with a small harmonious vocabulary of names consistently used is hugely effective regardless of whether its types and variables are visibly distinguished. To put it another way, the readability of a program can be hugely enhanced by the names it doesn't use.

Part of the problem is that talking about local features is easy. It's easy to teach the difference between a good name and a bad name. Once taught, the reader has the satisfaction of going off to judge names all around him. It's far harder to show a globally coherent design without losing the reader.

Simple superficial rules can be applied to any program, but to learn from a well-designed program we must focus on what's unique to it, and not easily transferred to other programs in disparate domains. That again increases the odds of losing the reader.

But the largest problem is that we programmers are often looking at the world from a parochial perspective: “I built this awesome program, I understand everything about it, and people keep bugging me to accept their crappy changes.” Style guides and conventions are basically tools for the insiders of a software project. If you already understand the global picture it makes sense to focus on the local irritations. But you won't be around forever. Eventually one of these newcomers will take charge of this project, and they'll make a mess if you didn't talk to them enough about the big picture.

Lots of thought has gone into the small-scale best practices to help maintainers merge changes from others, but there's been no attempt at learning to communicate large-scale organization to newcomers. Perhaps this is a different skill entirely; if so, it needs a different name than ‘readability’.


* *
Jun 9, 2013
A new way of testing

There's a combinatorial explosion at the heart of writing tests: the more coarse-grained the test, the more possible code paths to test, and the harder it gets to cover every corner case. In response, conventional wisdom is to test behavior at as fine a granularity as possible. The customary divide between 'unit' and 'integration' tests exists for this reason. Integration tests operate on the external interface to a program, while unit tests directly invoke different sub-components.

But such fine-grained tests have a limitation: they make it harder to move function boundaries around, whether it's splitting a helper out of its original call-site, or coalescing a helper function into its caller. Such transformations quickly outgrow the build/refactor partition that is at the heart of modern test-based development; you end up either creating functions without tests, or throwing away tests for functions that don't exist anymore, or manually stitching tests to a new call-site. All these operations are error-prone and stress-inducing. Does this function need to be test-driven from scratch? Am I losing something valuable in those obsolete tests? In practice, the emphasis on alternating phases of building (writing tests) and refactoring (holding tests unchanged) causes certain kinds of global reorganization to never happen. In the face of gradually shifting requirements and emphasis, codebases sink deeper and deeper into a locally optimum architecture that often has more to do with historical reasons than thoughtful design.

I've been experimenting with a new approach to keep the organization of code more fluid, and to keep tests from ossifying it. Rather than pass in specific inputs and make assertions on the outputs, I modify code to judiciously print to a trace and make assertions on the trace at the end of a run. As a result, tests no longer need call fine-grained helpers directly.

An utterly contrived and simplistic code example and test:

int foo() { return 34; }
void test_foo() { check(foo() == 34); }

With traces, I would write this as:

int foo() {
  trace << "foo: 34";
  return 34;
void test_foo() {
  check_trace_contents("foo: 34");

The call to trace is conceptually just a print or logging statement. And the call to check_trace_contents ensures that the 'log' for the test contains a specific line of text:

foo: 34

That's the basic flow: create side-effects to check for rather than checking return values directly. At this point it probably seems utterly redundant. Here's a more realistic example, this time from my toy lisp interpreter. Before:

void test_eval_handles_body_keyword_synonym() {
  run("f <- (fn (a b ... body|do) body)");
  cell* result = eval("(f 2 :do 1 3)");
  // result should be (1 3)
  check(car(result) == new_num(1));
  check(car(cdr(result)) == new_num(3));


void test_eval_handles_body_keyword_synonym() {
  run("f <- (fn (a b ... body|do) body)");
  run("(f 2 :do 1 3)");
  check_trace_contents("(1 3)");

(The code looks like this.)

This example shows the key benefit of this approach. Instead of calling eval directly, we're now calling the top-level run function. Since we only care about a side-effect we don't need access to the value returned by eval. If we refactored eval in the future we wouldn't need to change this function at all. We'd just need to ensure that we preserved the tracing to emit the result of evaluation somewhere in the program.

As I've gone through and 'tracified' all my tests, they've taken on a common structure: first I run some preconditions. Then I run the expression I want to test and inspect the trace. Sometimes I'm checking for something that the setup expressions could have emitted and need to clear the trace to avoid contamination. Over time different parts of the program get namespaced with labels to avoid accidental conflict.

check_trace_contents("eval", "=> (1 3)");

This call now says, "look for this line only among lines in the trace tagged with the label eval." Other tests may run the same code but test other aspects of it, such as tokenization, or parsing. Labels allow me to verify behavior of different subsystems in an arbitrarily fine-grained manner without needing to know how to invoke them.

Other codebases will have a different common structure. They may call a different top-level than run, and may pass in inputs differently. But they'll all need labels to isolate design concerns.

The payoff of these changes: all my tests are now oblivious to internal details like tokenization, parsing and evaluation. The trace checks that the program correctly computed a specific fact, while remaining oblivious about how it was computed, whether synchronously or asynchronously, serially or in parallel, whether it was returned in a callback or a global, etc. The hypothesis is that this will make high-level reorganizations easier in future, and therefore more likely to occur.


As I program in this style, I've been keeping a list of anxieties, potentially-fatal objections to it:

  • Are the new tests more brittle? I've had a couple of spurious failures from subtly different whitespace, but they haven't taken long to diagnose. I've also been gradually growing a vocabulary of possible checks on the trace. Even though it's conceptually like logging, the trace doesn't have to be stored in a file on disk. It's a random-access in-memory structure that can be sliced and diced in various ways. I've already switched implementations a couple of times as I added labels to namespace different subsystems/concerns, and a notion of frames for distinguishing recursive calls.

  • Are we testing what we think we're testing? The trace adds a level of indirection, and it takes a little care to avoid false successes. So far it hasn't been more effort than conventional tests.

  • Will they lead to crappier architecture? Arguably the biggest benefit of TDD is that it makes functions more testable all across a large program. Tracing makes it possible to keep such interfaces crappier and more tangled. On the other hand, the complexities of flow control, concurrency and error management often cause interface complexity anyway. My weak sense so far is that tests are like training wheels for inexperienced designers. After some experience, I hope people will continue to design tasteful interfaces even if they aren't forced to do so by their tests.

  • Am I just reinventing mocks? I hope not, because I hate mocks. The big difference to my mind is that traces should output and verify domain-specific knowledge rather than implementation details, and that it's more convenient with traces to selectively check specific states in specific tests, without requiring a lot of setup in each test. Indeed, one way to view this whole approach is as test-specific assertions that can be easily turned on and off from one test to the next.

  • Avoiding side-effects is arguably the most valuable rule we know about good design. Could this whole approach be a dead-end simply because of its extreme use of side-effects? Arguably these side-effects are ok, because they don't break referential transparency. The trace is purely part of the test harness, something the program can be oblivious to in production runs.

The future

I'm going to monitor those worries, but I feel very optimistic about this idea. Traces could enable tests that have so far been hard to create: for performance, fault-tolerance, synchronization, and so on. Traces could be a unifying source of knowledge about a codebase. I've been experimenting with a collapsing interface for rendering traces that would help newcomers visualize a new codebase, or veterans more quickly connect errors to causes. More on these ideas anon.


* *
Making the big picture easy to see, in software and in society at large.
Code (contributions)
Prose (shorter; favorites)
favorite insights
Social Dynamics
Social Software