Nov 26, 2012
Software libraries don't have to suck
When I said that libraries suck, I wasn't being
precise.[1] Libraries do lots of things
well. They allow programmers to quickly prototype new ideas. They allow names
to have multiple meanings based on context. They speed up incremental
recompiles, and they allow programs on a system to share code pages in RAM. Back
in the desktop era, they were even units of commerce. All this is good.
What's not good is the expectation they all-too-frequently set for their
users: go ahead, use me in production without understanding me. This
expectation has ill effects for both producers and consumers. Authors of
libraries prematurely freeze their interfaces in a futile effort to spare
their consumers inconvenience. Consumers of libraries have gotten trained to
think that they can outsource parts of their craft to others, and that waiting
for 'upstream' to fill some gap is better than hacking a solution yourself and
risking a fork. Both of these are bad ideas.
To library authors
Interfaces aren't made in one big-bang moment. They evolve. You write code for
one use case. Then maybe you find it works in another, and another. This
organic process requires a lengthy gestation period.[2]
When we try to shortcut it, we end up with heavily-used interfaces that will
never be fixed, even though everyone knows they are bad.
A prematurely frozen library doesn't just force people to live with it. People
react to it by wrapping it in a cleaner interface. But then they prematurely
freeze the new interface, and it starts accumulating warts and bolt-on
features just like the old one. Now you have two
interfaces. Was forking the existing interface really so much worse an
alternative? How much smaller might each codebase in the world be without all
the combinatorial explosion of interfaces wrapping other interfaces?
Just admit up-front that upgrades are non-trivial. This will help you maintain
a sense of ownership for your interfaces, and make you more willing to
gradually do away with the bad ideas.
More changes to the interface will put more pressure on your development
process. Embrace that pressure. Help users engage with the development
process. Focus on making it easier for users to learn about the
implementation and the process of filing bugs.
Often the hardest part of filing a bug for your users is figuring out where to
file it. What part of the stack is broken? No amount of black-box architecture
astronomy will fix this problem for them. The only solution is to help them
understand their system, at least in broad strokes. Start with your library.
Encourage users to fork you. "I'm not sure this is a good idea; why don't we
create a fork as an A/B test?" is much more welcoming than "Your pull request
was rejected." Publicize your forks, tell people about them, watch the
conversation around them. They might change your mind.
Watch out for the warm fuzzies triggered by the word 'reuse'. A world of reuse
is a world of promiscuity, with pieces of code connecting up wantonly with
each other. Division of labor is a relationship not to be gotten into lightly.
It requires knowing what guarantees you need, and what guarantees the
counterparty provides. And you can't know what guarantees you need from a subsystem you don't understand.
There's a prisoner's dilemma
here: libraries that over-promise will seem to get popular faster. But hold
firm; these fashions are short-term. Build something that people will use long
after Cucumber has been replaced with Zucchini.
To library users
Expect less. Know what libraries you rely on most, and take ownership for
them. Take the trouble to understand how they work. Start pushing on their
authors to make them easier to understand. Be more willing to hack on
libraries to solve your own problems, even if it risks creating forks. If your
solutions are not easily accepted upstream, don't be afraid to publish them
yourselves. Just set expectations appropriately. If a library is too much
trouble to understand, seek alternatives. Things you don't understand are the
source of all technical debt. Try to build your own, for just the
use-cases you care about. You might end up with something much simpler to
maintain, something that fits better in your head.
notes
[1] And trying to distinguish between
'abstraction' and 'service' turned out to obfuscate more than it clarified, so
I'm going to avoid those words.
[2] Perhaps we need a different name for
immature libraries (which are now the vast majority of all libraries). That
allows users to set expectations about the level of churn in the interface,
and frees up library writers to correct earlier missteps. Not enough people
leave time for gestating interfaces, perhaps in analogy with how
not enough people leave enough time for debugging.
* *
Nov 24, 2012
Comments in code: the more we write, the less we want to highlight
That's my immediate reaction watching these programmers argue
about what color their comments should be when reading code. It seems those
who write sparse comments want them to pop out of the screen, and those who
comment more heavily like to provide a background hum of human commentary
that's useful to read in certain contexts and otherwise easy to filter out.
Now that I think about it, this matches my experience. I've experienced good
codebases commented both sparsely and heavily. The longer I spend with a
sparsely-commented codebase, the more I cling to the comments it does have.
They act as landmarks, concise reminders of invariants. However, as I grow
familiar with a heavily-commented codebase I tend to skip past the comments.
Code is non-linear and can be read in lots of ways, with lots of different
questions in mind. Inevitably, narrative comments only answer some of those
questions and are a drag the rest of the time.
I'm reminded of one of Lincoln's famous quotes,
a foreshadowing of the CAP theorem:
Comments can be either detailed or salient, never both.
Comments are versatile. Perhaps we need two kinds of comments that can be
colored differently. Are there still other uses for them?
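For example, here's a sketch using an entirely hypothetical convention: '##'
marks the rare salient landmark, plain '#' marks the background hum, and an
editor could color the two differently.

    Entry = Struct.new(:timestamp, :text)

    ## INVARIANT: entries stays sorted by timestamp. The binary search
    ## below silently returns wrong answers if this ever breaks.
    def add(entries, entry)
      # Find the first entry newer than the one being inserted;
      # bsearch_index returns nil when there is none, so append then.
      idx = entries.bsearch_index { |e| e.timestamp > entry.timestamp } || entries.size
      entries.insert(idx, entry)
    end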
* *
Nov 12, 2012
Software libraries suck
Here's why, in a sentence: they promise to be abstractions, but they
end up becoming services. An abstraction frees you from thinking
about its internals every time you use it. A service allows you to never learn
its internals. A service is not an abstraction. It isn't 'abstracting' away
the details. Somebody else is thinking about the details so you can remain
ignorant.
Programmers manage abstraction boundaries, that's our stock in trade. Managing
them requires bouncing around on both sides of them. If you restrict yourself
to one side of an abstraction, you're limiting your growth as a programmer.[1]
You're chopping off your strength and potential, one lock of hair at a time,
and sacrificing it on the altar of convenience.
A library you're ignorant of is a risk you're exposed to, a now-quiet frontier
that may suddenly face assault from some bug when you're on a deadline and can
least afford the distraction. Better to take a day or week now, when things
are quiet, to save that hour of life-shortening stress when it really
matters.
You don't have to give up the libraries you currently rely on. You just have
to take the effort to enumerate them, locate them on your system, install the
sources if necessary, and take ownership the next time your program dies
within them, or uncovers a bug in them. Are these activities more
time-consuming than not doing them? Of course. Consider them a long-term
investment.
Just enumerating all the libraries you rely on others to provide can be
eye-opening. Tot up all the open bugs in their trackers and you have a sense
of your exposure to risks outside your control. In fact, forget the whole
system. Just start with your Gemfile
or node_modules.
They're probably lowest-maturity and therefore highest risk.
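Here's a minimal sketch of that first step, assuming a stock Bundler setup
where every gem you transitively depend on shows up as a four-space-indented
line in Gemfile.lock:

    # Enumerate every library this app pulls in, straight from Gemfile.lock.
    gems = File.readlines("Gemfile.lock").
             grep(/^    [a-z]/i).           # top-level spec lines only
             map { |line| line.strip[/\S+/] }.
             uniq.sort
    puts gems
    puts "#{gems.size} libraries you're trusting someone else to understand"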
Once you assess the amount of effort that should be going into each library
you use, you may well wonder if all those libraries are worth the effort. And
that's a useful insight as well. “Achievement unlocked: I've stopped
adding dependencies willy-nilly.”
Update: Check out the
sequel. Particularly
if this post left you scratching your head about what I could possibly be
going on about.
(This birth was midwifed by conversations with Ross Angle, Dan Grover, and Manuel
Simoni.)
notes
[1] If you don't identify as a programmer, if that
isn't your core strength, if you just program now and then because it's
expedient, then treating libraries as services may make more sense. If a major
issue pops up you'll need to find more expert help, but you knew that already.
* *
Aug 1, 2012
Marx's engines of plenty
From "Red Plenty" by Francis Spufford:
The problem was that Marx had predicted the wrong revolution. He had said that
socialism would come, not in backward agricultural Russia, but in the most
developed and advanced industrial countries. Capitalism (he'd argued) created
misery, but it also created progress, and the revolution that was going to
liberate mankind from misery would only happen once capitalism had contributed
all the progress that it could, and all the misery too. At that point the
infrastructure for producing things would have attained a state of
near-perfection. At the same time, the search for higher profits would have
driven the wages of the working class down to near-destitution. It would be a
world of wonderful machines and ragged humans. When the contradiction became
unbearable, the workers would act. And paradise would quickly lie within their
grasp, because Marx expected that the victorious socialists of the future
would be able to pick up the whole completed apparatus of capitalism and carry
it forward into the new society, still humming, still prodigally producing.
There might be a need for a brief period of decisive government during the
transition to the new world of plenty, but the 'dictatorship of the
proletariat' Marx imagined was modelled on the 'dictatorships' of ancient
Rome, when the republic would now and again draft some respected citizen to
give orders in an emergency. The dictatorship of Cincinnatus lasted one day;
then, having extracted the Roman army from the mess it was in, he went back to
his plough. The dictatorship of the proletariat would presumably last a little
longer, perhaps a few years. And of course there would also be an opportunity
to improve on the sleek technology inherited from capitalism, now that society
as a whole was pulling the levers of the engines of plenty. But it wouldn't
take long. There'd be no need to build up productive capacity for the new
world. Capitalism had already done that. Very soon it would no longer be
necessary even to share out the rewards of work in proportion to how much work
people did. All the 'springs of co-operative wealth' would flow abundantly,
and anyone could have anything, or be anything. No wonder that Marx's pictures
of the society to come were so vague: it was going to be an idyll, a rather
soft-focus gentlemanly idyll, in which the inherited production lines whirring
away in the background allowed the humans in the foreground to play, 'to hunt
in the morning, fish in the afternoon, rear cattle in the evening, criticise
after dinner, just as I have a mind…'
None of this was of the slightest use to the Marxists trying to run the
economy of Russia after 1917. Not only had capitalist development not reached
its climax of perfection and desperation in Russia; it had barely even begun.
Russia had fewer railroads, fewer roads and less electricity than any other
European power. Within living memory, the large majority of the population had
been slaves. It became inescapably clear that, in Russia, socialism was going
to have to do what Marx had never expected, and to carry out the task of
development he'd seen as belonging strictly to capitalism. Socialism would
have to mimic capitalism's ability to run an industrial revolution, to marshal
investment, to build modern life.
But how?
There was in fact an international debate in the 1920s, partly prompted by the
Bolsheviks' strange situation, over whether a state-run economy could really
find substitutes for all of capitalism's working parts. No, said the Austrian
economist Ludwig von Mises, it could not: in particular, it couldn't replace
markets, and the market prices that made it possible to tell whether it was
advantageous to produce any particular thing. Yes, it could, replied a
gradually expanding group of socialist economists. A market was only a
mathematical device for allocating goods to the highest bidder, and so a
socialist state could easily equip itself with a replica marketplace, reduced
entirely to maths. For a long time, the 'market socialists' were judged to
have won the argument. The Bolsheviks, however, paid very little attention.
Marx had not thought markets were very important — as far as he was
concerned market prices just reflected the labour that had gone into products,
plus some meaningless statistical fuzz. And the Bolsheviks were mining Marx's
analysis of capitalism for hints to follow. They were not assembling an
elegant mathematical version of capitalism. They were building a brutish,
pragmatic simulacrum of what Marx and Engels had seen in the boom towns of the
mid-nineteenth century, in Manchester when its sky was dark at noon with coal
smoke. And they didn't easily do debate, either. In their hands, Marx's
temporary Roman-style dictatorship had become permanent rule by the Party
itself, never to be challenged, never to be questioned. There had been
supposed to be a space preserved inside the Party for experiment and
policy-making, but the police methods used on the rest of Russian society
crept inexorably inward. The space for safe talk shrank till, with Stalin's
victory over the last of his rivals, it closed altogether, and the apparatus
of votes, committee reports and 'discussion journals' became purely
ceremonious, a kind of fetish of departed civilisation.
Until 1928, the Soviet Union was a mixed economy. Industry was in the hands of
the state but tailors' shops and private cafes were still open, and farms
still belonged to the peasant families who'd received them when the Bolsheviks
broke up the great estates. Investment for industry, therefore, had to come
the slow way, by taxing the farmers; meanwhile the farmers' incomes made them
dangerously independent, and food prices bounced disconcertingly up and down.
Collectivisation saw to all these problems at once. It killed several
million more people in the short term, and permanently dislocated the Soviet
food supply; but forcing the whole country population into collective farms
let the central government set the purchase price paid for crops, and so let
it take as large a surplus for investment as it liked. In effect, all but a
fraction of the proceeds of farming became suddenly available for industry.
Between them, these policies created a society that was utterly hierarchical.
Metaphysically speaking, Russian workers owned the entire economy, with the
Party acting as their proxy. But in practice, from 8.30 a.m. on Monday morning
until 6 p.m. on Saturday night, they were expected simply to obey. At the very
bottom of the heap came the prisoner-labourers of the Gulag. Stalin appears to
have believed that, since according to Marx all value was created by labour,
slave labour was a tremendous bargain. You got all that value, all that Arctic
nickel mined and timber cut and rail track laid, for no wages, just a little
millet soup. Then came the collective farmers, in theory free, effectively
returned to the serfdom of their grandfathers. A decisive step above them, in
turn, came the swelling army of factory workers, almost all recent escapees or
refugees from the land. It was not an easy existence. Discipline at work was
enforced through the criminal code. Arrive late three times in a row, and you
were a 'saboteur'. Sentence: ten years. But from the factory workers on up,
this was also a society in a state of very high mobility, with fairytale-rapid
rises. You could start as a semi-literate rural apparatchik, be the mayor of a
city at twenty-five, a minister of the state at thirty; and then, if you were
unlucky or maladroit, a corpse at thirty-two, or maybe a prisoner in the
nickel mines, having slid from the top of the Soviet ladder right back down
its longest snake. But mishaps apart, life was pretty good at the top, with a
dacha in the country, from whose verandah the favoured citizen could survey
the new world growing down below.
And it did grow. Market economies, so far as they were 'designed' at all, were
designed to match buyers and sellers. They grew, but only because the sellers
might decide, from the eagerness of the buyers, to make a little more of what
they were selling. Growth wasn't intrinsic. The planned economy, on the other
hand, was explicitly and deliberately a ratchet, designed to effect a one-way
passage from scarcity to plenty by stepping up output each year, every year,
year after year. Nothing else mattered: not profit, not accidents, not the
effect of the factories on the land or the air. The planned economy measured
its success in terms of the amount of physical things it produced.
Money was treated as secondary, merely a tool for accounting. Indeed, there
was a philosophical issue involved here, a point on which it was important for
Soviet planners to feel that they were keeping faith with Marx, even if in
almost every other respect their post-revolutionary world parted company with
his. Theirs was a system that generated use-values rather than
exchange-values, tangible human benefits rather than the marketplace delusion
of value turned independent and imperious. For a society to produce less than
it could, because people could not 'afford' the extra production, was
ridiculous. Instead of calculating Gross Domestic Product, the sum of all
incomes earned, the USSR calculated Net Material Product, the country's total
output of stuff — expressed, for convenience, in roubles.
This made it difficult to compare Soviet growth with growth elsewhere. After
the Second World War, when the numbers coming out of the Soviet Union started
to become more and more worryingly radiant, it became a major preoccupation of
the newly-formed CIA to try to translate the official Soviet figures from NMP
to GDP, discounting for propaganda, guessing at suitable weighting for the
value of products in the Soviet environment, subtracting items
'double-counted' in the NMP, like the steel that appeared there once in its
naked new-forged self, twice when panel-beaten into an automobile. The CIA
figures were always lower than the glowing stats from Moscow. Yet they were
still worrying enough to cause heart-searching among Western governments, and
anxious editorialising in Western newspapers. For a while, in the late 1950s
and the early 1960s, people in the West felt the same mesmerising disquiet
over Soviet growth they were going to feel for Japanese growth in the 1970s
and 1980s, and for Chinese and Indian growth from the 1990s on. Nor were they
being deceived. Beneath several layers of varnish, the phenomenon was real.
Since the fall of the Soviet Union, historians from both Russia and the West
have recalculated the Soviet growth record one more time: and even using the
most pessimistic of these newest estimates, all lower again than the Kremlin's
numbers and the CIA's, the Soviet Union still shows up as growing faster than
any country in the world except Japan. Officially it grew 10.1% a year;
according to the CIA it grew 7% a year; now the estimates range upward from 5%
a year. Still enough to squeak past West Germany, and to cruise past the US
average of around 3.3%.
On the strength of this performance, Stalin's successors set about civilising
their savage growth machine. Most of the prisoners were released from the
labour camps. Collective farmers were allowed to earn incomes visible without
a microscope, and eventually given old-age pensions. Workers' wages were
raised, and the salaries of the elite were capped, creating a much more
egalitarian spread of income. The stick of terror driving managers was
discarded too; reporting a bad year's growth now meant only a lousy bonus. The
work day shrank to eight hours, the work week to five days. The millions of
families squeezed into juddering tsarist tenements were finally housed in
brand-new suburbs. It was clear that another wave of investment was going to
be needed, bigger if anything than the one before, to build the next
generation of industries: plastics, artificial fibers, the just-emerging
technologies of information. But it all seemed to be affordable now. The
Soviet Union could give its populace some jam today, and reinvest for
tomorrow, and pay the weapons bill of a superpower, all at once. The Party
could even afford to experiment with a little gingerly discussion; a little
closely-monitored blowing of the dust off the abandoned mechanisms for talking
about aims and objectives, priorities and possibilities.
And this was fortunate, because as it happened the board of USSR Inc. was in
need of some expert advice. The growth figures were marvellous, amazing,
outstanding — but there was something faintly disturbing about them,
even in their rosiest versions. For each extra unit of output it gained, the
Soviet Union was far more dependent than other countries on throwing in extra
inputs: extra labour, extra raw materials, extra investment. This kind of
'extensive' growth (as opposed to the 'intensive' growth of rising
productivity) came with built-in limits, and the Soviet economy was already
nearing them. Whisper it quietly, but the capital productivity of the USSR was
a disgrace. With a government that could choose what money meant, the Soviet
Union already got less return for its investments than any of its capitalist
rivals. Between 1950 and 1960, for instance, it had sunk 9.4% of extra capital
a year into the economy, to earn only 5.8% a year more actual production. In
effect, they were spraying Soviet industry with the money they had so
painfully extracted from the populace, wasting more than a third of it in the
process.
* *
Jul 4, 2012
Homesteading
“In the enthusiasm of our rapid mechanical
conquests we have overlooked some things. We have perhaps driven men into the
service of the machine, instead of building machinery for the service of man.
But could anything be more natural? So long as we were engaged in conquest,
our spirit was the spirit of conquerors. The time has now come when we must be
colonists, must make this house habitable which is still without
character.”
* *
Jun 18, 2012
How to use a profiler
All of us programmers have at some point tried to speed up a large program.
We remember "measure before optimizing" and profile it, and end up (a few
hours later) with something intimidating like this and... what
next? If you're like me, you scratch your head at the prospect of optimizing
StringAppend, and the call-graph seems to tell you what
you already know: Your program spends most of its time in the main loop,
divided between the main subtasks.
I used to imagine the optimization process like this:
1. Run a profiler
2. Select a hot spot
3. ...
4. Speedup!
But the details were hazy. Especially in step 3. Michael Abrash was clearly
doing a lot more than this. What was it?
Worse, I kept forgetting to use the profiler. I'd have a split-second idea and
blunder around for hours before remembering the wisdom of "measure before
optimizing." I was forgetting to measure because I was getting so little out
of it, because I'd never learned to do it right.
After a lot of trial and error in the last few months, I think I have a
better sense of the process. Optimization is like science. You can't start
with experiments. You have to start with a hypothesis. "My program is spending
too much time in _." Fill in the blanks, then translate the sentence into a
prediction the profiler can check: "I expect to see more time spent in
function A than B." Then run the profiler and check your results. Skip the
low-level stuff, look for just A and
B in the cumulative charts. Which takes more time? Is one much more of a
bottleneck? Keep an eye out for a peer function that you hadn't considered,
something that's a sibling of A and B in the call-graph,
about the same level of granularity.
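For instance, a sketch using Ruby's stdlib Benchmark, with parse and render
standing in for your own A and B:

    require 'benchmark'

    # Stand-ins for the real A and B under suspicion.
    def parse(s)
      s.split(",").map(&:to_i)
    end

    def render(a)
      a.map(&:to_s).join(",")
    end

    sample = (1..1000).to_a.join(",")
    parsed = parse(sample)

    # Hypothesis: "my program spends more time parsing than rendering."
    Benchmark.bm(8) do |b|
      b.report("parse")  { 1_000.times { parse(sample) } }
      b.report("render") { 1_000.times { render(parsed) } }
    end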
Do this enough times and you gain an intuition of what your program is doing,
and where it's spending its time.
When you do find a function at a certain level of granularity that seems to be
taking too long, it's time to focus on what it does and how it works. This is
what people mean when they say, "look for a better algorithm." Can the data
structures be better organized from the perspective of this function? Is it
being called needlessly? Can we prevent it being called too often? Can we
specialize a simpler variant for 90% of the calls?
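That last question is often the cheapest win. A contrived sketch: if 90% of
calls repeat the same few arguments, a cache in front of the expensive
general routine counts as a "better algorithm":

    # The general routine: always correct, always expensive.
    def route(from, to)
      sleep 0.01    # stand-in for real work
      [from, to]    # stand-in for a real answer
    end

    # The specialized variant: most calls repeat the same few queries,
    # so remember their answers instead of recomputing them.
    CACHE = {}
    def cached_route(from, to)
      CACHE[[from, to]] ||= route(from, to)
    end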
If none of that generates any ideas, then it's time to bite the
bullet and drop down to a lower level.
But remember: optimization is about understanding your program. Begin there,
and profiling and other tools will come to hand more naturally.
* *
Mar 14, 2012
The opposite of exponential backoff
Every few days I'm on IM with some coworker from a different building. The
conversation goes something like this:
Them: Lunch?
(6 minutes)
Me: Sounds good! Meet downstairs?
(1 minute)
Me: Now?
(2 minutes)
Them: Yes!
(1 minute)
Them: You there?
(3 minutes)
Me: Oops, sorry. Heading down now.
Now I never know whether to wait for a response, or to run down because I'm
already late. But today I finally figured out the answer:
Me: New rule. Head down when both of us say 'ready' within 1 minute of each
other.
That's it. No more ambiguity. It's kind of a silly example, but the general
idea feels deep, complementary somehow to exponential
backoff. Exponential backoff is the ideal game-theoretic strategy for
competitive situations where two people need to contend for a common resource,
like trying to call someone back after a dropped call and getting a busy
signal. Both try to diverge away from a network-defined window of conflict by
waiting longer and longer. Here both parties are trying to converge into an
agreed window. Defining the size of the window by diktat should be
handy anytime two parties (people, computers, ...) need to cooperate
synchronously atop an asynchronous channel.
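Here's a sketch of the rule, assuming each side records the time of its
latest 'ready':

    WINDOW = 60  # seconds; the window size, set by diktat

    # Head down only when both 'ready's landed within WINDOW of each other.
    def rendezvous?(mine, theirs)
      return false if mine.nil? || theirs.nil?
      (mine - theirs).abs <= WINDOW
    end

    rendezvous?(Time.now, Time.now - 30)   #=> true: head downstairs
    rendezvous?(Time.now, Time.now - 300)  #=> false: ping again first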
I'm sure there's a paper on this idea, perhaps something like Lamport's paper
on Buridan's ass. If you have a pointer I'd love to hear about it.
* *
Jul 14, 2011
Evolution of a rails programmer
Idiomatic rails action for registering a user if he doesn't exist:
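Something along these lines (a representative sketch, assuming a User model
keyed by name; not the exact snippet):

    # Sketch of the usual nested-conditional shape.
    def register
      @user = User.find_by_name(params[:name])
      if @user
        render :json => @user
      else
        @user = User.new(:name => params[:name])
        if @user.save
          render :json => @user
        else
          render :json => {:error => @user.errors}
        end
      end
    end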
After a year of programming in lisp, I find it most natural to write:
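Again a sketch of the shape rather than the exact code:

    # Pick an existing user or create a new one; bail early on failure.
    def register
      @user = User.find_by_name(params[:name]) || User.new(:name => params[:name])
      return render(:json => {:error => @user.errors}) unless @user.save
      render :json => @user
    end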
Is this overly concise/obfuscated? I like it because it concisely expresses
the error case as an early exit; most of the space is devoted to the
successful save, which is straight-line code without distracting branches.
It's clearer that we either pick an existing user or create a new one. Form
follows function.
* *
May 12, 2011
My latest project
Two of us have been building
hackerstream, a real-time UI for Hacker News that a dozen addicts have
been using since March. This
won't matter to you if you don't frequent HN, or even if you swing by just
once a day. But try it out if you go to HN every couple of hours like I do.
You'll see comments on all the stories on the entire HN frontpage as they
stream in. You can slice and dice the stream by story or by author. You can
even set it up to highlight, say, all comments by security guy Thomas Ptacek and
all comments about today's Silicon Valley
brouhaha. If you use Twitter the UI will seem familiar.
Is such a firehose useful? Is it too much of a good thing? I find I see more
of HN for the same time investment, and I waste less time scanning stories
I've already read. What's more, when I started using it I found my comments
getting more votes and more responses. It turns out the biggest factor
affecting responses is not how good my comment is but just how early it shows
up on a story. By biasing my reading to be more timely I was giving my few
comments improved odds of a response.
It's not for everyone, but hackerstream is geared to
help you have higher-quality conversation on Hacker News. Try it out and tell me what you think.
* *
Mar 4, 2011
Backup theory for startups
You run a startup. Your company has been given data by users, data that it
would be embarrassing to lose. You make backups. You're aware of the
best-practices:
1. Make backups.
2. Make backups automatic. Or they won't happen.
3. Make backups yourself. Outsourcing them is for suckers.
4. Regularly restore from your backups. A backup doesn't exist if it's never read.
Incomplete
This paranoia is useful, but it's intended for personal data. For business
data it's incomplete. Businesses have automation. Lots of automation. Dumb
automation that can do stupid things, like corrupting data. Businesses also
develop nooks and crannies where important data may go unread for periods of
time. The combination of automation and dusty corners can cause
insidious data loss: some obscure-yet-critical corner of your data
gets deleted or corrupted, and you don't notice until the corruption has
infected the backup copy.
Stale is good
You can guard against catastrophic data loss with just a regular backup at a
remote location. Insidious loss is harder to guard against. Insidious data
loss is the reason companies have more than one backup, the reason journalled
file systems have multiple .snapshot directories, the reason
Slicehost and Linode provide a daily and a weekly backup. Daily/weekly is
perhaps the simplest backup cascade. It gives you a week to detect
corrupted data in your server. As operations get complex you'll want longer
cascades with more levels. The lower levels back up frequently to capture
recent changes, and higher levels back up less frequently as a cushion to
detect data corruption.
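The mechanics can stay simple. Here's a sketch, assuming dated dump files
named like db-2011-03-04.sql in one directory, with one retention rule per
level of the cascade:

    require 'date'
    require 'fileutils'

    # Keep dailies for a week, weeklies (Mondays) for five weeks,
    # monthlies (the 1st) for a year; prune everything else.
    Dir.glob("/var/backups/db/db-*.sql").each do |path|
      date = Date.parse(path[/\d{4}-\d{2}-\d{2}/])
      age  = Date.today - date
      keep = age <= 7 ||                        # daily level
             (date.monday? && age <= 7 * 5) ||  # weekly level
             (date.day == 1 && age <= 366)      # monthly level
      FileUtils.rm(path) unless keep
    end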
More best-practices
5. Make cascading backups.
6. Make sure you can detect corruption to anything important before it
reaches the highest level of your cascade. The 'height' of your cascade
defines how far back in time you can undo damage.
Failure modes
Whatever your backup strategy, ask yourself what it can't handle. I can think
of two scenarios cascades can't handle: extremely insidious corruption that
doesn't get detected in time, and short-lived data. If many records in your
database get deleted every day, most of them may never make it into a weekly
backup.
What else?