Stateful Systems and the Fitness Floodplain: A Lament

A "fitness landscape" is a topographical metaphor for evolutionary success. A latent space described by genes or physical features, if you will, where fitness, or the suitability of a genotype or morphotype to its own physical environment, corresponds to elevation. Peaks therefore represent combinations of genes or features with high suitability, valleys those which struggle or even fail entirely. The crab, that success story of genetic and convergent evolution alike, occupies peaks; the panda, a valley.

Genotypes and morphotypes sit near to or far from each other on the fitness landscape according to their similarity across the dimensions of interest. Crabs and pandas are fairly far apart in most comparisons. One's a decapod crustacean scavenger, the other an ursid that traded in carnivory for a diet it can barely digest and reduced itself almost to sessility. Spiders generally wind up pretty close to crabs: close common ancestor, chitinous exoskeletons, minus a couple of limbs, plus a few eyes, book lungs instead of (usually) gills. Koalas land similarly close to pandas, with the fearsomely well-adapted cats staring down at both of them from the heights of the mammalian region.

Natural selection guides species upward on the fitness landscape, toward greater fitness -- ideally. Sometimes a dominant mutation takes more than it gives or overspecializes, and sends the species toward a fitness valley instead. Conversely, if a species not too well fitted for its surroundings is able to migrate physically into a new, more congenial area, it has also moved peakward in terms of fitness.

And like its denizens, the fitness landscape itself isn't static. It normally changes at a much slower rate than they do, which is what makes natural selection effective; sometimes, though, it does change much more rapidly. We call these times extinction events. All of a sudden, the environment oxygenates or deoxygenates or heats up or cools or acidifies or is overtaken by a species itself no longer subject to natural selection, which crowds out habitat and plows forage under. The fitness landscape quakes. Valleys are exalted, mountains and hills made low. Species highly fit for the old environment have likely specialized in ways that limit their tolerance for their new surroundings, and perish. Less specialized species fill the new gaps and, in the absence of fitter competitors or predators, have the opportunity to specialize themselves. The synapsid survivors of the Cretaceous-Paleogene extinction had previously been unable to challenge the dominant fitness of the dinosaurs; post-Chicxulub they thrived and ramified, finding varying levels of long-term success. We're here, cats are here, pandas have found a more suitable habitat in zoos run by a species that thinks they're cute, and koalas are still clinging to their last vestiges of independent suitability, but how many megatheria have you seen lately?

Anyway, software. Evolution itself is a favorite metaphor. Software systems evolve, gaining and losing features and functionality across versions or releases, maybe slowly, maybe quickly. However, the evolution of software has a very different guiding principle from that of species. Where mutation sends individuals in random directions on the fitness landscape and natural selection eventually winnows out those who head downward, software projects are instead built by people intent that the system's overall fitness¹ should always increase. Developers undertake cautious and considered traversals of the fitness landscape, targeting higher and higher peaks while minimizing downward travel.

It'd be nice if each rise led directly to the next in an unbroken line, wouldn't it?

Unfortunately, just as real mountains are separated by valleys, so too are peaks on the fitness landscape interspersed among lower terrain. In order to reach the next peak, to satisfy the next user need, to improve the fitness and therefore the odds of success of your software, you must, more often than not, travel through a valley: refactoring code for reusability, reorganizing data structures to accommodate future extension, unwinding assumptions that no longer hold and deprecating the affordances that depended on them, all the yak-shaving that inevitably accretes as a software system matures, its youthful flexibility ossifies, and the technological environment and market continue to change around it.

Iterative development processes work to keep valley crossings as short and as shallow as possible by introducing feedback loops, both at the build-and-test level and on longer cycles with regular or continuous delivery and frequent user input. The "fail fast" doctrine encourages performing rapid searches in many directions to rule out routes that pull downward, in effect bringing a kind of "natural" selection back into the picture to cull the less fit mutations. Prototyping and spiking even build valley-traversal into software development explicitly, on grounds that seeing the view from the next peak quickly is worth having to make a second trip up from base camp -- and that second climb might even be to a different, higher peak that only became visible from the first.

In stateless systems, this is all manageable, or at least as manageable as the codebase and its interface stability requirements allow. The opportunity cost of moving to the next peak isn't nothing, but whatever holds us back can often be abandoned as long as we can continue to satisfy user needs without it and nobody else depends on an interface we publish. It's also relatively easy to cut bait and backtrack if we suspect we're crossing an unacceptably deep valley or have climbed a local maximum that could endanger our long-term success. We can afford a fairly naive search of the landscape, using loose-coupling techniques to dodge the riskiest valleys and planning one ascent at a time, because if we find ourselves heading in an unpromising direction, we're out some time and have to throw out some code, but no more than that. The experience may even have taught us new things that we can immediately put to good use on the next ascent.
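For the literal-minded, the naive search can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the one-dimensional `landscape` list, its values, and the `greedy_climb` helper are invented, and real fitness has far more dimensions than one. The sketch only shows a climber that refuses to go downhill getting stuck on a lesser peak, and how cheap it is to walk away and start again when nothing has to be carried along.

```python
# Hypothetical one-dimensional fitness landscape:
# the index is a position, the value is the fitness at that position.
landscape = [1, 3, 5, 4, 2, 3, 6, 9, 7]


def greedy_climb(heights, start):
    """Step to the higher neighbour until neither neighbour is higher."""
    pos = start
    while True:
        neighbours = [p for p in (pos - 1, pos + 1) if 0 <= p < len(heights)]
        best = max(neighbours, key=lambda p: heights[p])
        if heights[best] <= heights[pos]:
            return pos  # no uphill step left: a peak, though maybe a local one
        pos = best


# Starting at the left edge, the climber stops on the nearer, lower peak.
local_peak = greedy_climb(landscape, 0)

# Backtracking is just a restart from somewhere past the valley.
higher_peak = greedy_climb(landscape, 5)

print(local_peak, landscape[local_peak])    # 2 5
print(higher_peak, landscape[higher_peak])  # 7 9
```

The restart costs nothing but the time already spent, which is the luxury the stateless case affords.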

Stateful systems, meanwhile, must reckon not only with the ordinary inertia of design and code, and almost always with some set of commitments to interface stability in the form of an API or a data dictionary, but also with masses of stored information. That information resists change in ways that cannot be circumvented when migrating to an improved data model, and it may even be incompatible with the new model's constraints and expectations. Throwing out code and going back to the drawing board on the design of some subsystem may not be fun. Throwing out specifications and revising interface contracts is politically fraught at best, but can be done. Throwing out real, useful data just because it happens to omit newly important properties or irreversibly collapses a distinction revealed to be crucial going forward is out of the question.
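A collapsed distinction is easier to see than to describe, so here is a minimal sketch in Python. The record shape, the field names (`phone_home`, `phone_work`, `phone`), and the `migrate_v1_to_v2` helper are all hypothetical, invented for illustration. The point is only that two distinct old records come out of the migration looking the same, so no inverse migration can exist.

```python
# Hypothetical v1 records: home and work numbers are stored separately.
v1_records = [
    {"id": 1, "phone_home": "555-0100", "phone_work": None},
    {"id": 2, "phone_home": None, "phone_work": "555-0100"},
]


def migrate_v1_to_v2(record):
    """Collapse the two v1 phone fields into a single v2 `phone` field.

    This is the lossy step: which kind of number it was is not kept.
    """
    phone = record["phone_home"] or record["phone_work"]
    return {"id": record["id"], "phone": phone}


def without_id(record):
    """Compare records by content only, ignoring their identifiers."""
    return {k: v for k, v in record.items() if k != "id"}


v2_records = [migrate_v1_to_v2(r) for r in v1_records]

# The originals differ in content; the migrated records no longer do.
originals_differ = without_id(v1_records[0]) != without_id(v1_records[1])
migrated_differ = without_id(v2_records[0]) != without_id(v2_records[1])

print(originals_differ)  # True
print(migrated_differ)   # False
```

If a later requirement needs to know which numbers are work numbers, nothing in the v2 data can answer it. The only remedies are to have kept the v1 data around or to go and ask everyone again.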

Iteration remains one of our most helpful methodological tools for its emphasis on controlled traversal and regular fitness checkpoints, but when even seemingly minor decisions can be impossible to reverse, a practice focused on iterating over more-but-smaller decisions exists in tension with the nature of the work. It's not merely that, in order to prepare for the next ascent, the system's maintainers have to haul the whole thing down into a valley instead of leaving the dead weight of outdated models and specifications behind. The valley is also flooded, a tide of precious and obstinate information washing across the landscape, pulling our careful descent off-course, blocking our access to certain peaks, and obscuring the terrain below.

Where are our boats?

1: How is software fitness determined? Satisfaction of explicit user needs; ease of use; reliability; consistency; conceptual simplicity for user and for developer. Really, it's all user needs, one way or another: for a software system to satisfy the explicit needs, it implicitly must also be convenient, reliable, consistent, and not more complicated than the problem. In the absence of an agreed "done" state, it must also admit further modification. The success of "worse is better" only shows that simplicity for developers beats consistency (in the sense of fidelity to a complex problem space) in a fair fight, or, if you're on the side of "better", that you don't get a fair fight.