Evolutionary Biology and Software Engineering

Tue Nov 23, 2010

For my first blog post, I’ve decided to cheat. About a year ago, I wrote an article, ACIS as an Ecosystem, for our company newsletter. In this article, I presented the idea that “the functions within a large commercial software package form an ecosystem, in the technical sense of a collection of evolving actors which interact among themselves.” At the time, I wanted to go into more detail about the evolutionary theory behind this statement (because it’s both rigorous and really cool), but was limited by space. This blog entry allows me to indulge myself by expanding on that article.

The story begins in Auckland, New Zealand in 1995, where my girlfriend and I were stranded when our luggage (and her passport) were stolen after a physics conference. I was reading the chapter of The Selfish Gene where Richard Dawkins introduces the concept of memes. In that chapter, he describes the requirements for a set of actors to form an evolutionary system:

  1. There must be a system for duplicating actors.
  2. The duplication must not have 100% fidelity, i.e., it must be “duplication with errors.”
  3. There is some form of selection pressure on the actors.

This idea of abstract evolutionary systems blew my socks off, and it still does today. Dawkins used it to point out that ideas and ideologies (memes) evolve, but it can also be used to understand evolutionary systems as incredibly powerful (and counter-intuitive) optimization engines.
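To make the "optimization engine" point concrete, here is a minimal sketch in Python of an abstract evolutionary system built from nothing but the three requirements above. Everything in it is my own illustrative assumption rather than anything from Dawkins: the actors are bit strings, the hypothetical `evolve` function and its parameters are made up for the example, and "fitness" is simply the number of bits that match an arbitrary target.

```python
import random


def evolve(target, generations=200, population_size=20, mutation_rate=0.1, seed=0):
    """Toy evolutionary system: actors are bit lists scored against `target`.

    Returns the fittest actor found and the best fitness seen in each generation.
    All names and parameters here are hypothetical, chosen for illustration.
    """
    rng = random.Random(seed)

    def fitness(actor):
        # Number of positions where the actor agrees with the target.
        return sum(a == t for a, t in zip(actor, target))

    # Start from a random population of actors.
    population = [[rng.randint(0, 1) for _ in target] for _ in range(population_size)]
    history = []

    for _ in range(generations):
        # Requirement 3: selection pressure. Only the fitter half survives.
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        survivors = population[: population_size // 2]

        # Requirements 1 and 2: duplication, with errors. Each survivor is
        # copied, and each bit of the copy flips with probability mutation_rate.
        offspring = [
            [bit ^ 1 if rng.random() < mutation_rate else bit for bit in parent]
            for parent in survivors
        ]
        population = survivors + offspring

    return max(population, key=fitness), history


if __name__ == "__main__":
    target = [1, 0] * 16
    best, history = evolve(target)
    print("best fitness per generation (first 10):", history[:10])
```

Nothing in the loop knows how to "solve" for the target; it only copies imperfectly and discards the losers. Because the survivors are carried forward unchanged, the best fitness in this particular sketch can never go down from one generation to the next.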

This is relevant to 3D development because the “functions” (actually, probably a smaller unit) in the source code control system of a large application obey the rules for an evolutionary system:

  1. A typical, morally just way for a function to be duplicated is for it to be edited for a bugfix or enhancement. The new version of the function is a copy with less than 100% fidelity with respect to the original. A less morally just (but probably equally common :) way for a function to be duplicated is through cut-and-paste programming.
  2. Functions are usually changed when they’re copied.
  3. Bugfixes and enhancements provide selection pressure; old versions of functions are removed from the system and new ones introduced according to the pressures.

The second part of the story is described in more detail in the newsletter article. Basically, the inputs which are provided to a function are the outputs of other functions; the functions form an interrelated web. The reason that this is important is that the environment in which a function must operate is provided by the other functions in the application. If Function B is not robust against divide-by-zeros, and a change in Function A causes it to start supplying zero-values to B, then B’s environment has changed in a way that is likely to cause it to fail, which in turn will cause the A-B system to feel selection pressure when the bug reports start rolling in.
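The Function A / Function B scenario can be sketched in a few lines of Python. The function names and the averaging task below are hypothetical, invented only to illustrate the point: B itself never changes, but an apparently reasonable edit to A changes the environment B lives in.

```python
def function_a_v1(readings):
    # Original A: reports the sample count, but never less than 1.
    return max(len(readings), 1)


def function_a_v2(readings):
    # Edited A: reports the true sample count, which can now be zero.
    return len(readings)


def function_b(total, count):
    # B is not robust against divide-by-zero; it silently relies on A
    # never handing it a zero count.
    return total / count


if __name__ == "__main__":
    readings = []  # a low-probability input: no samples at all
    print(function_b(0.0, function_a_v1(readings)))  # B survives in the old environment
    try:
        function_b(0.0, function_a_v2(readings))
    except ZeroDivisionError:
        print("B fails once A starts supplying zeros")
```

In this sketch the bug report would land on B, even though the only change in the source code control system was to A. That is the A-B system feeling the selection pressure as a unit.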

This is exactly analogous to a biological ecosystem. In a biological ecosystem, there are two broad categories of environmental influence which impinge upon a biological organism:

  1. Physical environment, which consists of non-biological influences such as temperature, rainfall, gravity, etc.
  2. Biological environment, which consists of other biological organisms such as grass, tigers, bacteria, etc.

If we think of biological actors as analogous to functions, then the biological environment is analogous to the web of functions. The physical environment, on the other hand, is analogous (I think) to the manner in which the application is run, i.e. the end users and the entire system of business practices which turn end-user feedback into changes to the code. In software, it is the physical environment analogue which supplies the selection pressure to drive out bugs.

Why do I think this is important? Because I’m a firm believer in the principle that insight is important. As a physicist, I look for models that describe the physical world; it appears that the ecosystem model is a good one for describing large, widely deployed software systems.

Models are useful in two ways: they provide intuition and they provide analysis tools. For an intuition example, consider what happens when a penguin (adapted to cold, oceanic climates) is transported to the middle of the Sahara (hot desert). The penguin is hit with a huge set of stressors to which it is poorly adapted. The intuition here is that similar non-robustness can be expected when using software applications in new ways, such as porting operating systems to mobile devices. For an analysis example, consider the relative roles and merits of unit testing vs. design-by-contract methodologies for ensuring quality. I’m planning this for a future blog entry, so I’ll leave it as an exercise for the reader for now. I will say, however, that the ecosystem model indicates that low-probability configurations of inputs are important.
