Mutation-resistant code

I heard an interesting lecture today by Richard Bonneau about his work using machine learning to analyze genetic codes and also to understand interactions between the proteins they express. He said many fascinating things, and my head is still reeling from all of the exciting ideas and possibilities.

But one thing in particular struck me, from my perspective as somebody who programs computers. He talked about the rich web of interconnections between the different components that make our biology function at the molecular and cellular level.

The redundancies built into the interactions between these various components create a very robust system. One effect of this is that our genetic code is remarkably resistant to damage.

In other words, the great majority of gene mutations turn out to be harmless. Thinking of the cybernetic equivalent, imagine that your computer program was modified by somebody flipping random bits in its binary image.

How many bits would need to flip before the program stopped working? In most cases, even one bit would suffice to crash the program.

But in the biological equivalent, you can flip lots of bits and the whole thing continues to run just fine. Which led me to start thinking about the following hypothetical problem:

Could you design a computer program that would be resistant to random mutations? In particular, could you design it so that many bits in its binary image could be randomly flipped, and the program would continue to run?

We can define a program’s “robustness” as the number of bits you would need to flip, on average, before the program breaks. So how do you make a program robust?
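
To make that concrete, here’s a rough Python sketch of how you might estimate robustness empirically. The tiny stack-machine “program” and its still_works test are invented purely for illustration; a real experiment would flip bits in an actual compiled binary:

    import random

    # Toy "binary image": a little stack-machine program that computes 2*x + 3.
    # The opcodes are made up for this sketch: 0x01 = PUSH <byte>, 0x02 = ADD,
    # 0x03 = MUL, 0x04 = LOAD_INPUT, 0xFF = HALT.
    PROGRAM = bytes([0x04, 0x01, 2, 0x03, 0x01, 3, 0x02, 0xFF])

    def run(image, x):
        """Interpret the image; any malformed instruction raises an exception."""
        stack, pc = [], 0
        while True:
            op = image[pc]
            if op == 0x01:
                stack.append(image[pc + 1]); pc += 2
            elif op == 0x02:
                b, a = stack.pop(), stack.pop(); stack.append(a + b); pc += 1
            elif op == 0x03:
                b, a = stack.pop(), stack.pop(); stack.append(a * b); pc += 1
            elif op == 0x04:
                stack.append(x); pc += 1
            elif op == 0xFF:
                return stack.pop()
            else:
                raise ValueError("illegal opcode")

    def still_works(image):
        """The program 'works' if it still computes 2*x + 3 on a few test inputs."""
        try:
            return all(run(image, x) == 2 * x + 3 for x in (0, 1, 5, 10))
        except Exception:
            return False

    def flips_to_break(image):
        """Flip random bits (sampled with replacement) until the program breaks."""
        img, flips = bytearray(image), 0
        while still_works(img):
            bit = random.randrange(len(img) * 8)
            img[bit // 8] ^= 1 << (bit % 8)
            flips += 1
        return flips

    trials = [flips_to_break(PROGRAM) for _ in range(1000)]
    print("average bits flipped before breaking:", sum(trials) / len(trials))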

One problem is that it would be very easy to cheat. You could just make a million copies of the program, and then toss out whichever parts of any copy don’t match the corresponding parts of the other copies.
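
That cheat is essentially a repetition code: keep several identical copies, and reconstruct each bit by majority vote when you read the image back. A quick Python sketch, with a placeholder byte string standing in for the program:

    import random

    def replicate(image, copies=5):
        """The 'cheat': store several identical copies of the program image."""
        return [bytearray(image) for _ in range(copies)]

    def majority_vote(copies):
        """Reconstruct each bit as whatever value most of the copies agree on."""
        out = bytearray(len(copies[0]))
        for i in range(len(out)):
            for bit in range(8):
                ones = sum((c[i] >> bit) & 1 for c in copies)
                if ones * 2 > len(copies):
                    out[i] |= 1 << bit
        return bytes(out)

    original = bytes(b"the actual program bytes would go here")
    copies = replicate(original, copies=5)
    for _ in range(20):                       # flip 20 random bits across the copies
        c = random.choice(copies)
        bit = random.randrange(len(c) * 8)
        c[bit // 8] ^= 1 << (bit % 8)
    print(majority_vote(copies) == original)  # almost always True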

So let’s be more specific: Is there a way to minimize the size of a computer program for any given degree of desired robustness?

8 thoughts on “Mutation-resistant code”

  1. Neural networks can be very tolerant of errors. Unfortunately we don’t really have a good way to “design” them short of randomly evolving them.
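
    To illustrate that tolerance, here’s a toy Python sketch: a hand-wired 2-2-1 network that computes XOR, with Gaussian noise added to every weight. It’s nothing like a real trained net, but the degradation is gradual rather than all-or-nothing:

        import random

        def relu(x):
            return max(0.0, x)

        # A tiny hand-wired network that computes XOR exactly:
        #   h1 = relu(x1 + x2), h2 = relu(x1 + x2 - 1), y = h1 - 2*h2
        WEIGHTS = [1.0, 1.0, 0.0,    # hidden unit 1: w11, w12, bias
                   1.0, 1.0, -1.0,   # hidden unit 2: w21, w22, bias
                   1.0, -2.0]        # output weights: v1, v2

        def forward(w, x1, x2):
            h1 = relu(w[0] * x1 + w[1] * x2 + w[2])
            h2 = relu(w[3] * x1 + w[4] * x2 + w[5])
            return w[6] * h1 + w[7] * h2

        def still_computes_xor(w):
            # Threshold the output at 0.5, so small numeric drift is harmless.
            return all((forward(w, a, b) > 0.5) == bool(a ^ b)
                       for a in (0, 1) for b in (0, 1))

        def survival_rate(sigma, trials=1000):
            """Fraction of runs still computing XOR with N(0, sigma) noise on every weight."""
            ok = 0
            for _ in range(trials):
                noisy = [w + random.gauss(0, sigma) for w in WEIGHTS]
                ok += still_computes_xor(noisy)
            return ok / trials

        for sigma in (0.05, 0.1, 0.2, 0.4):
            print(sigma, survival_rate(sigma))   # falls off gradually as noise grows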

  2. Good point. If we replace “computer program” by “procedure that maps a given set of inputs to a given set of outputs”, I wonder whether neural nets can be near to optimal, in the sense that they minimize size for any given degree of desired robustness.

  3. Data storage systems like hard drives and SSDs are already operating this way. The bits you get back when you open a file aren’t what was actually stored on the medium, but instead a reconstruction of the most probable bits from a collection of many more redundant ones.

    Here’s a deep dive into all the error correction complexity in a simple SD card:
    http://joshuawise.com/projects/ndfslave
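
    As a toy version of that reconstruction: the classic Hamming(7,4) code spends 3 parity bits per 4 data bits and can correct any single flipped bit in each 7-bit block (real SSD controllers use much stronger codes such as BCH or LDPC, but the principle is the same). A Python sketch:

        def hamming74_encode(d):
            """Encode 4 data bits (0/1) into a 7-bit codeword: 3 parity bits per 4 data bits."""
            d1, d2, d3, d4 = d
            p1 = d1 ^ d2 ^ d4
            p2 = d1 ^ d3 ^ d4
            p3 = d2 ^ d3 ^ d4
            return [p1, p2, d1, p3, d2, d3, d4]       # codeword positions 1..7

        def hamming74_decode(c):
            """Correct at most one flipped bit, then return the 4 data bits."""
            s1 = c[0] ^ c[2] ^ c[4] ^ c[6]            # checks positions 1,3,5,7
            s2 = c[1] ^ c[2] ^ c[5] ^ c[6]            # checks positions 2,3,6,7
            s3 = c[3] ^ c[4] ^ c[5] ^ c[6]            # checks positions 4,5,6,7
            error_pos = s1 + 2 * s2 + 4 * s3          # 0 means "no error detected"
            if error_pos:
                c = list(c)
                c[error_pos - 1] ^= 1                 # flip the offending bit back
            return [c[2], c[4], c[5], c[6]]

        data = [1, 0, 1, 1]
        codeword = hamming74_encode(data)
        codeword[5] ^= 1                              # simulate one bit of storage damage
        assert hamming74_decode(codeword) == data     # the damage is invisible to the reader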

  4. Do you know what the mapping is for such devices from “how many extra hidden bits do I need per user bit?” to “how reliably can I guarantee that no user bits will be wrong?”

  5. By the way, I find the SSD answer unsatisfying in terms of my original question. It’s one thing to create a way to protect a fixed piece of source code from change (i.e., by adding error-correction bits). It’s another thing entirely to create source code that continues to do its job even in the face of random changes to the source code itself. The latter problem seems so much more intriguing.

  6. In any event, the fundamental programming paradigm of “here’s a list of instructions with deterministic consequences” is not going to cut it here. Biological systems work great because they have billions of years of evolution via the mechanism of mutation behind them. A small mutation is likely to leave the system in a stable state because it’s built on top of billions of years of small mutations that led to stable states.

    So, neural nets work great because we build them in the first place via a mechanism of “let’s randomly mutate things until they behave the way we want,” and thus a small mutation only leads to a small drop in fitness… Or possibly even an increase in fitness, who knows!

    The downside is that neural networks as we build them have to be implemented as deterministic programs running on deterministic hardware. So we’re back to square one a little bit, until somebody starts coming out with hardware specifically designed for accelerating neural network processing.

  7. Naively, I assumed that neural networks were designed around a probabilistic rather than a deterministic paradigm. Wouldn’t you need a probabilistic model for a successful neural network? Doesn’t the human body actually execute as a probabilistic model?
