Evolution in a nutshell

Unfortunately the great article by Gregor Kjellström isn't available anymore, so here's a version of it.

The traveling salesman problem

The salesman should visit a number of towns, one at a time, and wants to know in what order they should be visited in order to make the tour as short as possible.

Suppose that the number of towns is 60. For a random search process, this is like having a deck of cards numbered 1, 2, 3, ... 59, 60, where the number of permutations is of the same order of magnitude as the total number of atoms in the universe. If the hometown is not counted, the number of possible tours becomes 60*59*58*...*4*3 (about 10 raised to 80, 10^80, i.e. a 1 followed by 80 zeros).
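The product above can be checked directly. The following is a minimal sketch in Python (the function name count_tours is mine, not from the original article). The exact product 60*59*...*4*3 equals 60!/2, which is on the order of 10^81, so it is somewhat larger than the round figure quoted above and just as hopeless for blind search.

```python
import math

def count_tours(n_towns: int) -> int:
    """Number of tours as counted in the text: n * (n-1) * ... * 4 * 3 = n! / 2."""
    return math.factorial(n_towns) // 2

n = count_tours(60)
print(f"{n:.3e}")         # the count in scientific notation
print(len(str(n)) - 1)    # exponent of ten of the leading digit
```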

Suppose that the salesman does not have a map showing the location of the towns, but only a deck of numbered cards, which he may permute, put in a card reader - as in the early days of computers - and let the computer calculate the length of the tour. The probability of finding the shortest tour by random permutation is about one in 10^80, so it will never happen. So, should he give up?

No, by no means. Evolution may be of great help to him, at least if it could be simulated on his computer. Natural evolution uses an inversion operator, which - in principle - is extremely well suited for finding good solutions to the problem. A part of the card deck - chosen at random - is taken out, turned in the opposite direction and put back in the deck again, as in the figure below with 6 towns. The hometown (no. 1) is not counted.
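The inversion step can be written in a few lines. This is a sketch under my own assumptions: the deck is a Python list, the hometown is left out of it, and the two cut points are drawn uniformly at random. The function names are hypothetical.

```python
import random

def invert_segment(deck, i, j):
    """Return a copy of deck with the cards at positions i..j (inclusive) reversed."""
    return deck[:i] + deck[i:j + 1][::-1] + deck[j + 1:]

def random_inversion(deck, rng=random):
    """Take out a randomly chosen part of the deck, turn it around, put it back."""
    i, j = sorted(rng.sample(range(len(deck)), 2))
    return invert_segment(deck, i, j)

deck = [2, 3, 4, 5, 6]               # six towns; the hometown (no. 1) is not in the deck
print(invert_segment(deck, 1, 3))    # [2, 5, 4, 3, 6]
print(random_inversion(deck))        # some random inversion of the same cards
```

Note that an inversion never loses or duplicates a card, so every result is again a valid tour.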

If this inversion takes place where the tour happens to have a loop, then the loop is opened and the salesman is guaranteed a shorter tour. The probability that this will happen is greater than 1/(60*60) for any loop if we have 60 towns, so in a population of one million card decks a loop might disappear about 1000000/3600 ≈ 278 times.
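A tiny example shows how an inversion opens a loop. The four towns at the corners of a unit square are my own illustration, not the configuration from the original figure.

```python
import math

def tour_length(home, deck):
    """Length of the closed tour home -> deck[0] -> ... -> deck[-1] -> home."""
    path = [home] + list(deck) + [home]
    return sum(math.dist(a, b) for a, b in zip(path, path[1:]))

home = (0.0, 0.0)
crossed = [(1.0, 1.0), (1.0, 0.0), (0.0, 1.0)]   # both diagonals are used, so the tour crosses itself
opened = crossed[:2][::-1] + crossed[2:]          # invert the first two cards

print(tour_length(home, crossed))   # 2 + 2*sqrt(2), about 4.83
print(tour_length(home, opened))    # 4.0, the perimeter of the square
print(1_000_000 / (60 * 60))        # expected loop removals per million decks, about 277.8
```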

I have simulated this with a population of 180 card decks, from which 60 decks are selected in every generation (using MATLAB, the language of technical computing). The figure below shows a random tour at start.

After about 1500 generations all loops have been removed and the tour length has been reduced to 1/5 of the length of the random tour at start. The human eye can see that some improvements can be made, but probably the random search has found a tour which is not much longer than the shortest possible. See figure below.
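The original MATLAB program is not reproduced here, so the following Python sketch only illustrates the general idea of random variation and selection in cyclic repetition. Several details are my assumptions: towns are random points in the unit square, each of the 60 selected decks contributes itself plus two randomly inverted copies to the next population of 180, and the 60 shortest tours are selected. The function names and the seed values are hypothetical.

```python
import math
import random

def tour_length(home, towns, deck):
    """Length of the closed tour that visits towns in deck order, starting and ending at home."""
    path = [home] + [towns[k] for k in deck] + [home]
    return sum(math.dist(a, b) for a, b in zip(path, path[1:]))

def random_inversion(deck, rng):
    """Reverse a randomly chosen part of the deck."""
    i, j = sorted(rng.sample(range(len(deck)), 2))
    return deck[:i] + deck[i:j + 1][::-1] + deck[j + 1:]

def evolve(home, towns, generations, n_select=60, copies=3, seed=0):
    """Keep n_select decks per generation out of a population of n_select * copies."""
    rng = random.Random(seed)
    base = list(range(len(towns)))
    parents = [rng.sample(base, len(base)) for _ in range(n_select)]
    history = []
    for _ in range(generations):
        population = []
        for deck in parents:
            population.append(deck)  # assumption: the parent itself stays in the population
            population.extend(random_inversion(deck, rng) for _ in range(copies - 1))
        population.sort(key=lambda d: tour_length(home, towns, d))
        parents = population[:n_select]  # selection of the shortest tours
        history.append(tour_length(home, towns, parents[0]))
    return parents[0], history

if __name__ == "__main__":
    rng = random.Random(1)
    home = (rng.random(), rng.random())
    towns = [(rng.random(), rng.random()) for _ in range(59)]  # 60 towns including home
    best, history = evolve(home, towns, generations=300)
    print(f"best after first generation: {history[0]:.2f}, after last: {history[-1]:.2f}")
```

Because the parents remain in the population, the best tour length can never increase from one generation to the next. How close the result comes to the figures reported above depends on the town layout and on the number of generations.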

In a special case when all towns are equidistantly placed along a circle, the optimal solution is found when all loops have been removed. This means that this simple random search is able to find one optimal tour out of as many as 10^80. This random process is also similar to evolution in the sense that it uses random variation and selection in cyclic repetition. This also means that the cyclic repetition of random variation and selection of individuals is a very important principle for creating a huge amount of information. So generally, there is no reason to distrust random developmental processes.

The given example shows how random search can solve a combinatorial problem efficiently.

However, I think that the meaning of inversion is different in natural evolution.

About my background

In the mid-1960s, I worked at a Swedish telephone company with analysis and optimisation of signal processing systems. Formerly such systems consisted of interconnected components such as resistors, inductors and capacitors. I retired in 1993.

In the late 1960s my boss formulated a technical problem: “Try to find system solutions that are insensitive to variations in parameter or component values due to the statistical spread in manufacturing,” he said. This means that he wanted the manufacturing yield maximized.

If we have only two components - each having a parameter value – the problem is very simple. Let the first parameter value be the shortest distance to the left edge of a picture (below) while the second value is the distance to the bottom edge. Then, if the interconnection is given, a point in the picture represents the system unambiguously.

Suppose now that all points inside a certain triangle (region of acceptability, marked by red edge) will meet all requirements according to the specification of the system, while all other points do not, and that the spread of parameter values is uniformly distributed over a circle (green). Then, if the circle touches the three sides of the triangle, the centre of the circle would be a perfect solution to the problem.
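For two parameters this solution can be computed directly: the circle touching all three sides is the inscribed circle of the triangle. The sketch below uses a 3-4-5 right triangle as an assumed example (the triangle in the original picture is not known) and the standard formulas for the incentre and the inradius. The function name incircle is mine.

```python
import math

def incircle(A, B, C):
    """Centre and radius of the circle touching all three sides of triangle ABC."""
    a, b, c = math.dist(B, C), math.dist(C, A), math.dist(A, B)  # side lengths opposite A, B, C
    perimeter = a + b + c
    centre = ((a * A[0] + b * B[0] + c * C[0]) / perimeter,
              (a * A[1] + b * B[1] + c * C[1]) / perimeter)
    area = abs((B[0] - A[0]) * (C[1] - A[1]) - (C[0] - A[0]) * (B[1] - A[1])) / 2
    return centre, 2 * area / perimeter

centre, radius = incircle((0, 0), (4, 0), (0, 3))
print(centre, radius)   # (1.0, 1.0) 1.0
```

With the nominal parameter values at this centre, any uniform spread of radius up to 1 stays entirely inside the region of acceptability, so the yield is 100 per cent.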

But if we have 10 or 100 parameters, then the number of possible parameter combinations becomes super-astronomical and the region of acceptability cannot possibly be surveyed. I began to think that the man was not all there.

The problem was almost forgotten until a system designer entered my room about half a year later. His system was able to meet all requirements according to the specification, but with a very poor manufacturing yield, and he wanted that yield maximized.

Oh, dear! I would not like to get fired immediately. So, we wrote a computer program in a hurry, using a random number generator giving Gaussian (normally) distributed numbers according to the bell curve to the left in the figure below. A cluster of points in two dimensions - where each pair of two normally distributed parameters is represented by a point - is seen to the right.

The system functions of each randomly chosen system were calculated and compared with the requirements. In this way we got a population (generation) of about 1000 systems from which a certain fraction of approved systems was selected. For the next generation the centre of gravity of the normal distribution was moved to the centre of gravity of the approved systems and this process was repeated for many generations.
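The loop described above can be sketched as follows. This is not the original program. It is a two-parameter illustration in Python in which the region of acceptability is an assumed 3-4-5 triangle, the spread (sigma) is kept fixed for simplicity, and all approved systems are used to compute the new centre of gravity. All names and numerical values are hypothetical.

```python
import random

def inside_triangle(p, A, B, C):
    """True if point p lies inside (or on the edge of) triangle ABC."""
    def side(P, Q, R):
        return (Q[0] - P[0]) * (R[1] - P[1]) - (Q[1] - P[1]) * (R[0] - P[0])
    d1, d2, d3 = side(A, B, p), side(B, C, p), side(C, A, p)
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    return not (has_neg and has_pos)

def adapt_centre(accept, start, sigma, generations=100, population=1000, seed=0):
    """Move the centre of a normal distribution to the centre of gravity of the approved points."""
    rng = random.Random(seed)
    centre = start
    yields = []
    for _ in range(generations):
        approved = []
        for _ in range(population):
            p = (rng.gauss(centre[0], sigma), rng.gauss(centre[1], sigma))
            if accept(p):                 # compare the system with the requirements
                approved.append(p)
        yields.append(len(approved) / population)
        if approved:                      # keep the old centre if nothing was approved
            centre = (sum(p[0] for p in approved) / len(approved),
                      sum(p[1] for p in approved) / len(approved))
    return centre, yields

A, B, C = (0.0, 0.0), (4.0, 0.0), (0.0, 3.0)
centre, yields = adapt_centre(lambda p: inside_triangle(p, A, B, C),
                              start=(0.3, 0.3), sigma=0.5)
print(centre)
print(yields[0], yields[-1])   # yield in the first and in the last generation
```

Starting close to a corner of the triangle, the centre drifts into the interior and the observed yield rises until it fluctuates around an equilibrium value.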

After about 100 generations the centres of gravity reached a state of equilibrium. Then the designer said “but this looks very good”. And we were both astonished, because we had only put some things together by chance. A closer look revealed that there is a mathematical theorem, valid for normal distributions only, stating:

If the centre of gravity of the approved systems coincides with the centre of gravity of the normal distribution in a state of selective equilibrium, then the yield is maximal (theorem 6.2.2).
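A short calculation makes the condition in the theorem plausible. The notation below is my own and not taken from the original: s(x) is 1 for an approved system and 0 otherwise, N(x; m, M) is the normal density with centre of gravity m and moment matrix M, P is the yield and m* is the centre of gravity of the approved systems. The sketch only shows that m* = m is the condition for the gradient of the yield to vanish. It does not prove that the stationary point is a maximum.

```latex
% Assumed notation (not from the original text):
%   s(x)       = 1 if system x is approved, 0 otherwise
%   N(x; m, M) = normal density with centre of gravity m and moment matrix M
\begin{align}
P(m) &= \int s(x)\, N(x; m, M)\, dx
      && \text{(yield)} \\
m^{*} &= \frac{1}{P(m)} \int x\, s(x)\, N(x; m, M)\, dx
      && \text{(centre of gravity of the approved systems)} \\
\nabla_m P(m) &= \int s(x)\, M^{-1}(x - m)\, N(x; m, M)\, dx
               = P(m)\, M^{-1}\,(m^{*} - m)
\end{align}
```

The last line uses the fact that the gradient of the normal density with respect to m is M^{-1}(x - m) N(x; m, M). For a positive yield the gradient is zero exactly when m* = m.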

This gave an almost religious experience. Here a mathematical theorem solved a difficult problem without our knowledge and independently of the structure of the region of acceptability. But in order to fulfill the theorem exactly, infinitely many random points must be generated, which is of course impossible. Nevertheless the solution was good enough for our technical purposes. Our very simple process was also similar to the natural evolution in the sense that it worked with random variation and selection.

Darwinian evolution: Later it turned out that this is not very far from the Darwinian evolution of natural systems, which is my main concern today. The analogue to manufacturing yield was the mean fitness determined as a mean over the set of individuals in a large population. Already here a connection between mean fitness and the spread in parameter values is clearly seen. More generally the spread in parameter values is an analogue to the disorder in morphological characters.

Looking at the triangle and the circle above it is clear that a small arbitrary displacement of the circle causes mean fitness to decrease, but may be taken back again if the radius of the circle is decreased, i. e. if the disorder of the morphological characters is decreased. This means that mean fitness and disorder may be simultaneously maximal even if the distribution of parameters deviates from normal.

More generally the theorem of normal adaptation may be proved in two different ways leading to the following more general formulation of the theorem: A normal distribution may always be adapted for maximum mean fitness and a corresponding maximum disorder (average information) to any region of acceptability (theorem 6.2.3). The condition of optimality is that the centre of gravity of the normal distribution coincides with the centre of gravity of the survivors, i. e. parents to offspring in the next generation.
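If "disorder (average information)" is read as the differential entropy of the normal distribution, which is my interpretation of the wording above, it has a closed form. With n parameters and moment matrix M (symbols assumed for this sketch):

```latex
% Differential entropy of an n-dimensional normal distribution with moment matrix M
\begin{equation}
H = \tfrac{1}{2} \ln\!\left( (2 \pi e)^{n} \det M \right)
\end{equation}
```

The entropy grows with the determinant of M, that is, with the volume covered by the spread of parameter values, and it does not depend on where the centre of gravity is located.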

Note that this is in contrast to the mean fitness earlier defined as a mean over the set of genes in the gene pool, which led to the dubious “fundamental theorem of natural selection” due to Fisher (1930), because it does not consider the simultaneous maximization of disorder.

Neural networks: Because nerve cells may in principle add and multiply signal values and because many researchers agree that an evolution of signal patterns is going on in our brains, digital circuits (neural networks) would perhaps simulate an evolution of signal patterns in certain parts of the central nervous system. In fact, I have also proposed a very simple digital circuit as a model of the evolution in the brain.

Creationism and the order in nature

Mayr: What Evolution is, 2001, states that “it is sometimes claimed that evolution, by producing order, is in conflict with the ‘law of entropy’ of physics, according to which evolutionary change should produce an increase of disorder. Actually there is no conflict, because the entropy law is valid for closed systems only, whereas the evolution of a species of organisms can reduce entropy at the expense of the environment and the sun supplies a continuing input of energy.”

I have a different view of this. Creationists are right in the sense that random events do not produce order. But they have produced an enormous amount of disorder represented by millions of different species and billions of different individuals in certain species, in agreement with the entropy law, because a more widespread gene pool is more disordered. The order in the biological sphere was greatest when the first living organism ruled the roost. Disorder/entropy may also be called biological diversity because, as I see it, there is no reason to distinguish between disorder and diversity: it is the same random evolution giving rise to both.

The illusion of order in the biologic sphere is due to the fact that only a very tiny little fraction of all possible DNA-messages may manifest themselves as living organisms. Thus, the disorder becomes restricted, and this restricted disorder is interpreted as order by both creationists and biologists. Intuitively, this may be understood, if we observe that the duality order-disorder is like cold-warmth. Actually there is no cold, only limited warmth. Likewise, there is no order, only limited disorder.

So, for our purposes, evolution may be seen as a random process climbing a phenotypic landscape, ruled by all restrictions, which will completely determine the shape of living organisms. The landscape is completely dependent on the almighty laws of nature (in the sense that they are valid throughout the whole universe), properties of DNA molecules, proteins etcetera, whose origin is not known.

If it were possible to prove that electromagnetism, for instance, is a product of some random process, then we would perhaps have proved that there is no god. But, to my knowledge, there is no such proof. So, the only way for me to believe in some God is if it is possible to believe that God created the laws of nature, including the entropy law and evolution. But this is by no means self-evident.

Science and faith

Let us first point out that modern mathematics and science support freedom of religion, but hardly any kind of fundamentalism. For instance, if some person at a nonconformist meeting rises proclaiming that “great is the power of faith”, this is de facto scientifically proved by the placebo effect. If I eat a sugar pill convinced that it contains a wholesome medicine, then I will recover sooner, even if not all illnesses may be cured that way. Similarly, the Commandments of God and the Sermon on the Mount in the Bible certainly promote survival. I am also sure that the Koran (even if I am not familiar with it) includes many such passages. This means that certain passages in the Bible and the Koran may promote survival. Nevertheless, I see any faith that may promote the survival of mankind - and my own survival as an individual and a member of society - as a good faith.

There is also detrimental faith. For instance, if a large number of individuals in society really believe that on Friday, the 13th, the risk of unforeseen accidents is higher than usual, then such accidents will more probably occur. In my opinion, such faith is bad faith or superstition detrimental to survival.

One of the most important mathematical discoveries of the 20th century was the theorem of Gödel (1931) stating that not even all mathematical problems can be solved mathematically. Or in other words: Our knowledge will always be limited and there will always exist true statements that can never be proved. And in such statements we can only have faith - or no faith.

In my opinion, this theorem also supports the freedom of religion. It also means that no zealot has any right to coerce anyone to accept any religious dogmas or to call any person unbelieving, because outside the realm of knowledge everyone has to believe in something, which is not necessarily identical with the zealot’s God.

It will be argued that the Darwinian evolution of natural systems is a process proceeding by phenotypic (genetic) random variation and selection from one generation to the next. It will also be seen as something climbing a phenotypic value landscape. Even though the climbing takes place at random, it does not necessarily mean that the landscape is a result of some random process. It entirely depends on the laws of nature, gravitation, electromagnetism, the properties of nuclear particles etcetera, whose origin is not known. So, evolution is perhaps not a pure random process, after all. Besides, it has a history and a memory, and the information stored in the DNA-messages may be used to improve the efficiency of the process.

Another problem is that probability theory is only a human invention and need not necessarily have anything to do with reality. On the other hand, probability theory makes it possible to produce very good models of real processes, which may also be simulated on computers.