ReMine's Dilemma, OR size DOES matter.


There has been some discussion in the newsgroups talk.origins and sci.bio.evolution over the last year or so regarding Walter ReMine's claim that a version of Dawkins' "weasel" program, which demonstrates the efficacy of selection, nevertheless reveals a serious limitation on the rate of evolution, in agreement with Haldane's Dilemma. Mr. ReMine claims that this rate is so low that it makes the current accounts of the origins of humans (and other higher vertebrates) implausible (see http://x38.deja.com/getdoc.xp?AN=294263986 and http://x38.deja.com/getdoc.xp?AN=315954300).

Quoting from the first reference:

For those interested in computer simulations of evolution, my book dismantles the most widely known example, the "METHINKS IT IS LIKE A WEASEL" simulation from Dawkins' book, _The Blind Watchmaker_. It identifies many unrealistic assumptions in the simulation that favored evolution. Some of the key ones Dawkins did not tell his readers about. For example, his simulation used a reproduction rate that would require females to produce 200 offspring each, (a rate which is not plausible for most higher organisms). The book shows how the simulation actually demonstrates the phenomenon and problem of Haldane's Dilemma. For those serious about this issue, I can recommend no better place to start than there.

Now, I can't for the life of me see how any Dawkins-style "blind watchmaker" program could reproduce Haldane's Dilemma, as Haldane's Dilemma is about the rate at which multiple beneficial alleles of multiple genes (typically > 20,000) get substituted in a population (see http://www.gate.net/~rwms/haldane1.html or "Natural Selection", George C Williams, 1992, Oxford University Press, chapter 10 for more details of Haldane's Dilemma). The basic idea is that the rate at which two (or more) beneficial mutations will be substituted (that is, the number of generations that it takes to go from the mutation being rare to the mutation being present in all organisms of that population) is the same whether the mutations occur simultaneously or one after the other.

How a Dawkins-style selection program, with only one "gene" and no population structure, could show Haldane's Dilemma is not clear. Clearly many other people thought the same, including Wesley R. Elsberry, who wrote the Perl program "weasle.pl" explicitly to explore this claim (see whale.htm for the source), and Robert Williams, whose site I referenced above.

One problem is the difficulty of finding details of the program, as it is not Dawkins' original, which is apparently lost, but another implementation. When asked on internet fora for details of this program, Mr. ReMine referred questioners to his book, "The Biotic Message" (http://www1.minn.net/~science), a self-published volume that is relatively hard to come by (for example, see http://x24.deja.com/getdoc.xp?AN=339410120).

Fortunately, Will Pratt of the University of Nevada found a copy, and as a result Robert Williams was able to get a copy of David Wise's program MONKEY (see whale.htm for the source), which ReMine used. As I also noted earlier, after running it several times, and looking at the code, I couldn't find any hint of Haldane's Dilemma in the program. What was I missing?

The answer is actually blindingly simple, and is an embarrassment to Mr. ReMine. Here are the relevant excerpts from Mr. ReMine's book.

p 235

"That method of mutation is not true to nature [used by Dawkins]. In nature nothing counts mutations and assures exactly one in each progeny. A more realistic type of mutation should be used in the simulation so that each letter has a probability of mutation. Suppose we use this correct method of mutation while leaving the "average" rate unchanged (at 1 chance in 28). This subtle correction to the simulation nearly doubles the time needed to evolve the target phrase: to 86 generations."

p 236

"Then we reduce the reproduction rate to that of the higher vertebrates, say to n=6. In a sexual species this would require the females to produce 12 offspring each. This is overly optimistic for many species. The simulation then goes into error catastrophe and does not reach the target phrase. We can eliminate the error catastrophe by lowering the mutation rate.'

"Then by exploration we can find the mutation rate that produces the fastest evolution. [footnote: in this case the optimum mutation rate is one in 56.] With this optimal mutation rate, on average, the target phrase is reached in 1663 generations - that is 62 generations per substitution.'

"Thus the simulation - with its numerous unrealistic assumptions that favor evolution - is less than five times faster than Haldane's estimate of 300 generations per substitution. Ironically, this suggests that Haldane was too optimistic about the speed of evolution."

Can you see where ReMine has made his error? I actually wasted a couple of hours comparing the effects of mutation rates in different programs before I realized it, but it should have been blindingly obvious. (So I'm stupid, okay, but I've done two simulations, WEASLE5.BAS and WEASLE6.BAS, which incorporate population structure, and I thought that David Wise had done the same.)

Here's the key line:

"Then we reduce the reproduction rate to that of the higher vertebrates, say to n=6"

Well knock me down with a stick of mortadella and call me Jake. Mr. ReMine doesn't know how these programs work! In the vast majority of weasel simulations, including Wise's, the program takes a string and makes x copies of it, with single-letter mutations in one or more of the copies. It then chooses the best string and makes x copies of that with mutations, then chooses the best string from those copies, and makes x copies of that. The process is repeated until the target string is reached. In many of these programs, the value x is a user-entered variable called "number of offspring" or similar wording.
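
To make that loop concrete, here is a minimal Python sketch of the procedure just described. It is NOT Wise's MONKEY, Elsberry's weasle.pl or Dawkins' original: the function names, the 27-symbol alphabet (A-Z plus space) and the rule of exactly one random letter change per offspring are assumptions made for illustration only.

```python
import random
import string

TARGET = "METHINKS IT IS LIKE A WEASEL"
ALPHABET = string.ascii_uppercase + " "  # assumed alphabet: A-Z plus space


def mismatches(candidate, target=TARGET):
    """Count the positions at which candidate differs from target."""
    return sum(a != b for a, b in zip(candidate, target))


def mutate_one_letter(parent, rng):
    """Copy parent, overwriting one random position with a random symbol."""
    pos = rng.randrange(len(parent))
    return parent[:pos] + rng.choice(ALPHABET) + parent[pos + 1:]


def run_weasel(num_offspring, rng, max_generations=100_000):
    """Return the generation at which TARGET is reached, or None if capped."""
    parent = "".join(rng.choice(ALPHABET) for _ in TARGET)
    for generation in range(max_generations + 1):
        if parent == TARGET:
            return generation
        # num_offspring is the brood size AND the whole population.
        brood = [mutate_one_letter(parent, rng) for _ in range(num_offspring)]
        # The single best offspring becomes the next parent; the rest vanish.
        parent = min(brood, key=mismatches)
    return None


if __name__ == "__main__":
    print(run_weasel(200, random.Random(1)))
```

Note that there is no population anywhere in this loop other than the brood itself. Because this sketch keeps no unchanged copy of the parent, a very small brood can also replace the parent with a worse string, so small values of num_offspring may run into the generation cap.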

Now, Dawkins' program ISN'T a simulation of natural selection in all possible organisms, but an illustration of directed selection in an artificial situation whose closest analogues are viral replication and RNA strand replication. Mr. ReMine's claim that Dawkins' program (or any of its implementations) models vertebrate reproduction is simply risible. But it does model things like selection for RNA aptamers, where RNA strands are mutated, the best-functioning copy is kept, and multiple copies of that single strand (usually thousands of copies) are made (see Jaschke A. RNA-catalyzed carbon-carbon bond formation. Biol Chem. 2001 Sep;382(9):1321-5.). Given this, reducing the "reproduction rate" to that of higher vertebrates is questionable at best, but it gets worse.

The important thing to note is that in Wise's program, Dawkins' original, Wesley Elsberry's weasle.pl and my WEASLE4.BAS (see whale.htm) the "reproduction rate", i.e. the number of offspring, IS ALSO THE POPULATION SIZE!!!! Of course you will see only slow convergence to the target phrase in any of these programs when you have only 5 offspring, as there is a TOTAL POPULATION of only 5 strings at any one time! [1]

Of course, in the real world, most organisms have populations of more than 5 individuals :-). Trying to compare the appearance rate of beneficial mutations in a population of 5 individuals with the substitution rate of beneficial mutations in a population of between 10,000 and 100,000 individuals is a pretty big blunder to make, even allowing for the other problems in trying to compare this program with a real population. The information about "offspring" number isn't hidden; it's clear in the description given by Dawkins and in David Wise's documentation.
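
The effect of brood size on convergence can be worked out with a little arithmetic. The Python sketch below is a simplified model, not ReMine's setup and not any of the programs named above. It assumes a 28-letter phrase, a 27-symbol alphabet, one offspring kept as an unchanged copy of the parent, exactly one random letter change in each of the others, and a start with every letter wrong. Under those assumptions the parent never gets worse and improves by at most one letter per generation, so the expected number of generations is a sum of geometric waiting times. The name expected_generations is hypothetical.

```python
ALPHABET_SIZE = 27   # assumed: A-Z plus space
PHRASE_LENGTH = 28   # length of "METHINKS IT IS LIKE A WEASEL"


def expected_generations(num_offspring, start_mismatches=PHRASE_LENGTH):
    """Expected generations to reach the target under the stated assumptions.

    One offspring is an unchanged copy of the parent; each of the others has
    exactly one random position overwritten with a random symbol.
    """
    if num_offspring < 2:
        raise ValueError("need at least one mutated offspring")
    mutated = num_offspring - 1
    total = 0.0
    for k in range(1, start_mismatches + 1):
        # Chance that one mutated offspring corrects one of the k wrong letters.
        p_fix = k / (PHRASE_LENGTH * ALPHABET_SIZE)
        # Chance that at least one of the mutated offspring does so.
        p_step = 1.0 - (1.0 - p_fix) ** mutated
        # Mean wait, in generations, for a step that succeeds with p_step.
        total += 1.0 / p_step
    return total


if __name__ == "__main__":
    for n in (6, 200):
        print(n, round(expected_generations(n)))
```

In this toy model the n=6 figure comes out more than ten times larger than the n=200 figure. The slowdown comes purely from how few strings exist in each generation, and has nothing to do with how long a beneficial mutation takes to spread through a population.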

Let me emphasise this: what ReMine is showing ISN'T the substitution rate. It's the mutation rate, or more precisely the rate at which beneficial mutations appear. Haldane's Dilemma is about the rate at which multiple beneficial mutations are substituted in a population, that is, the rate at which mutations go from being rare to being present in 100% of the organisms in the population. The basic "dilemma" is that two simultaneous beneficial mutations will take as long to go to fixation as two mutations happening in sequence. In these programs, the substitution rate is ALWAYS 1, that is, any beneficial mutation or group of beneficial mutations automatically goes to fixation in one generation, so these programs cannot, ipso facto, demonstrate Haldane's Dilemma. All ReMine is showing is that when you have an absurdly low population size, the rate of appearance of beneficial mutations is relatively low. That he doesn't understand that he is not showing anything about Haldane's Dilemma speaks volumes about his understanding of the problem.

ReMine's argument totally collapses. Despite the old adage, size, when it is population size in population simulations, does matter.


[1] Think about it: if each string really did have 6 offspring per generation, the string space of any known program would be exhausted after only a few generations, as the number of strings increases exponentially. In my programs, WEASLE5.BAS and WEASLE6.BAS, where I do have this kind of reproduction, I use a function to "kill" strings so that the population doesn't exceed 250 (or 500).
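
The arithmetic behind this footnote is easy to check. The Python sketch below assumes non-overlapping generations (parents do not persist) and uses a plain ceiling as a stand-in for the culling step; how WEASLE5.BAS and WEASLE6.BAS actually choose which strings to "kill" is not shown here, and the function names are hypothetical.

```python
OFFSPRING_PER_STRING = 6
CAP = 250  # population ceiling mentioned in the footnote


def uncapped_sizes(generations):
    """Population size per generation if every string leaves 6 offspring."""
    return [OFFSPRING_PER_STRING ** g for g in range(generations + 1)]


def capped_sizes(generations, cap=CAP):
    """Same growth, but truncated to at most cap strings each generation."""
    sizes = [1]
    for _ in range(generations):
        # min() is only a stand-in for whatever culling rule is really used.
        sizes.append(min(sizes[-1] * OFFSPRING_PER_STRING, cap))
    return sizes


if __name__ == "__main__":
    print(uncapped_sizes(10)[-1])  # 6**10 = 60466176 strings
    print(capped_sizes(5))         # [1, 6, 36, 216, 250, 250]
```

Ten generations of unchecked growth from a single string already gives over 60 million strings, while the capped version hits the 250 ceiling in the fourth generation.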


(Appendix: Wise's subroutine that generates the "population" from offspring number)


PROCEDURE spawn;

{  SPAWN copies the "parent" string (index 0) into each "offspring" string and
   then performs the selected "mutation".

    NOTE:  in each case, the letter position to be changed is selected at
           random AND the letter to be placed there is also selected at
           random.   }

VAR
  i, j, k :INTEGER;

BEGIN
  FOR i:=1 TO num_copies DO
    BEGIN
    s[i] := s[0];                 { Copy the "parent" }
    IF (option <> 3) OR (i > 1)   { Do not change if first child and option }
      THEN                        {    is 3 (one child remains unchanged) }
        CASE method OF
          1 : s[i,(Random(msg_size)+1)] := letter_pool[Random(pool_size)];
          2 : FOR k:=1 TO num_changes DO
                  s[i,(Random(msg_size)+1)] := letter_pool[Random(pool_size)];
          3 : FOR j:=1 TO msg_size DO
                IF Random < mut_prob
                  THEN
                    s[i,j] := letter_pool[Random(pool_size)];
        END;  {CASE}
    diffs[i] := diff(s[i]);
    END;  {FOR i:=1 TO num_copies DO}
  i := min_diff;                  { find offspring closest to target }
  s[0] := s[i];                   { make the offspring the next parent }
  diffs[0] := diffs[i];           { save its difference for display  }
END;


email:Ian Musgrave, at reynella@mira.net
Created: Friday, 17 September 1999, 18:23:59
Last Updated: Friday, 17 September 1999, 18:23:59