[Biopython-dev] test_GASelection hangs

Peter biopython at maubp.freeserve.co.uk
Mon Nov 17 21:49:02 UTC 2008


Bruce wrote:
> Peter wrote:
>> However, this clearly gets stuck in TournamentSelectionTest - so we've
>> narrowed this down a bit.  Reading that bit of code, there is an
>> apparent risk of an infinite loop if by chance org_1 happens to be the
>> worst organism in the population.  Perhaps adding a simple counter to
>> break out of the loop if after 1000 tries org_1 is still the worst -
>> but I'm not sure what to do then.
>>
>> Peter
>
> Hi,
> I ran the test multiple times using a bash loop and I think I tracked down
> this specific problem to within the actual test code, specifically the
> function TournamentSelectionTest.t_select_best(). I think this is what Peter
> noticed.

Yes, this was what I was describing.

> This is how I understand things, which I hope is sufficiently correct to
> explain it.
>
> The test simulates a genome that has 3 locations with the 4 bases coded
> as '0', '1', '2', and '3' for an 'organism'.  (Note that the 3 locations are
> hard coded into the random_genome function.) The fitness of an organism is
> just the integer formed by the coded values, so the first position is
> hundreds, the second is tens and the last is ones.
>
> In TournamentSelectionTest.t_select_best, a second organism is simulated
> that must have a lower fitness than the first. The problem comes when
> the simulated genome of the first organism is '000', because its fitness is
> zero. This creates an infinite loop because the line:
>           if org_2.fitness < org_1.fitness:
> will always be false, but eventually this must be true to break the loop.
> Obviously this loop becomes infinite and, given that there are only three
> locations, it should be rather frequent.

Yes.
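
To put a rough number on "rather frequent", here is a small standalone
sketch.  It does not import anything from Bio.GA or the test module; the
names BASES, GENOME_LENGTH and fitness_of are my own stand-ins, and I am
assuming random_genome picks each of the 3 positions uniformly from the
4 coded bases.

```python
from itertools import product

BASES = "0123"       # the four bases as coded in the test
GENOME_LENGTH = 3    # hard coded in random_genome, per the description above


def fitness_of(genome):
    """Stand-in for test_fitness: read the coded genome as an integer."""
    return int(genome)


# Enumerate every genome the test could generate.
all_genomes = ["".join(p) for p in product(BASES, repeat=GENOME_LENGTH)]
worst = min(all_genomes, key=fitness_of)

# Genomes that could satisfy org_2.fitness < org_1.fitness when org_1 is
# the worst organism.  If this list is empty the loop can never exit.
strictly_worse = [g for g in all_genomes if fitness_of(g) < fitness_of(worst)]

# Chance that a uniformly random org_1 is the worst organism.
chance_of_hang = all_genomes.count(worst) / float(len(all_genomes))

print(len(all_genomes), worst, strictly_worse, chance_of_hang)
```

Under that uniformity assumption there are 64 possible genomes and only
"000" has nothing below it, so the hang should show up in roughly 1 run
in 64.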

> Is it sufficient to use the condition '<='?

No, I don't think so.  The point of the setup seems to be to look for
a pair of organisms where one is measurably fitter than the other (and
make sure the better one is indeed selected).

> Alternatively, is there some way to fix the genome of the first organism
> rather than using a random one?
> For example, instead of calling random_organism(), declare it as, say:
> org_1 = Organism('100', test_fitness)

We could do something like:

# Choose anything except the worst organism, "000"
while True:
    org_1 = random_organism()
    if org_1.fitness > 0:
        break

[Not tested yet]

This at least is more or less random.
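
For anyone who wants to try the idea without the Biopython test harness,
here is a self-contained sketch of the two loops together.  SketchOrganism
and random_sketch_organism are hypothetical stand-ins I made up for
illustration (they are not the Bio.GA Organism class or the test's
random_organism), and I am assuming fitness is exposed as a .fitness
attribute, as in the line quoted from the test.

```python
import random

BASES = "0123"
GENOME_LENGTH = 3


class SketchOrganism(object):
    """Hypothetical stand-in for the test's Organism; not the Bio.GA class."""

    def __init__(self, genome):
        self.genome = genome
        self.fitness = int(genome)  # same idea as test_fitness


def random_sketch_organism(rng):
    """Build an organism with a uniformly random 3 position genome."""
    genome = "".join(rng.choice(BASES) for _ in range(GENOME_LENGTH))
    return SketchOrganism(genome)


def pick_pair(rng):
    """Return (org_1, org_2) with org_2 strictly less fit than org_1."""
    # Choose anything except the worst organism, "000"
    while True:
        org_1 = random_sketch_organism(rng)
        if org_1.fitness > 0:
            break
    # A strictly less fit org_2 now always exists ("000" if nothing else),
    # so this loop terminates with probability one.
    while True:
        org_2 = random_sketch_organism(rng)
        if org_2.fitness < org_1.fitness:
            break
    return org_1, org_2


if __name__ == "__main__":
    better, worse = pick_pair(random.Random())
    print(better.genome, worse.genome)
```

This keeps the strict '<' comparison, so the pair is never a tie, which
is the property the '<=' change would lose.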

Peter


