[Biopython-dev] test_GASelection hangs
Bruce Southey
bsouthey at gmail.com
Mon Nov 17 20:03:54 UTC 2008
Peter wrote:
> On Mon, Nov 17, 2008 at 6:35 PM, Bruce Southey <bsouthey at gmail.com> wrote:
>
>> Hi,
>> I was just running the test under a very fresh cvs version and under
>> Python2.3 the test was hanging with test_GASelection. Of course, there was
>> no problem after killing it and rerunning the test. I think this also
>> pertains to bug 2651 so I thought I would ask if there was a way to examine
>> this further before doing anything else. I understand that this is a problem
>> with randomization involved, but it does indicate a more subtle problem is
>> present. I would really like to track down the source of the problem.
>>
>> Does anyone have any ideas on how I could try to examine this further?
>>
>
> If you have installed CVS (or indeed any recent version of Biopython,
> as the GA stuff hasn't changed recently IIRC), then in the Tests
> directory you can just run:
>
> $ python test_GASelection.py
>
> You'll find sometimes it gets stuck. I tried modifying the file so
> that the end reads as follows:
>
> if __name__ == "__main__":
>     #sys.exit(run_tests(sys.argv))
>
>     ALL_TESTS = [DiversitySelectionTest, TournamentSelectionTest,
>                  RouletteWheelSelectionTest]
>
>     runner = unittest.TextTestRunner(sys.stdout, verbosity = 2)
>     test_loader = unittest.TestLoader()
>     test_loader.testMethodPrefix = 't_'
>
>     test=ALL_TESTS[1] #Edit me: 0, 1 or 2
>     cur_suite = test_loader.loadTestsFromTestCase(test)
>     count = 0
>     while True :
>         count += 1
>         print "#"*50, count
>         runner.run(cur_suite)
>
> On my machine, DiversitySelectionTest and RouletteWheelSelectionTest
> seem safe - the tests just run and run until you interrupt them with
> ctrl+c.
>
> However, this clearly gets stuck in TournamentSelectionTest - so we've
> narrowed this down a bit. Reading that bit of code, there is an
> apparent risk of an infinite loop if by chance org_1 happens to be the
> worst organism in the population. Perhaps adding a simple counter to
> break out of the loop if after 1000 tries org_1 is still the worst -
> but I'm not sure what to do then.
>
> Peter
>
>
Hi,
I ran the test multiple times using a bash loop and I think I tracked
down this specific problem to the actual test code, specifically
the function TournamentSelectionTest.t_select_best(). I think this is what
Peter noticed.
This is how I understand things, which I hope is sufficiently correct to
explain the problem.
The test simulates a genome that has 3 locations with the 4 bases coded
as '0', '1', '2', and '3' for an 'organism'. (Note that the 3 locations are
hard coded into the random_genome function.) The fitness of an organism
is just the integer value of the coded genome, so the first position is
hundreds, the second is tens and the last is ones.
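
To make the fitness calculation concrete, here is a minimal sketch of how
I read it. This is not the code from test_GASelection.py; the name
genome_fitness is made up for illustration and the genome is assumed to
be a plain string of digits.

```python
def genome_fitness(genome):
    """Read a coded genome such as '213' as the integer 213.

    Hypothetical helper for illustration only, not the test's own function.
    """
    return int(genome)


print(genome_fitness("213"))  # hundreds=2, tens=1, ones=3
print(genome_fitness("000"))  # the lowest fitness any genome can have
```

The point to notice is that '000' gives a fitness of zero, and no genome
can score below that.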
In TournamentSelectionTest.t_select_best, a second organism is
simulated that must have a lower fitness than the first. The problem
comes when the simulated genome of the first organism is '000',
because its fitness is zero. The condition in the line:
    if org_2.fitness < org_1.fitness:
will then always be false, but it must eventually be true to break the
loop, so the loop never ends. Given that there are only three
locations (64 possible genomes), this should be rather frequent.
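
Here is a small stand-alone sketch of that loop, assuming each location is
drawn uniformly from the 4 bases. The function names and the max_tries cap
are my own additions so that the sketch terminates; the real test has no
such cap, which is why it hangs.

```python
import random

BASES = "0123"
GENOME_LENGTH = 3


def random_genome(rng):
    """Draw a genome such as '203' uniformly at random (assumed behaviour)."""
    return "".join(rng.choice(BASES) for _ in range(GENOME_LENGTH))


def find_worse_organism(org_1_genome, rng, max_tries=1000):
    """Look for a genome with strictly lower fitness than org_1_genome.

    Returns None if max_tries draws all fail. The cap is hypothetical and
    only exists to keep this sketch from looping forever.
    """
    for _ in range(max_tries):
        candidate = random_genome(rng)
        if int(candidate) < int(org_1_genome):
            return candidate
    return None


rng = random.Random(0)
# Nothing can have a fitness below 0, so the search always gives up.
print(find_worse_organism("000", rng))
# Number of possible genomes; exactly one of them is '000'.
print(len(BASES) ** GENOME_LENGTH)
```

Under the uniform assumption the first organism is '000' about 1 time in
64, which fits with the hang showing up only now and then.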
Is it sufficient to use the condition '<='?
Alternatively, is there some way to fix the genome of the first organism
rather than using a random one?
For example, instead of calling random_organism(), declare it as, say:
org_1=Organism('100', test_fitness)
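
To compare the two options, this sketch simply enumerates every possible
genome and counts which ones would let the loop exit. It assumes the same
3-location, 4-base coding as above; the helper name acceptable is made up
and nothing here comes from the Biopython code itself.

```python
from itertools import product

BASES = "0123"
ALL_GENOMES = ["".join(p) for p in product(BASES, repeat=3)]


def acceptable(org_1_genome, allow_ties):
    """List the org_2 genomes that would break the loop for a given org_1.

    allow_ties=False models the current '<' test, allow_ties=True the
    proposed '<=' test. Hypothetical helper for illustration only.
    """
    limit = int(org_1_genome)
    if allow_ties:
        return [g for g in ALL_GENOMES if int(g) <= limit]
    return [g for g in ALL_GENOMES if int(g) < limit]


print(acceptable("000", allow_ties=False))       # no genome can exit the loop
print(acceptable("000", allow_ties=True))        # only a tie with '000' exits
print(len(acceptable("100", allow_ties=False)))  # genomes '000' through '033'
```

If I have this right, '<=' does end the loop for '000', but only when the
second organism is also '000', so the two organisms tie on fitness. With
org_1 fixed at '100', 16 of the 64 genomes are strictly worse, so the loop
should exit after a few draws.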
Bruce