[Bioperl-l] A better fix for Species.pm memory leak, please REVIEW.

Chris Fields cjfields at illinois.edu
Mon Nov 24 16:49:21 EST 2008


On Nov 23, 2008, at 8:31 PM, George Hartzell wrote:

> Chris Fields writes:
>> On Nov 23, 2008, at 6:19 AM, Sendu Bala wrote:
>>
>>> But anyway, thanks George, go ahead and commit if it all seems
>>> functional.
>>>
>>> If you could summarise your solution in the bug report(s?) as well,
>>> that would be great.
>>
>>
>> Agreed, though I think we need to ensure that memory leaks don't
>> permeate into Bio::Taxon when we make the switch (it shouldn't, I
>> think).  BTW, checking Blame/Annotate it looks like some of this  
>> stems
>> from r10974:
>>
>> http://code.open-bio.org/svnweb/index.cgi/bioperl/revision/?rev=10974
>>
>
> The change that I proposed wouldn't cause anything to permeate into
> Bio::Taxon.
>
> It's really a design problem with Bio::Species.  A Species "has-a"
> reference to a tree *and* when the Species creates that tree it asks
> the tree constructor to initialize itself using the Species itself
> which results in the tree including a reference to that Species
> object.  Et voila, circular reference.

My thoughts as well.

> It's not helped by the fact that Bio::Species explicitly disconnects
> all of the tree cleanup code that was supposed to handle those
> circular references.  On the other hand there are a bunch of comments
> that suggest that the cleanup code never worked right.

The cleanup code relies on the object being destroyed (either via  
garbage collection or explicitly).  Since there is a circular  
dependency and DESTROY isn't explicitly called, the root cleanup  
method won't work.

> ...
> I'm not in front of that machine at the moment, but I think that there
> are a couple of other places in the src tree that use the -node or
> -root arg's to Bio::Tree::Tree::new and should probably be checked for
> leaks.

A possible solution would be to make Bio::Species a shell or proxy  
object that decorates a Bio::Taxon instance (and a Bio::Tree::Tree  
where needed) and just delegates the appropriate methods to them.  It  
could still inherit from Bio::Taxon; it would just override the  
Bio::Taxon methods for delegation.  The Bio::Tree::Tree can refer to  
the Bio::Taxon object (and vice versa), but neither refer back to the  
Species object directly.

In this case, when the last ref to the Bio::Species is released it  
should be garbage collected and trigger DESTROY (which is set up in  
Bio::Root::Root).  This in turn triggers any cleanup methods, such as  
calling DESTROY on the Tree and Taxon and get rid of memory leaks (and  
wouldn't require weaken/is_weak).

I'm perfectly willing to test that out if no one else is (I am working  
on a few other bugs at the moment but I can probably get to it in the  
next day or two).  Not to tread on any toes, but I would like to get  
this one taken care of prior to 1.6 and these are fairly nasty memory  
issues.  If there is any consolation it shouldn't be too hard to re- 
rig the methods to delegate appropriately.

> I don't really feel hot or bothered about this fix vs. the combination
> of weaken and stashing the subspecies.  It depends on what the author
> intended for Species and its tree, neither are particularly sexy and
> if Species is really on its way out the door it may not matter.....
>
> g.

I don't think it will matter post-1.6, but we'll have to support  
Bio::Species for the 1.6 release series.  So it will have to be  
maintained for the short term, until 1.7 comes out.

chris


More information about the Bioperl-l mailing list