[Bioperl-l] Bug in Contig.pm? How to compare two sequence objects?
Robson de Souza
robfsouza at gmail.com
Sun Jul 25 10:42:35 EDT 2010
Hi Dan,
It has been a long time since I last looked at this but, if I remember
correctly, the point is that Bio::
On Sun, Jul 25, 2010 at 9:23 AM, Dan Bolser <dan.bolser at gmail.com> wrote:
> The following bug report boils down to this question:
> How should two sequence objects be compared for identity? Does the
> object override 'eq' or implement an 'identical' method?
I think an 'identical' or 'equal' method would be the best alternative
since having a full method call would allow passing arguments like
'-mode => "complete"' to check all sequence features and annotations
if they exist and '-mode => "basic"' to check id() and seq() values.
Bio::Assembly::Contig depends mostly on the last one, although only
id() is tracked most of the time (because of the internal hashes).
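
To make the idea concrete, here is a minimal sketch of such a comparison in
'basic' mode. Nothing below exists in BioPerl: the helper name seqs_identical
and the stand-in class My::Seq are hypothetical, and the only assumption about
the objects is that they answer id() and seq(). The 'complete' mode is left
unimplemented on purpose.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a sequence object; only id() and seq() are assumed.
package My::Seq;
sub new { my ($class, %args) = @_; return bless {%args}, $class }
sub id  { return $_[0]{id} }
sub seq { return $_[0]{seq} }

package main;

# Hypothetical helper, not a BioPerl method.
sub seqs_identical {
    my ($x, $y, %opt) = @_;
    my $mode = $opt{'-mode'} || 'basic';

    # 'basic': same identifier and same sequence string
    my $same = ($x->id eq $y->id) && ($x->seq eq $y->seq);
    return $same if $mode eq 'basic';

    # 'complete' would also compare features and annotations; omitted here
    die "mode '$mode' is not implemented in this sketch\n";
}

my $s1 = My::Seq->new(id => 'r1', seq => 'ACCG-T');
my $s2 = My::Seq->new(id => 'r1', seq => 'ACCG-T');
my $s3 = My::Seq->new(id => 'r1', seq => 'ACCGGT');

print seqs_identical($s1, $s2, -mode => 'basic') ? "same\n" : "different\n";
print seqs_identical($s1, $s3)                   ? "same\n" : "different\n";
```

Two distinct objects with the same id() and seq() compare as identical here,
which is the behaviour a content-based check would give and a plain 'eq' on
the objects would not.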
> I found the following apparent bug in Contig.pm while executing the
> documented 'SYNOPSIS' code:
[snip]
> It seems to be a bug in the documented behaviour of set_seq_coord:
> "If the sequence was previously added using add_seq, its
> coordinates are changed/set. Otherwise, add_seq is called and the
> sequence is added to the contig."
In fact, it should not print warnings all the time....
> The offending line in that function seems to be:
> if( ... &&
> ($seq ne $self->{'_elem'}{$seqID}{'_seq'}) ) {
> ... <spew warnings>
> }
> $self->add_seq($seq);
> which compares the *passed* sequence object to the sequence string for
> the *stored* sequence object of the same name. This comparison
> always fails if I understood correctly, therefore set_seq_coord always
> spews warnings if called after add_seq.
Not the sequence string, but the objects themselves, i.e. the string
Perl uses to represent Bio::LocatableSeq objects... it is a
memory-based version of identical() :)
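
A small illustration of that point, using plain blessed hashes rather than
real Bio::LocatableSeq objects (the class name My::Seq is made up, and the
sketch assumes the class does not overload string comparison):

```perl
use strict;
use warnings;

# Plain blessed hashes stand in for sequence objects (hypothetical class name).
my $stored = bless { id => 'r1', seq => 'ACCG-T' }, 'My::Seq';
my $copy   = bless { id => 'r1', seq => 'ACCG-T' }, 'My::Seq';
my $alias  = $stored;    # second reference to the very same object

# Without overloading, an object in string context looks like
# "My::Seq=HASH(0x...)", so 'ne' compares memory addresses, not contents.
my $copy_differs  = ($copy  ne $stored) ? 1 : 0;    # distinct objects, same data
my $alias_differs = ($alias ne $stored) ? 1 : 0;    # same object

print "copy differs:  $copy_differs\n";
print "alias differs: $alias_differs\n";
```

The copy is reported as different even though its data matches, while the
alias is not: the test only passes when the very same object is handed in
again.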
> Out of curiosity, how come I can't just say:
> my $ls = Bio::LocatableSeq->
> new( -seq => 'ACCG-T',
> -id => 'r1',
> -alphabet => 'dna',
> -start => 3,
> -end => 8,
> -strand => 1
> );
> $c->add_seq( $ls );
Oh, I don't remember, but it was either a bad design decision I made 8
years ago to accommodate the Bio::Align::AlignI interface or a problem
with Bio::SeqFeature::Collection at that time. Whatever the case, it
would be nice to change it... you just need to create a
Bio::SeqFeature::Generic when add_seq is called. I just won't have
time to do it myself, so feel free to act...
Best,
Robson
> I hope the above report can be of some use.
>
> Sincerely,
> Dan.