[Bioperl-l] Bug in Contig.pm? How to compare two sequence objects?
Dan Bolser
dan.bolser at gmail.com
Sun Jul 25 13:23:35 UTC 2010
Hi all,
The following bug report boils down to this question:
How should two sequence objects be compared for identity? Does the
object override 'eq' or implement an 'identical' method?
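For illustration, here is a minimal sketch of what overriding 'eq' could look like in plain Perl. The class My::ComparableSeq and its helper _equals are hypothetical and are not part of BioPerl; whether the BioPerl sequence classes do anything like this is exactly the open question. The sketch assumes that "identical" means same id and same residues.

```perl
#!/usr/bin/perl
use strict;
use warnings;

package My::ComparableSeq;    # hypothetical class, not a BioPerl module
use Scalar::Util qw(blessed);

# Route the string comparison operators to a content-based check.
use overload
    'eq' => sub { _equals( $_[0], $_[1] ) },
    'ne' => sub { _equals( $_[0], $_[1] ) ? 0 : 1 };

sub new {
    my ( $class, %args ) = @_;
    return bless { id => $args{-id}, seq => $args{-seq} }, $class;
}
sub id  { return $_[0]{id} }
sub seq { return $_[0]{seq} }

# Assumption: two sequences are "identical" if id and residues both match.
sub _equals {
    my ( $self, $other ) = @_;
    return 0 unless blessed($other) && $other->isa(__PACKAGE__);
    return ( $self->id eq $other->id && $self->seq eq $other->seq ) ? 1 : 0;
}

package main;

my $first  = My::ComparableSeq->new( -id => 'r1', -seq => 'ACCG-T' );
my $second = My::ComparableSeq->new( -id => 'r1', -seq => 'ACCG-T' );
my $third  = My::ComparableSeq->new( -id => 'r2', -seq => 'ACCG-T' );

my $equal_content = ( $first eq $second )  ? 1 : 0;    # same id and residues
my $different_id  = ( $first ne $third )   ? 1 : 0;    # ids differ
my $vs_string     = ( $first eq 'ACCG-T' ) ? 1 : 0;    # object vs. plain string

print "equal content: $equal_content\n";
print "different id:  $different_id\n";
print "vs string:     $vs_string\n";
```

With overloading of this kind, 'eq' and 'ne' compare content rather than memory addresses, and comparing an object with a plain string is simply false.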
I found the following apparent bug in Contig.pm while executing the
documented 'SYNOPSIS' code:
#!/usr/bin/perl -w
use strict;
use Bio::Assembly::Contig;
use Bio::LocatableSeq;
use Bio::SeqFeature::Generic;

# An empty contig
my $c = Bio::Assembly::Contig->
    new( -id => '1' );

# A gapped read to place in the contig
my $ls = Bio::LocatableSeq->
    new( -seq      => 'ACCG-T',
         -id       => 'r1',
         -alphabet => 'dna'
       );

# The read's coordinates within the contig
my $ls_coord = Bio::SeqFeature::Generic->
    new( -start  => 3,
         -end    => 8,
         -strand => 1
       );

$c->add_seq( $ls );
$c->set_seq_coord( $ls_coord, $ls );
This gives the following warnings:
--------------------- WARNING ---------------------
MSG: Adding sequence r1, which has already been added
---------------------------------------------------
--------------------- WARNING ---------------------
MSG: Replacing one sequence [r1]
---------------------------------------------------
This seems to contradict the documented behaviour of set_seq_coord:
"If the sequence was previously added using add_seq, its
coordinates are changed/set. Otherwise, add_seq is called and the
sequence is added to the contig."
The offending line in that function seems to be:
if( ... &&
($seq ne $self->{'_elem'}{$seqID}{'_seq'}) ) {
... <spew warnings>
}
$self->add_seq($seq);
which compares the *passed* sequence object to the sequence string for
the *stored* sequence object of the same name. If I understood
correctly, the two never compare equal, so the 'ne' test is always
true and set_seq_coord always emits these warnings when called after
add_seq.
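The effect can be reproduced without BioPerl. The sketch below uses a hypothetical stand-in class, My::Seq, which does not overload 'eq' or 'ne'; it assumes (as described above) that one side of the comparison is an object and the other a plain sequence string. The helper same_seq is likewise only an illustration of a content-based comparison, not an existing BioPerl method.

```perl
#!/usr/bin/perl
use strict;
use warnings;

package My::Seq;    # hypothetical stand-in, not a BioPerl class

sub new {
    my ( $class, %args ) = @_;
    return bless { id => $args{-id}, seq => $args{-seq} }, $class;
}
sub id  { return $_[0]{id} }
sub seq { return $_[0]{seq} }

package main;

my $stored = My::Seq->new( -id => 'r1', -seq => 'ACCG-T' );
my $passed = $stored;    # same object, as in add_seq followed by set_seq_coord

# Without overloading, an object in string context looks like
# "My::Seq=HASH(0x...)", which never equals the residue string.
my $object_vs_string = ( $passed ne $stored->seq ) ? 1 : 0;

# Two references to the same object stringify identically.
my $object_vs_object = ( $passed ne $stored ) ? 1 : 0;

# A separate object with the same content has a different address.
my $copy           = My::Seq->new( -id => 'r1', -seq => 'ACCG-T' );
my $copy_vs_stored = ( $copy ne $stored ) ? 1 : 0;

# Content-based comparison through the accessors (hypothetical helper).
sub same_seq {
    my ( $x, $y ) = @_;
    return ( $x->id eq $y->id && $x->seq eq $y->seq ) ? 1 : 0;
}
my $same_content = same_seq( $stored, $copy );

print "object ne string:      $object_vs_string\n";
print "object ne same object: $object_vs_object\n";
print "copy ne stored:        $copy_vs_stored\n";
print "same content:          $same_content\n";
```

Under these assumptions, comparing object to object (or comparing ids and residue strings through the accessors) would avoid the spurious warning when the very same object is passed twice.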
Out of curiosity, how come I can't just say:
my $ls = Bio::LocatableSeq->
    new( -seq      => 'ACCG-T',
         -id       => 'r1',
         -alphabet => 'dna',
         -start    => 3,
         -end      => 8,
         -strand   => 1
       );
$c->add_seq( $ls );
I hope the above report can be of some use.
Sincerely,
Dan.