[Bioperl-l] Bug in Contig.pm? How to compare two sequence objects?

Dan Bolser dan.bolser at gmail.com
Sun Jul 25 09:23:35 EDT 2010


Hi all,

The following bug report boils down to this question:

How should two sequence objects be compared for identity? Does the
object override 'eq' or implement an 'identical' method?
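
To illustrate what I mean by an 'identical' method, here is a minimal sketch of a content-based comparison. It assumes only that the objects provide id() and seq() accessors; My::Seq and seqs_identical are hypothetical names for illustration, not part of BioPerl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for a sequence object (NOT a BioPerl class);
# only the id() and seq() accessors are assumed.
package My::Seq;
sub new { my ($class, %args) = @_; return bless { %args }, $class; }
sub id  { return $_[0]{id} }
sub seq { return $_[0]{seq} }

package main;

# Hypothetical helper: treat two sequences as identical when both
# their IDs and their sequence strings match.
sub seqs_identical {
    my ($x, $y) = @_;
    return $x->id eq $y->id && $x->seq eq $y->seq;
}

my $s1 = My::Seq->new( id => 'r1', seq => 'ACCG-T' );
my $s2 = My::Seq->new( id => 'r1', seq => 'ACCG-T' );
my $s3 = My::Seq->new( id => 'r1', seq => 'ACCGTT' );

print seqs_identical($s1, $s2) ? "identical\n" : "different\n";  # identical
print seqs_identical($s1, $s3) ? "identical\n" : "different\n";  # different
```

Whether identity should also take start, end and strand into account is a design choice I have left out of the sketch.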


I found the following apparent bug in Contig.pm while executing the
documented 'SYNOPSIS' code:

#!/usr/bin/perl -w

use strict;
use Bio::Assembly::Contig;
use Bio::LocatableSeq;
use Bio::SeqFeature::Generic;

my $c = Bio::Assembly::Contig->
  new( -id       => '1' );

my $ls = Bio::LocatableSeq->
  new( -seq      => 'ACCG-T',
       -id       => 'r1',
       -alphabet => 'dna'
     );

my $ls_coord = Bio::SeqFeature::Generic->
  new( -start    => 3,
       -end      => 8,
       -strand   => 1
     );

$c->add_seq( $ls );
$c->set_seq_coord( $ls_coord, $ls );


This gives the following warnings:

--------------------- WARNING ---------------------
MSG: Adding sequence r1, which has already been added
---------------------------------------------------

--------------------- WARNING ---------------------
MSG: Replacing one sequence [r1]

---------------------------------------------------



This seems to contradict the documented behaviour of set_seq_coord:

        "If the sequence was previously added using add_seq, its
        coordinates are changed/set.  Otherwise, add_seq is called and
        the sequence is added to the contig."


The offending line in that function seems to be:

  if( ... &&
      ($seq ne $self->{'_elem'}{$seqID}{'_seq'}) ) {
          ... <spew warnings>
  }
  $self->add_seq($seq);


which compares the *passed* sequence object to the sequence string for
the *stored* sequence object of the same name. If I understood
correctly, the two never compare as equal, so set_seq_coord always
spews warnings when called after add_seq.
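
As a minimal sketch of why such a test cannot succeed: unless the class overloads string comparison (an assumption here, which I have not checked for the BioPerl classes), 'ne' compares the stringified references, not the sequence data. My::Seq below is a hypothetical stand-in class, not part of BioPerl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Two distinct objects with the same content, blessed into a
# hypothetical stand-in class (NOT a BioPerl class).
my $stored = bless { id => 'r1', seq => 'ACCG-T' }, 'My::Seq';
my $passed = bless { id => 'r1', seq => 'ACCG-T' }, 'My::Seq';
my $same   = $stored;   # a second handle on the very same object

# Without overloading, 'ne' stringifies each reference to something
# like "My::Seq=HASH(0x...)" and compares those strings.
my $ref_vs_same = ( $same   ne $stored )        ? 'differ' : 'equal';
my $ref_vs_copy = ( $passed ne $stored )        ? 'differ' : 'equal';
my $ref_vs_str  = ( $passed ne $stored->{seq} ) ? 'differ' : 'equal';

print "same object:        $ref_vs_same\n";   # equal
print "equal-content copy: $ref_vs_copy\n";   # differ
print "object vs string:   $ref_vs_str\n";    # differ
```

Only the same reference compares as equal; an object never equals a plain sequence string, and two separately built objects never equal each other, even when their contents match.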


Out of curiosity, how come I can't just say:

my $ls = Bio::LocatableSeq->
  new( -seq      => 'ACCG-T',
       -id       => 'r1',
       -alphabet => 'dna',
       -start    => 3,
       -end      => 8,
       -strand   => 1
     );

$c->add_seq( $ls );


I hope the above report can be of some use.

Sincerely,
Dan.


