[Bioperl-l] Bio::Tools::pSW stop codon bug?

Prachi Shah prachi at stanford.edu
Wed Aug 9 00:18:08 UTC 2006


Hi,

I am trying to align very similar protein sequences with the
Bio::Tools::pSW module, but I am running into what looks like a
bug. One of the two sequences is extended considerably with gaps so
that an amino acid residue lines up with the stop codon (*). I know
there should not be any internal stop codons, but we are working with
a new assembly of the Candida genome and we want to pick out such
inconsistent cases. In any case, the alignment should match the two
sequences (because they are the same) up to the point where the stop
codon is encountered in the new sequence. Instead, it artificially
extends the old sequence and matches the alanine with the stop codon.
Any help on this is appreciated.

Thanks
Prachi

Here is an example set of two sequences I am trying to align:

>orf19.6264.3
MSNYLNLAQFSGVTDRFNLERIKSDFSSVQSTISKLRPPQEFFDFRRLSKPANFGEIQQRVGYNLGYFSANYITIVLGLSIYALITNFLLLFVTIFVLGGIYGINKLNGEDLVLPVGRFNTSQLYTGLLIVAVPLGFLASPISTMMWLIGSSGVTVGAHAALMEKPIETVFEEEV*V
>orf19.6264.3_old
MSNYLNLAQFSGVTDRFNLERIKSDFSSVQSTISKLRPPQEFFDFRRLSKPANFGEIQQRVGYNLGYFSANYITIVLGLSIYALITNFLLLFVTIFVLGGIYGINKLNGEDLVLPVGRFNTSQLYTGLLIVAVPLGFLASPISTMMWLIGSSGVTVGAHAALMEKPIETVFEEEV

Below is the part of the code that generates the alignments:

################
use Bio::Seq;
use Bio::AlignIO;
use Bio::Tools::pSW;

# wrap the translated protein strings ($gene, $new_translatedSeq and
# $old_translatedSeq are set earlier in the script)
my $new_translatedSeqObj = Bio::Seq->new(-display_id => $gene,
                                         -seq        => $new_translatedSeq);

my $old_translatedSeqObj = Bio::Seq->new(-display_id => $gene . "_old",
                                         -seq        => $old_translatedSeq);

# do alignments
my $align_factory = Bio::Tools::pSW->new(
    -matrix => '/tools/perl/5.8.8/lib/site_perl/5.8.8/Bio/Ext/Align/blosum62.bla',
    -gap    => 12,
    -ext    => 2,
);

my $aln = $align_factory->pairwise_alignment($old_translatedSeqObj,
                                             $new_translatedSeqObj);

# write the alignment to STDOUT in ClustalW format
my $alnout = Bio::AlignIO->new(-format => 'clustalw',
                               -fh     => \*STDOUT);
$alnout->write_aln($aln);
##################

The alignment --

CLUSTAL W(1.81) multiple sequence alignment


orf19.6264.3_old/1-162 MSNYLNLAQFSGVTDRFNLERIKSDFSSVQSTISKLRPPQEFFDFRRLSKPANFGEIQQR
orf19.6264.3/1-177     MSNYLNLAQFSGVTDRFNLERIKSDFSSVQSTISKLRPPQEFFDFRRLSKPANFGEIQQR
                       ************************************************************


orf19.6264.3_old/1-162 VGYNLGYFSANYITIVLGLSIYALITNFLLLFVTIFVLGGIYGINKLNGEDLVLPVGRFN
orf19.6264.3/1-177     VGYNLGYFSANYITIVLGLSIYALITNFLLLFVTIFVLGGIYGINKLNGEDLVLPVGRFN
                       ************************************************************


orf19.6264.3_old/1-162 TSQLYTGLLIVAVPLGFLASPISTMMWLIGSSGVTVGAHA---------------AL
orf19.6264.3/1-177     TSQLYTGLLIVAVPLGFLASPISTMMWLIGSSGVTVGAHAALMEKPIETVFEEEV*V
                       ****************************************                :
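
Independently of how pSW scores the '*' character, the goal of picking out translations with an internal stop codon can be approached by scanning the protein strings directly before any alignment is run. The following is only a minimal sketch in plain Perl, not a fix for the alignment behaviour shown above. The helper names internal_stop_position and truncate_at_stop are hypothetical and are not part of BioPerl; the example string is just the tail of the new orf19.6264.3 translation.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): returns the 1-based position
# of the first '*' if anything follows it, or 0 if the sequence has no
# stop or only a terminal one.
sub internal_stop_position {
    my ($protein) = @_;
    my $idx = index($protein, '*');
    return 0 if $idx < 0;                        # no stop at all
    return 0 if $idx == length($protein) - 1;    # terminal stop only
    return $idx + 1;
}

# Hypothetical helper: the peptide up to, but not including, the first stop.
sub truncate_at_stop {
    my ($protein) = @_;
    my $idx = index($protein, '*');
    return $idx < 0 ? $protein : substr($protein, 0, $idx);
}

# Tail of the new orf19.6264.3 translation, used only as an illustration.
my $new_tail = 'MEKPIETVFEEEV*V';
my $pos      = internal_stop_position($new_tail);
print "internal stop at residue $pos\n" if $pos;
print "peptide before stop: ", truncate_at_stop($new_tail), "\n";
```

Sequences flagged this way could then be truncated before being handed to pairwise_alignment, so the aligner never sees the '*'. Whether that is appropriate depends on how the inconsistent cases are meant to be reported.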


