[Bioperl-l] Bio::SimpleAlign

Heikki Lehvaslaiho heikki at sanbi.ac.za
Wed Dec 20 10:25:08 UTC 2006


Kevin,

Sequences that are added to the alignment are supposed to be *aligned*. 
SimpleAlign does not align them for you. It seems to me that you are adding 
sequences like this:

nnnnnnnnnnnnnnnnnnnn         1 - 20, "a short gene"
nnnnnn                      21 - 26, "a short primer after the gene"

when you should be doing this:

nnnnnnnnnnnnnnnnnnnn         1 - 20, "a short gene"
--------------------nnnnnn  21 - 26, "a short primer after the gene"

Note that the default way of displaying names in SimpleAlign 
is "name/start-end". The name is usually expected to refer to the sequence 
from which this subsequence is derived. The display name does not change 
if you add gaps.
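
As a rough illustration of the padding and naming described above, here is a 
minimal sketch in plain Perl. The helpers pad_to_column and display_name are 
hypothetical names made up for this example and are not part of BioPerl; the 
sketch assumes '-' and '.' are the gap characters and also pads the gene row 
on the right so that both rows have the same length.

```perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl method): place an ungapped sequence at
# 1-based alignment column $col, padding both sides with '-' so that the
# resulting row is exactly $aln_len columns wide.
sub pad_to_column {
    my ($residues, $col, $aln_len) = @_;
    my $right = $aln_len - ($col - 1) - length($residues);
    die "sequence does not fit in $aln_len columns\n"
        if $col < 1 || $right < 0;
    return ('-' x ($col - 1)) . $residues . ('-' x $right);
}

# Hypothetical helper: build a "name/start-end" label. The end coordinate is
# derived from the number of residues only, so gap characters are ignored.
sub display_name {
    my ($id, $start, $row) = @_;
    my $residue_count = ($row =~ tr/-.//c);   # count non-gap characters
    return sprintf '%s/%d-%d', $id, $start, $start + $residue_count - 1;
}

my $gene    = 'n' x 20;   # "a short gene", positions 1 - 20
my $primer  = 'n' x 6;    # "a short primer after the gene", positions 21 - 26
my $aln_len = 26;

my $gene_row   = pad_to_column($gene,   1,  $aln_len);
my $primer_row = pad_to_column($primer, 21, $aln_len);

print "$gene_row\n$primer_row\n";

# The label is the same with and without the gap padding.
print display_name('primer', 21, $primer),     "\n";
print display_name('primer', 21, $primer_row), "\n";
```

The padded rows are what would go into -seq when building the LocatableSeq 
objects, while -start and -end keep referring to positions in the source 
sequence.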


Yours,
	-Heikki


On Tuesday 19 December 2006 23:46, Kevin Brown wrote:
> I'm working on a script that plays around with alignments of sequences
> and one of the things I noticed is that the code for the match method
> does not seem to actually use the start/end information when creating
> the match between objects in the alignment.  Maybe I'm misunderstanding
> what the alignment is supposed to hold in terms of sequence.  The
> alignment objects I build up are based on the sequence of a gene and the
> sequences of the primers that amplify that gene.
>
> $alignments{$gene->id()}->add_seq(
> 				new Bio::LocatableSeq(
> 				-seq   => $seq[0]->seq(),
> 				-id    => $seq[0]->id(),
> 				-start => $start,
> 				-end => $start + $seq[0]->length() - 1,
> 				-strand => 1
> 			 )
> );

If your sequence does not contain gaps and the numbering starts from one, you 
can let the object handle start/stop:

my $seq = Bio::LocatableSeq->new(
      -seq    => 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA',
      -id     => 'A00001',
      -strand => 1
);


> $alignments{$gene->id()}->add_seq(
> 				new Bio::LocatableSeq(
> 				-seq   => $seq[1]->seq(),
> 				-id    => $seq[1]->id(),
> 				-start => $stop,
> 				-end => $stop + $seq[1]->length() - 1,
> 				-strand => -1
> 				)
> );
>
> So, you can see I input a start and stop point for the primer, but when
> I use the match function all it does is match the first character of the
> gene sequence to the first char of the primer sequences, then the second
> gene char to the second in each primer, etc...  This doesn't seem to fit
> with the documentation and seems odd that there would be holders for the
> start/stop points and not use them when doing things like matching of
> sequences in an alignment.

-- 
______ _/      _/_____________________________________________________
      _/      _/
     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
    _/_/_/_/_/  Associate Professor    skype: heikki_lehvaslaiho
   _/  _/  _/  SANBI, South African National Bioinformatics Institute
  _/  _/  _/  University of Western Cape, South Africa
     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
___ _/_/_/_/_/________________________________________________________


