[Bioperl-l] alignment of 2 sequences/ FASTA

Jason Stajich jason@cgt.mc.duke.edu
Tue, 26 Feb 2002 15:17:01 -0500 (EST)


Hmm, there are lots of ways to do this.  It depends on:
a) how many sequences you plan to do this for
b) whether you always know the 2 sequences you want to align
c) whether you are doing nucleotide to nucleotide (DNA vs DNA or cDNA vs DNA)
d) how accurate you need your alignments to be

Per a)
 - you should use a heuristic algorithm like BLAST [1] or FASTA [2] if you
need to search thousands or hundreds of thousands of sequences.
 - if not, use a Smith-Waterman implementation: ssearch [2] or EMBOSS water [3]

b) you can use bl2seq, FASTA, SSEARCH, or water to do this

c) sim4, est2genome, genewise, or exonerate apply here

d) If accuracy is critical, don't use a heuristic algorithm (like BLAST or
FASTA)
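
To make the Smith-Waterman option concrete for the abab example in the
question, here is a minimal sketch in Python.  It is not BioPerl and it is not
what ssearch or water actually run: the function name smith_waterman and the
scoring values (match=2, mismatch=-1, linear gap=-2) are assumptions chosen
only for illustration.  It fills the local-alignment score matrix, then traces
back from the best cell to recover 1-based start and end positions.

```python
def smith_waterman(query, target, match=2, mismatch=-1, gap=-2):
    """Best local alignment of query against target.

    Returns (score, q_start, q_end, t_start, t_end) with 1-based,
    inclusive coordinates.  Scoring values are illustrative assumptions.
    """
    rows, cols = len(query) + 1, len(target) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the matrix; a cell never drops below 0 (that is what makes it local).
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if query[i - 1] == target[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + sub,  # align the two residues
                score[i - 1][j] + gap,      # gap in target
                score[i][j - 1] + gap,      # gap in query
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero-score cell is reached.
    i, j = best_pos
    q_end, t_end = i, j
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if query[i - 1] == target[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    return best, i + 1, q_end, j + 1, t_end


result = smith_waterman("abab", "nnnnababnnnnn")
print("match on seq 2 starts at position %d and ends at position %d"
      % (result[3], result[4]))
# prints: match on seq 2 starts at position 5 and ends at position 8
```

For real sequences you would still want ssearch or water, which use proper
substitution matrices and affine gap penalties; the sketch only shows where
the start and end coordinates come from.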

1. BLAST - ftp://ftp.ncbi.nih.gov/blast/executables/ (Altschul et al)
2. FASTA - http://fasta.bioch.virginia.edu/ (Pearson)
3. EMBOSS - http://www.emboss.org (Rice et al)

We support parsing of FASTA (including ssearch, I believe) and BLAST with
Bio::SearchIO, water with Bio::AlignIO (format "emboss"), bl2seq with
Bio::Tools::BPlite, and sim4 (as a gene prediction means) with
Bio::Tools::Sim4::Results.  We don't have an est2genome parser right now.

Some of those modules are not in the 0.7 series but will be in 1.0 and are
in the 0.9.x dev series.

-jason

On Tue, 26 Feb 2002, Simon Chan wrote:

> Hi,
>
> I would like to align 2 sequences.  Kind of hard to explain what I need
> so I'll use an example:
>
>
> seq 1: abab
> seq 2: nnnnababnnnnn
>
> How can I get the script to output that the match between
> seq 1 and seq 2 starts at position 5 and ends at position 8?
>
> I was told to run the FASTA program on the seq1 and seq2 and that
> should get what I want, however, there doesn't seem to be a module
> that will perform FASTA comparisons...?
>
> Thanks, Everybody.
>
> simon

-- 
Jason Stajich
Duke University
jason@cgt.mc.duke.edu