[EMBOSS] Match mass sequences for mass sequences
    Peter Rice 
    pmr at ebi.ac.uk
       
    Tue Nov 20 08:58:21 UTC 2007
    
    
  
JEEYOUNG LEE wrote:
> Dear Sir
> 
> I'm sorry for a perhaps naive question.
> I want to align 1000 pairs of sequences. For example, file "A"
> contains 1000 sequences and file "B" contains 1000 sequences, and the
> two files will be compared. I'd like to find a certain sequence (gene X)
> in file A that has high sequence similarity to some sequence (gene X')
> in file B. Likewise, a certain gene (Y) in file A will be matched with
> the gene Y' in file B that has high identity to it. Finally, I want to
> get the 1000 matched pairs and their identity scores. Can I match
> many sequences at one time using Jemboss? How can I handle this problem?

In EMBOSS 5.0.0 the wordfinder program is designed to do this. It uses a
word-based algorithm (a word is n consecutive identical bases) to find
candidate matches, and then aligns each one using a limited window size.
One warning: the alignment always includes the original word match, so
in low-identity cases it may not be the highest-scoring alignment
possible.

Wordfinder has additional options to select the matches you want.
Older EMBOSS releases had only supermatcher, which is less sophisticated
in selecting matches.
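
To illustrate the general idea of word-based matching described above,
here is a minimal Python sketch. It is not the wordfinder implementation:
the function names, the ungapped comparison, and the default word length
and window width are all assumptions made for illustration. It indexes
every word in the "B" set, looks up each word of every "A" sequence, and
scores an ungapped window around each word match by percent identity.
Because the scored region always contains the seed word, it shows the
same caveat: the reported match may not be the best possible alignment.

```python
from collections import defaultdict


def index_words(seqs, wordlen):
    """Map each word of length wordlen to the (name, offset) places it occurs."""
    index = defaultdict(list)
    for name, seq in seqs.items():
        for i in range(len(seq) - wordlen + 1):
            index[seq[i:i + wordlen]].append((name, i))
    return index


def diagonal_identity(a, b, a_pos, b_pos, wordlen, width):
    """Percent identity of an ungapped comparison around one word match.

    The region extends at most `width` positions to each side of the
    word and stays on the diagonal defined by the word match.
    """
    left = min(width, a_pos, b_pos)
    right = min(width, len(a) - a_pos - wordlen, len(b) - b_pos - wordlen)
    sub_a = a[a_pos - left:a_pos + wordlen + right]
    sub_b = b[b_pos - left:b_pos + wordlen + right]
    same = sum(x == y for x, y in zip(sub_a, sub_b))
    return 100.0 * same / len(sub_a)


def best_partners(set_a, set_b, wordlen=4, width=16):
    """For each sequence in set_a, return (best name in set_b, identity).

    A sequence that shares no word with any set_b sequence maps to None.
    """
    index = index_words(set_b, wordlen)
    result = {}
    for a_name, a_seq in set_a.items():
        best = None
        for i in range(len(a_seq) - wordlen + 1):
            for b_name, j in index.get(a_seq[i:i + wordlen], ()):
                score = diagonal_identity(
                    a_seq, set_b[b_name], i, j, wordlen, width
                )
                if best is None or score > best[1]:
                    best = (b_name, score)
        result[a_name] = best
    return result


# Toy data (hypothetical sequences, not taken from any real file).
set_a = {"X": "ACGTACGTTTGACC", "Y": "GGGCCCAAATTTGG"}
set_b = {"X_prime": "ACGTACGATTGACC", "Y_prime": "GGGCCCAAATTTGG"}

for name, match in best_partners(set_a, set_b).items():
    print(name, match)
```

For real data of this size the EMBOSS programs themselves are the better
choice, since they perform a proper alignment rather than the ungapped
comparison used in this sketch.
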
Hope that helps
Peter Rice
    
    