[EMBOSS] how to find unique DNA sequences from a large database
    yun zheng 
    mincloud at gmail.com
       
    Thu Dec  7 20:36:03 UTC 2006
    
    
  
Hi,
Are there any tools for finding unique sequences in a large database? Many
thanks.
I need to find the unique DNA sequences in a large database. A short sample
is given below.
>001
aaaagttgtgtgtgtatgacaggtt
>013
aacctgtcatacacacacaactttt
>289
gttgtgtgtgtatgacaggtt
>375
tgtgtgtatgacaggttgat
>319
tcaacctgtcatacacaca
>177
cgcagtgtgtgtatgacagg
>271
gtcctacctgtcatacacac
>020
aagacataatgtgtgtatgacag
All of these seem to be the same sequence, since BLASTN reports very small
E-values when 001 is aligned against each of them (output below).
BLASTN 2.2.8 [Jan-05-2004]
Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs",  Nucleic Acids Res. 25:3389-3402.
Query= 001
         (25 letters)
Database: drought-clustered.fa
           410 sequences; 8877 total letters
Searching.done
                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value

013                                                                    50   8e-11
001                                                                    50   8e-11
289                                                                    42   2e-08
375                                                                    34   5e-06
319                                                                    34   5e-06
177                                                                    32   2e-05
271                                                                    30   8e-05
020                                                                    28   3e-04
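
A minimal Python sketch of the kind of grouping meant here, under stated assumptions: two sequences are treated as "the same" if they share at least one exact k-mer on either strand, and groups are merged transitively (single linkage). The function names and the threshold k=12 are illustrative choices, not part of EMBOSS or BLAST, and this is not how BLASTN scores alignments. It only handles exact shared words, so mismatches and gaps are ignored.

```python
from collections import defaultdict

# Translation table used to complement DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical_kmers(seq, k):
    """Return the set of strand-independent k-mers of seq.

    Each k-mer is represented by the lexicographically smaller of
    itself and its reverse complement, so both strands map to one key.
    """
    seq = seq.lower()
    result = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        result.add(min(kmer, reverse_complement(kmer)))
    return result


def cluster(records, k=12):
    """Group sequence names that share a canonical k-mer (single linkage).

    records: dict mapping sequence name -> sequence string.
    k: word size; 12 is an arbitrary assumption, not a recommended value.
    Returns a sorted list of sorted name lists, one list per group.
    """
    parent = {name: name for name in records}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_seen = {}  # canonical k-mer -> first sequence name containing it
    for name, seq in records.items():
        for kmer in canonical_kmers(seq, k):
            if kmer in first_seen:
                parent[find(name)] = find(first_seen[kmer])
            else:
                first_seen[kmer] = name

    groups = defaultdict(list)
    for name in records:
        groups[find(name)].append(name)
    return sorted(sorted(members) for members in groups.values())


if __name__ == "__main__":
    sample = {
        "001": "aaaagttgtgtgtgtatgacaggtt",
        "013": "aacctgtcatacacacacaactttt",
        "375": "tgtgtgtatgacaggttgat",
    }
    print(cluster(sample, k=12))
```

With k=12, the eight sequences listed above should all fall into a single group, which agrees with the BLASTN hits. Single linkage can chain together sequences that are only indirectly related, and the k-mer index is held in memory, so for a really large database this is only a starting point.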
Best regards,
Zheng, Yun
Department of Computer Science
Washington University in St Louis
Campus Box 1045
1 Brookings Drive, St Louis, MO 63130
    
    