[EMBOSS] seqret options
    Derek Gatherer 
    d.gatherer at vir.gla.ac.uk
       
    Wed Jun 15 10:31:33 UTC 2005
    
    
  
Dear EMBOSSers
I'm trying to write a pipeline to take a load of paired, aligned homologues 
from 2 species and submit them sequentially to the yn00 application from 
the well-known PAML package.  PAML's applications all take PHYLIP 
format.  I can easily make this by looping over:
seqret -auto -osformat phylip infile -out outfile
However, PAML requires that the flag "I" be placed on the top line of the 
phylip format to indicate interleaved, e.g.:
  2 663 I
c-barf1  ATGGCCAGGC TTTTCGCTCA GCTGCTCCTG CTCGCGGGCT CCGTCGCCTC
barf1     ATGGCCAGGT TCATCGCTCA GCTCCTCCTG TTGGCCTCCT GTGTGGCCGC
           CTGCCTGGCC GTCACCGCCT TTGTGGGTGA GCGGGCCGTC CTGAGTTCCT
           CGGCCAGGCT GTCACCGCTT TCTTGGGTGA GCGAGTCACC CTGACCTCCT
rather than the standard phylip format, given by seqret:
  2 663
c-barf1   ATGGCCAGGC TTTTCGCTCA GCTGCTCCTG CTCGCGGGCT CCGTCGCCTC
barf1     ATGGCCAGGT TCATCGCTCA GCTCCTCCTG TTGGCCTCCT GTGTGGCCGC
           CTGCCTGGCC GTCACCGCCT TTGTGGGTGA GCGGGCCGTC CTGAGTTCCT
           CGGCCAGGCT GTCACCGCTT TCTTGGGTGA GCGAGTCACC CTGACCTCCT
I could write a script to open each seqret output file and add this 
character to the top line of each, but before I dive into this, I'd like to 
know if there is any flag I can add to seqret to get the "I" added 
automatically.
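For reference, the script route is short. The following is only a minimal sketch in Python, not a seqret feature: it appends "I" to the first line of a PHYLIP file unless the flag is already there. The names `add_interleaved_flag` and `patch_file` are made up for illustration, and the sketch assumes the first line holds the sequence count and alignment length, as in the seqret output above.

```python
from pathlib import Path


def add_interleaved_flag(text: str) -> str:
    """Return PHYLIP text whose header line carries the 'I' flag."""
    # Split off the header (first line); the alignment itself is left untouched.
    header, newline, rest = text.partition("\n")
    fields = header.split()
    # Assumption: a valid header starts with two integers (count, length).
    if len(fields) < 2 or not (fields[0].isdigit() and fields[1].isdigit()):
        raise ValueError("first line does not look like a PHYLIP header")
    # Only append the flag if it is not already present.
    if "I" not in fields[2:]:
        header = header.rstrip() + " I"
    return header + newline + rest


def patch_file(path: Path) -> None:
    """Rewrite one seqret output file in place with the 'I' flag added."""
    path.write_text(add_interleaved_flag(path.read_text()))
```

Calling `patch_file` on each outfile after the seqret loop would do the job. Whether PAML is sensitive to the exact spacing of the header line has not been checked here.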
Failing that, PAML takes the other, non-interleaved phylip format 
("sequential") by default, and that would not require any flag 
insertion.  Seqret also can produce this (using -osformat phylip3):
1 663 YF
c-barf1 ATGGCCAGGC TTTTCGCTCA GCTGCTCCTG CTCGCGGGCT CCGTCGCCTC
           CTGCCTGGCC GTCACCGCCT TTGTGGGTGA GCGGGCCGTC CTGAGTTCCT
           ACTGGAAGAG GGTGAGCCTA GGGCCCGAGA TCATGGTGGA ATGGTTCAAA
but then PAML won't read it because it doesn't like the YF flags inserted 
by seqret!!
So I either have to write a script to remove the flags from the sequential 
format or to insert the flag into the interleaved format, unless seqret has a 
solution.
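The flag-removal option for the sequential output could be sketched along the following lines in Python. This is an illustration under assumptions, not a tested PAML workflow: the name `strip_header_flags` is hypothetical, the header is assumed to start with the sequence count and length, and everything after those two numbers on the first line (such as "YF") is simply dropped.

```python
def strip_header_flags(text: str) -> str:
    """Return PHYLIP text with any option flags removed from the header line."""
    # Only the first line is modified; sequence lines pass through unchanged.
    header, newline, rest = text.partition("\n")
    fields = header.split()
    # Assumption: a valid header starts with two integers (count, length).
    if len(fields) < 2 or not (fields[0].isdigit() and fields[1].isdigit()):
        raise ValueError("first line does not look like a PHYLIP header")
    # Preserve whatever leading whitespace the original header had.
    indent = header[: len(header) - len(header.lstrip())]
    return f"{indent}{fields[0]} {fields[1]}" + newline + rest
```

Applied to the phylip3 example above, this turns the first line "1 663 YF" into "1 663". Whether PAML then reads the rest of the file correctly would still need checking.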
All assistance gratefully appreciated
Derek
    
    