[Bioperl-l] Generate a hsp from emboss format?

Peter Rice p.rice at imperial.ac.uk
Tue Mar 19 06:51:52 EDT 2013


On 18/03/2013 12:00, Peter Cock wrote:
> On Mon, Mar 18, 2013 at 11:43 AM, Antony03 <antony.vincent.1 at ulaval.ca> wrote:
>>
>> Hi,
>> I have a script that uses EMBOSS (water) for a dynamic programming alignment.
>> Is it possible to convert my result from EMBOSS format into an HSP-like
>> (BLAST) format?
>>
>> I think I need an HSP format because it makes it much easier to parse only
>> one of the two strands in the alignment, which is what I need.
>>
>> Thanks!
>
> EMBOSS water (and related tools like needle) support a range of output
> formats - have you looked at those yet?

The full list of alignment formats in the latest EMBOSS release is below.
They are selected with the -aformat qualifier in any application that has
an "align" output.
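
As a rough illustration of selecting a format with -aformat, the sketch
below builds a water command line in Python. The file names and the helper
name water_command are placeholders invented for this example; the
qualifiers are the usual water ones, and the gap penalties shown are only
example values. It prints the command rather than running it.

```python
import shlex


def water_command(asequence, bsequence, outfile, aformat="markx10",
                  gapopen=10.0, gapextend=0.5):
    """Build an argv list for EMBOSS water with a chosen alignment format.

    Hypothetical helper for illustration; it only assembles the arguments.
    """
    return [
        "water",
        "-asequence", asequence,
        "-bsequence", bsequence,
        "-gapopen", str(gapopen),
        "-gapextend", str(gapextend),
        "-aformat", aformat,   # any name from the format list, e.g. "pair"
        "-outfile", outfile,
    ]


# Placeholder file names, not taken from the original question.
cmd = water_command("query.fasta", "subject.fasta", "out.markx10")
print(shlex.join(cmd))

# To actually run it (requires EMBOSS on the PATH):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Changing the aformat argument to another name from the list is all that is
needed to switch output formats.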

It would also be relatively simple to add new formats for the next 
release if they would be useful.

Multiple sequence formats:

"fasta",     "Fasta format sequence",
"msf",       "MSF format sequence",
"clustal",   "clustalw format sequence",
"mega",      "Mega format sequence",
"meganon",   "Mega non-interleaved format sequence",
"nexus",     "nexus/paup format sequence",
"nexusnon",  "nexus/paup non-interleaved format sequence",
"phylip",    "phylip format sequence",
"phylipnon", "phylip non-interleaved format sequence",
"selex",     "SELEX format sequence",
"treecon",   "Treecon format sequence",

Multiple alignment formats:

"markx0",    "Pearson MARKX0 format",
"markx1",    "Pearson MARKX1 format",
"markx2",    "Pearson MARKX2 format",
"markx3",    "Pearson MARKX3 format",
"markx10",   "Pearson MARKX10 format",
"match",     "Start and end of matches between sequence pairs",
"multiple",  "Simple multiple alignment",
"pair",      "Simple pairwise alignment",
"simple",    "Simple multiple alignment",
"sam",       "Sequence alignment/map (SAM) format",
"score",     "Score values for pairs of sequences",
"srs",       "Simple multiple sequence format for SRS",
"srspair",   "Simple pairwise sequence format for SRS",
"tcoffee",   "TCOFFEE program format",
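
For the original goal of parsing only one of the two aligned sequences, one
option is a sequence format such as "fasta". The sketch below is a minimal
reader, assuming the output holds one gapped FASTA record per aligned
sequence with "-" as the gap character. The function name read_fasta and the
sample text are made up for illustration and are not real water output.

```python
def read_fasta(text):
    """Return a list of (name, sequence) tuples from FASTA-formatted text."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records.append((name, "".join(chunks)))
            header = line[1:].split()
            name = header[0] if header else ""
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


# Invented sample data standing in for an alignment written as FASTA.
example = ">query\nACGT-ACGT\n>subject\nACGTTACGT\n"
records = read_fasta(example)

# Keep only the first aligned sequence, with and without gap characters.
first_name, first_gapped = records[0]
first_ungapped = first_gapped.replace("-", "")
print(first_name, first_gapped, first_ungapped)
```

Note that a plain FASTA record carries no alignment coordinates or scores,
so formats such as "markx10" or "match" may be a better fit when those are
needed.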

regards,

Peter Rice
EMBOSS Team


