[BioPython] EMBOSS programs and their alignment formats

Peter (BioPython) biopython at maubp.freeserve.co.uk
Tue Mar 21 12:30:16 UTC 2006


I've been having a look at BioPython's Emboss support and it looks like 
a (partial) set of command line interfaces to the tools, with additional 
code for some of the primer tools and their formats.

As far as I can tell, there is no support for any of the EMBOSS 
alignment output formats:

http://emboss.sourceforge.net/docs/themes/AlignFormats.html

Some (all?) of the alignment programs will happily produce gapped FASTA 
output, but this drops other information such as the alignment score.  
The alignments themselves could still be analysed to extract the 
alignment length, identity, similarity and gap counts.
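
As a rough illustration of that kind of analysis, here is a minimal 
sketch in plain Python for a single pair of gapped sequences. The 
function name pairwise_stats and the choice of "-" and "." as gap 
characters are my own assumptions, not anything in BioPython or EMBOSS. 
Similarity is left out, since as far as I can tell it would also need 
the scoring matrix.

```python
def pairwise_stats(seq_a, seq_b, gap_chars="-."):
    """Count length, identical columns and gapped columns for two aligned strings.

    Hypothetical helper for illustration; gap_chars is an assumption.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    identity = 0
    gaps = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in gap_chars or b in gap_chars:
            # A column counts as gapped if either sequence has a gap there.
            gaps += 1
        elif a == b:
            identity += 1
    return {"length": len(seq_a), "identity": identity, "gaps": gaps}


# Made-up example sequences, already aligned (as read from gapped FASTA).
stats = pairwise_stats("ACDE-GHIK", "ACDEFG-LK")
print(stats)
```

The gapped strings themselves would come from whatever FASTA reader is 
already in use; only the column counting is shown here.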

However, the FASTA format includes neither the algorithm-specific score 
nor other program parameters which might be of interest (like the matrix 
and gap penalties).

e.g. the header from one of the EMBOSS alignment formats:

########################################
# Program:  demoalign
# Rundate:  Thu Jan 17 09:30:08 2002
# Report_file: stdout
########################################
#=======================================
#
# Aligned_sequences: 4
# 1: IXI_234
# 2: IXI_235
# 3: IXI_236
# 4: IXI_237
# Matrix: EBLOSUM62
# Gap_penalty: 9
# Extend_penalty: -1
#
# Length: 131
# Identity:      95/131 (72.5%)
# Similarity:   127/131 (96.9%)
# Gaps:          25/131 (19.1%)
#
#
#=======================================

(followed by the aligned sequences)
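
To give an idea of what reading such a header might involve, here is a 
minimal sketch that pulls the "# Key: value" lines into a dictionary 
and the numbered lines into a list of sequence names. It is based only 
on the example above; the name parse_report_header and the regular 
expressions are my own assumptions, and other EMBOSS programs or 
formats may well lay things out differently.

```python
import re

# Assumed layout, taken from the example header only.
FIELD = re.compile(r"^#\s*([A-Za-z_]+):\s*(.*?)\s*$")   # e.g. "# Matrix: EBLOSUM62"
SEQ_NAME = re.compile(r"^#\s*(\d+):\s*(\S+)\s*$")       # e.g. "# 1: IXI_234"


def parse_report_header(lines):
    """Return (fields, names) from the '#' comment lines of a report header."""
    fields = {}
    names = []
    for line in lines:
        if not line.startswith("#"):
            continue  # not part of the header
        seq = SEQ_NAME.match(line)
        if seq:
            names.append(seq.group(2))
            continue
        field = FIELD.match(line)
        if field:
            fields[field.group(1)] = field.group(2)
    return fields, names


header = """\
########################################
# Program:  demoalign
# Rundate:  Thu Jan 17 09:30:08 2002
# Report_file: stdout
########################################
#=======================================
#
# Aligned_sequences: 4
# 1: IXI_234
# 2: IXI_235
# 3: IXI_236
# 4: IXI_237
# Matrix: EBLOSUM62
# Gap_penalty: 9
# Extend_penalty: -1
#
# Length: 131
# Identity:      95/131 (72.5%)
# Similarity:   127/131 (96.9%)
# Gaps:          25/131 (19.1%)
#
#=======================================
"""

fields, names = parse_report_header(header.splitlines())
print(fields["Matrix"], fields["Identity"], names)
```

All values are kept as strings here; turning "95/131 (72.5%)" into 
numbers would be a further step.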

Has anyone tackled supporting these files in BioPython?

Thanks

Peter



