[Biopython-dev] Clustal alignment format header line

Cymon Cox cy at cymon.org
Tue May 12 07:07:59 EDT 2009


Both Muscle (-clw) and ProbCons (-clustalw) output a programme-specific
header line for Clustal-format alignments:

"MUSCLE (3.7) multiple sequence alignment


AK1H_ECOLI/1-378      CPDSINAALICRGEKMSIAIMAGVLEAR etc"

"PROBCONS version 1.12 multiple sequence alignment

AK1H_ECOLI/1-378    CPDSINAALICRGEKMSIAIMA

"

Bio.AlignIO will not read these alignments, because of this check at
Bio/AlignIO/ClustalIO.py:94:

    if line[:7] != 'CLUSTAL':
        raise ValueError("Did not find CLUSTAL header")

Muscle does have a -clwstrict flag, but ProbCons doesn't.

Would it be a good idea to relax the header parsing?
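
To make the question concrete, a relaxed check might look roughly like the
sketch below. This is only an illustration: the helper name and the tuple of
accepted prefixes are hypothetical, not existing Biopython code, and the
prefixes are taken from the two header lines quoted above.

```python
# Hypothetical helper (not part of Bio.AlignIO): accept the standard
# CLUSTAL header as well as the MUSCLE and PROBCONS variants shown above.
KNOWN_HEADERS = ("CLUSTAL", "MUSCLE", "PROBCONS")


def is_clustal_like_header(line):
    """Return True if the line starts with one of the known programme names."""
    # str.startswith accepts a tuple and matches any of its entries.
    return line.strip().startswith(KNOWN_HEADERS)
```

The existing test on line[:7] could then be replaced by a call to something
like this, still raising ValueError when no known header is found.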

C.
--

