[BioPython] Clustalw.parse_file errors

Nick Matzke matzke at berkeley.edu
Tue Aug 5 20:30:16 UTC 2008


Hi all,

I'm working through the excellent Biopython tutorial and have hit an 
error in section 9.4.2, "Creating your own substitution matrix from an 
alignment":
http://www.biopython.org/DIST/docs/tutorial/Tutorial.html#htoc100

Basically, the Clustalw parser fails on both the tutorial's own example 
alignment file (protein.aln) and an example file from elsewhere 
(example.aln).


============
from Bio import Clustalw
from Bio.Alphabet import IUPAC
from Bio.Align import AlignInfo

# get an alignment object from a Clustalw alignment output
c_align = Clustalw.parse_file("protein.aln", IUPAC.protein)
summary_align = AlignInfo.SummaryInfo(c_align)
============
This code fails with the given protein.aln file. The error message is:

Traceback (most recent call last):
   File "biopython_alignments.py", line 163, in ?
     c_align = Clustalw.parse_file(protein_align_file, IUPAC.protein)
   File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/Bio/Clustalw/__init__.py", line 60, in parse_file
     clustal_alignment._star_info = generic_alignment._star_info
AttributeError: Alignment instance has no attribute '_star_info'
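
To illustrate what the traceback above is saying: line 60 reads an attribute named _star_info from the generic alignment object, and that object apparently never had the attribute set. The sketch below reproduces the same failure mode with a hypothetical stand-in class (it is not Biopython's Alignment class, and copy_star_info is an invented helper name), and shows the usual defensive pattern of reading the attribute with a default. It is written for Python 3 and is only an illustration, not a patch for Bio/Clustalw/__init__.py.

```python
class Alignment(object):
    """Hypothetical stand-in for illustration; NOT Biopython's class."""

    def __init__(self):
        # Note: no _star_info attribute is ever set here.
        self.records = []


def copy_star_info(source, target):
    """Copy _star_info if present, falling back to an empty string.

    Plain attribute access (source._star_info) raises AttributeError
    when the attribute was never set; getattr with a default does not.
    """
    target._star_info = getattr(source, "_star_info", "")
    return target


generic = Alignment()

# Direct access fails in the same way as the traceback.
try:
    generic._star_info
except AttributeError as err:
    message = str(err)

# The defensive version succeeds and stores the default value.
clustal = copy_star_info(generic, Alignment())
print(message)
print(repr(clustal._star_info))
```

Whether an empty default is the right behaviour for the real parser is a separate question; the sketch only shows why the attribute lookup fails.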


It also doesn't work with the example.aln file here:
http://www.pasteur.fr/recherche/unites/sis/formation/python/ch11s06.html
http://www.pasteur.fr/recherche/unites/sis/formation/python/data/example.aln

...but throws a different error:

code:

c_align = Clustalw.parse_file('example.aln', alphabet=IUPAC.protein)
=================================
Traceback (most recent call last):
   File "biopython_alignments.py", line 174, in ?
     c_align = Clustalw.parse_file('example.aln', alphabet=IUPAC.protein)
   File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/Bio/Clustalw/__init__.py", line 47, in parse_file
     generic_alignment = AlignIO.read(handle, "clustal")
   File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/Bio/AlignIO/__init__.py", line 299, in read
     first = iterator.next()
   File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/Bio/AlignIO/ClustalIO.py", line 169, in next
     raise ValueError("Could not parse line:\n%s" % line)
     raise ValueError("Could not parse line:\n%s" % line)
ValueError: Could not parse line:
                                            *:::**:.**.** *.*** .:* *:*******
====================
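
The line quoted in the ValueError consists only of Clustal conservation symbols ("*", ":", "." and spaces). As a diagnostic aid, here is a minimal sketch that lists every such line in an alignment file's text together with its line number, so its position and width can be compared with the sequence rows around it. The function name find_consensus_lines and the tiny sample alignment are made up for illustration; this is plain Python 3, not part of Biopython, and it makes no claim about why the parser rejects the line.

```python
CONSENSUS_CHARS = set("*:. ")


def find_consensus_lines(text):
    """Return (line_number, line) pairs for conservation-only lines.

    A line qualifies if it is non-blank and every character is one of
    the Clustal conservation symbols or a space.
    """
    hits = []
    for number, line in enumerate(text.splitlines(), 1):
        stripped = line.strip()
        if stripped and set(stripped) <= CONSENSUS_CHARS:
            hits.append((number, line))
    return hits


# Made-up sample data, only to exercise the function.
sample = (
    "CLUSTAL W (1.83) multiple sequence alignment\n"
    "\n"
    "seq1            MKVLAAGIVG\n"
    "seq2            MKVLSAGIVG\n"
    "                ****:*****\n"
)

hits = find_consensus_lines(sample)
for number, line in hits:
    print(number, len(line), repr(line))
```

For a real file, the text could be obtained with open("example.aln").read() and passed to the same function.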


I am running:
wright:/bioinformatics/pyeg nick$ py -V
Python 2.4.4

...and a copy of Biopython installed just last week.
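
Since "installed just last week" does not pin down a release, a small sketch like the following could be used to report the exact versions alongside a bug report. It is written for Python 3, and it assumes only that the Bio package may or may not expose a __version__ attribute in a given release, which is why the attribute is read with getattr and a fallback. The function name environment_report is invented for this example.

```python
import sys


def environment_report():
    """Return a short text report of the Python and Biopython versions."""
    lines = ["Python " + sys.version.split()[0]]
    try:
        import Bio
    except ImportError:
        lines.append("Biopython: not importable")
    else:
        # __version__ may be absent in some releases, so use a fallback.
        version = getattr(Bio, "__version__", "(no __version__ attribute)")
        lines.append("Biopython " + str(version))
    return "\n".join(lines)


report = environment_report()
print(report)
```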

Any help appreciated, since I will have to use this module soon!
Nick





