[BioPython] Clustalw.parse_file errors

Peter biopython at maubp.freeserve.co.uk
Tue Aug 5 21:08:45 UTC 2008


On Tue, Aug 5, 2008 at 9:30 PM, Nick Matzke <matzke at berkeley.edu> wrote:
> Hi all,
>
> I'm running through the excellent biopython tutorial here:
> http://www.biopython.org/DIST/docs/tutorial/Tutorial.html#htoc100

I'm glad you are enjoying the Tutorial (apart from the parsing bug!).
I can't take any credit for this bit ;)

Seeing as you are trying to use the SummaryInfo class, I should
mention that in Biopython 1.47 this doesn't work very well with
generic alphabets.  In some cases this means you have to supply some
of the optional arguments like the characters to ignore (e.g. "-")
which might otherwise be inferred from the alphabet.  There have been
some changes in CVS to try and address this.

http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/biopython/Bio/Align/AlignInfo.py?cvsroot=biopython
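
As a rough illustration of why the characters to ignore matter, here is
a minimal sketch in plain Python.  It does not use Biopython or the
SummaryInfo class; the names column_counts and simple_consensus, and the
"X" used for ambiguous columns, are assumptions made up for this example.
It only shows how a column summary changes once "-" is skipped.

```python
from collections import Counter


def column_counts(rows, column, chars_to_ignore=("-",)):
    """Count the residues in one alignment column, skipping ignored characters."""
    return Counter(
        row[column] for row in rows if row[column] not in chars_to_ignore
    )


def simple_consensus(rows, chars_to_ignore=("-",), ambiguous="X"):
    """Majority-rule consensus; ties and empty columns become `ambiguous`."""
    consensus = []
    for col in range(len(rows[0])):
        counts = column_counts(rows, col, chars_to_ignore)
        top = counts.most_common(2)
        if not top:
            # Every character in this column was ignored.
            consensus.append(ambiguous)
        elif len(top) == 2 and top[0][1] == top[1][1]:
            # Two residues share the highest count.
            consensus.append(ambiguous)
        else:
            consensus.append(top[0][0])
    return "".join(consensus)


# Hypothetical three-row alignment, all rows the same length.
rows = ["MK-LV", "MKALV", "MR-LI"]

print(simple_consensus(rows))                      # gaps skipped
print(simple_consensus(rows, chars_to_ignore=()))  # gaps counted as residues
```

With the gap character ignored, the third column is decided by the one
real residue in it; with nothing ignored, the gap wins that column.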

> I've hit an error here:
> 9.4.2  Creating your own substitution matrix from an alignment
> http://www.biopython.org/DIST/docs/tutorial/Tutorial.html#htoc100
>
> ...basically the Clustalw parser won't parse even the given example
> alignment file (protein.aln) or another example file from elsewhere
> (example.aln).

The good news is I've just checked protein.aln on my machine, and it
can be parsed fine.  This is using the CVS version of Biopython, but
probably just updating the file .../Bio/AlignIO/ClustalIO.py as I
suggested in my earlier email will fix this too.

I've realised that our unit tests didn't include the example file
protein.aln, otherwise we would have caught this earlier (when I made
the ClustalW parsing change).  It's a bit late now, but I have just added
protein.aln to the alignment parsing unit test for future validation.
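
The idea of checking every example file in a parsing test can be
sketched roughly as follows.  This is not the Biopython test suite:
read_clustal is a hypothetical stand-in parser written only for this
example, and EXAMPLE is a made-up alignment rather than protein.aln.  A
real test would call the actual parser on the actual example files.

```python
import io
import unittest


def read_clustal(handle):
    """Hypothetical stand-in parser: return {sequence id: sequence}."""
    header = handle.readline()
    if not header.startswith("CLUSTAL"):
        raise ValueError("not a Clustal alignment")
    seqs = {}
    for line in handle:
        # Skip blank lines and consensus lines (which start with whitespace).
        if not line.strip() or line[0].isspace():
            continue
        fields = line.split()
        if len(fields) < 2:
            raise ValueError("malformed line: %r" % line)
        # Blocks repeat the ids, so append each chunk to what we have so far.
        seqs[fields[0]] = seqs.get(fields[0], "") + fields[1]
    return seqs


# Made-up two-block alignment used in place of a real example file.
EXAMPLE = """CLUSTAL W (1.83) multiple sequence alignment

seq1    MKALV-T
seq2    MKALVGT
        ***** *

seq1    AA
seq2    AC
        *
"""


class ExampleFilesTest(unittest.TestCase):
    # name -> (file contents, expected sequence count, expected length)
    EXAMPLES = {"example": (EXAMPLE, 2, 9)}

    def test_examples_parse(self):
        for name, (text, count, length) in self.EXAMPLES.items():
            seqs = read_clustal(io.StringIO(text))
            self.assertEqual(len(seqs), count, name)
            # Every row of an alignment should have the same length.
            self.assertEqual({len(s) for s in seqs.values()}, {length}, name)
```

Saved as a module, this can be run with "python -m unittest".  Adding a
new example file to the test is then one more entry in EXAMPLES.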

Peter


