[BioPython] ambiguous alphabets and alignments

Karin Lagesen karin.lagesen at medisin.uio.no
Fri Nov 30 11:57:04 UTC 2007




Hello.

I have used Biopython on and off and found it very good. I have now,
however, encountered an odd problem that I hope you can help me with.

I am working with alignments, and I do this:


>>> from Bio import Clustalw
>>> from Bio.Align import AlignInfo
>>> from Bio.Alphabet import IUPAC
>>> alignment = Clustalw.parse_file("align16S/AE000511_16S.aln", alphabet=IUPAC.IUPACAmbiguousDNA)
>>> summary_aln = AlignInfo.SummaryInfo(alignment)
>>> pssm = summary_aln.pos_specific_score_matrix()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/local/python/lib/python2.4/site-packages/Bio/Align/AlignInfo.py", line 368, in pos_specific_score_matrix
  File "/usr/local/python/lib/python2.4/site-packages/Bio/Align/AlignInfo.py", line 111, in dumb_consensus
  File "/usr/local/python/lib/python2.4/site-packages/Bio/Align/AlignInfo.py", line 203, in _guess_consensus_alphabet
ValueError: Could not determine the type of alphabet.
>>>

Now, to test which alphabet I am dealing with, I use the check from SummaryInfo:

>>> from Bio import Alphabet
>>> from Bio.Alphabet import IUPAC
>>> from Bio.Seq import Seq
>>> isinstance(summary_aln.alignment._records[0].seq.alphabet.alphabet, Alphabet.DNAAlphabet)
False
>>> summary_aln.alignment._records[0].seq.alphabet.alphabet
<class Bio.Alphabet.IUPAC.IUPACAmbiguousDNA at 0x101d5360>
>>> 

However, when I check the Alphabet class:

class IUPACAmbiguousDNA(Alphabet.DNAAlphabet):
    letters = IUPACData.ambiguous_dna_letters

it seems that the alphabet I load the alignment with is a subclass
of DNAAlphabet, yet the isinstance check still fails.

I am pretty sure that this is somehow a misunderstanding on my side,
but I cannot figure this one out.
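
A possible lead, sketched with stand-in classes and not the real
Bio.Alphabet code: the repr printed above starts with "<class ...", which
suggests that the stored alphabet may be the class object itself and not
an instance of it. In plain Python, isinstance() on a class object against
its own base class is False, while issubclass() is True. The class names
below only mirror the shape of the Biopython ones, and the letters string
is a placeholder; whether this is the actual cause here is an assumption.

```python
# Stand-in classes (hypothetical); they only mimic the Bio.Alphabet hierarchy.
class DNAAlphabet(object):
    pass


class IUPACAmbiguousDNA(DNAAlphabet):
    letters = "GATCRYWSMKHBVDN"  # placeholder value for illustration


as_class = IUPACAmbiguousDNA       # the class object itself
as_instance = IUPACAmbiguousDNA()  # an instance of the class

# A class object is not an instance of its own base class ...
class_is_instance = isinstance(as_class, DNAAlphabet)

# ... but it is a subclass of it.
class_is_subclass = issubclass(as_class, DNAAlphabet)

# An instance of the subclass passes the isinstance check.
instance_is_instance = isinstance(as_instance, DNAAlphabet)

print(class_is_instance, class_is_subclass, instance_is_instance)
```

If that is what is happening, passing an instance of the alphabet instead
of the class to the parser might be worth trying; this has not been
verified against the Biopython version used above.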

Thank you for your help!

Karin
-- 
Karin Lagesen, PhD student
karin.lagesen at medisin.uio.no
http://folk.uio.no/karinlag


