[Biopython] Error with new biopython with pos_specific_score_matrix()

Peter Cock p.j.a.cock at googlemail.com
Fri May 22 10:12:00 UTC 2020


You should avoid setting private attributes; the AlignIO.read function accepts
an alphabet argument, so this:

from Bio import AlignIO
from Bio.Alphabet import ProteinAlphabet
align = AlignIO.read('samples/cas9align.fasta', 'fasta')
align._alphabet = ProteinAlphabet()

becomes:

from Bio import AlignIO
from Bio.Alphabet import ProteinAlphabet
align = AlignIO.read('samples/cas9align.fasta', 'fasta',
                     alphabet=ProteinAlphabet())

Also I would suggest using the generic protein instance as per the tutorial
examples, which could make the planned alphabet simplification/removal this
year easier:

from Bio import AlignIO
from Bio.Alphabet import generic_protein
align = AlignIO.read('samples/cas9align.fasta', 'fasta',
                     alphabet=generic_protein)
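
As a rough illustration of why setting the private attribute is not enough,
here is a minimal sketch of the isinstance test shown in the traceback. It
assumes that records read without an alphabet argument keep the default
single-letter alphabet. The three classes are simplified stand-ins for the
Bio.Alphabet hierarchy, and check_compatible is a hypothetical helper, not
Biopython code:

```python
# Stand-in classes mirroring the inheritance chain of Bio.Alphabet
# (hypothetical stubs, not the real Biopython classes).
class Alphabet:
    pass


class SingleLetterAlphabet(Alphabet):
    pass


class ProteinAlphabet(SingleLetterAlphabet):
    pass


def check_compatible(alignment_alphabet, record_alphabets):
    """Mimic the isinstance test from the traceback (hypothetical helper)."""
    for alt in record_alphabets:
        # Each record's alphabet must be an instance of the alignment's class.
        if not isinstance(alt, alignment_alphabet.__class__):
            raise ValueError(
                "Alignment contains a sequence with an incompatible alphabet."
            )
    return True


# Case 1: only the alignment's private attribute was changed, so the
# records are assumed to still carry the default single-letter alphabet.
try:
    check_compatible(ProteinAlphabet(), [SingleLetterAlphabet()] * 3)
    patched_only_alignment = "ok"
except ValueError as err:
    patched_only_alignment = str(err)

# Case 2: the alphabet was given at read time, so alignment and records agree.
passed_at_read_time = check_compatible(ProteinAlphabet(), [ProteinAlphabet()] * 3)
```

In the first case the check fails because a single-letter alphabet is not an
instance of the more specific protein class; in the second, every record
matches the alignment, so the check passes.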

If that alone does not solve it, could you share the test FASTA file or
reproduce this with one of the examples included with Biopython?

Regards,

Peter

On Thu, May 21, 2020 at 11:02 PM Sebastian Bassi <sbassi at gmail.com> wrote:

> I have some code that used to work, I think in 1.52 or so.
> This is the code I am testing now in 1.76:
>
> from Bio import AlignIO
> from Bio.Align.AlignInfo import SummaryInfo
> from Bio.Alphabet import ProteinAlphabet
>
> align = AlignIO.read('samples/cas9align.fasta', 'fasta')
> align._alphabet = ProteinAlphabet()
> summary = SummaryInfo(align)
> summary.pos_specific_score_matrix()
>
> I get
>
> /usr/local/lib/python3.6/dist-packages/Bio/Align/AlignInfo.py in _guess_consensus_alphabet(self, ambiguous)
>     196             if not isinstance(alt, a.__class__):
>     197                 raise ValueError("Alignment contains a sequence with \
> --> 198                                 an incompatible alphabet.")
>     199
>     200         # Check the ambiguous character we are going to use in the consensus
>
> ValueError: Alignment contains a sequence with an incompatible alphabet.
>
>
>
> The alignment alphabet is protein, so I don't know what is wrong with it.
> This is the alignment file I am using:
> https://github.com/Serulab/Py4Bio/blob/master/samples/cas9align.fasta
>
> I will appreciate any help.
>
> Best,
> SB
>
>
>
> _______________________________________________
> Biopython mailing list  -  Biopython at mailman.open-bio.org
> https://mailman.open-bio.org/mailman/listinfo/biopython