[Biopython] Which Alphabet type should I use with FASTA files in Biopython?

Peter Cock p.j.a.cock at googlemail.com
Mon Mar 18 10:37:52 UTC 2013


On Mon, Mar 18, 2013 at 1:15 AM, Sameer Farooqui <sameer at blueplastic.com> wrote:
> If I'm using the FASTA files from the link below, what Alphabet type should
> I use in Biopython? Would it be IUPAC.unambiguous_dna?
>
> link to FASTA files:
> http://hgdownload.cse.ucsc.edu/goldenPath/hg19/chromosomes/?C=S;O=A
>

Right now I would suggest using generic_dna, as in:

from Bio.Alphabet import generic_dna

That doesn't give an explicit list of expected letters, unlike the IUPAC
alphabets, which do (upper case only). This is an area of Biopython
likely to change in future releases, to try to check the letters in a
sequence against the white-list of its alphabet.
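
To make the point about an explicit list of letters concrete, here is a
minimal sketch in plain Python. It is not Biopython's own API: the helper
name unexpected_letters and the example fragment are made up for
illustration. The only Biopython fact assumed is that the IUPAC
unambiguous DNA alphabet lists the letters "GATC". As far as I know the
UCSC chromosome files use N for gaps and lower case for soft-masked
repeats, which is the kind of content a strict upper case white-list
would reject.

```python
# Letters listed by Biopython's IUPAC unambiguous DNA alphabet (upper case only).
UNAMBIGUOUS_DNA = "GATC"


def unexpected_letters(sequence, letters):
    """Return the set of characters in sequence that are not in letters.

    Hypothetical helper for illustration, not part of Biopython.
    """
    return set(sequence) - set(letters)


# Made-up fragment (not taken from hg19): a gap run, lower case, upper case.
fragment = "NNNNacgtACGT"

# An empty result would mean the fragment fits the white-list.
print(sorted(unexpected_letters(fragment, UNAMBIGUOUS_DNA)))
```

With generic_dna there is no such list to check against, so mixed case
and N are accepted as they are.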

Peter

P.S. Duplicate post here: http://www.biostars.org/p/66687/


