[Biopython] newbie question: sequence parsing

Nat Echols nathaniel.echols at gmail.com
Tue Oct 18 18:08:03 UTC 2011


Greetings--

We have started using Biopython in our (non-bioinformatics) application and
are investigating the possibility of replacing our existing (custom-made)
sequence parsers.  Two quick questions:

1) Is there a sequence parser that works with just a simple string, without
any header or additional metadata?  If not, how could we write one that
results in the same basic object as those in Bio.SeqIO?  (The parsing itself
is easy; I just want the API to be consistent regardless of format.)

2) Is there a single function that will take a file (and/or string) of
unknown format and try the different parsers until it finds one that works?
We currently use several different formats (raw string, FASTA, PIR, and
possibly others), and we try not to rely on the file extension alone to
determine the type.  We already have something that does this using our
parsers, which could be refactored to use Bio.SeqIO instead, but if
Biopython has something similar I'd rather use that.
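
To make both questions concrete, the behaviour we have in mind is roughly
the sketch below.  Every name in it (Record, parse_fasta, parse_raw,
guess_and_parse) is made up for illustration and is not Biopython API; the
two parsers are deliberately simplistic stand-ins, and Record is only a
placeholder for whatever object Bio.SeqIO returns.

```python
from collections import namedtuple
from io import StringIO

# Hypothetical stand-in for a Bio.SeqIO record; NOT Biopython's SeqRecord.
Record = namedtuple("Record", ["id", "seq"])


def parse_fasta(handle):
    """Minimal FASTA reader; raises ValueError if the input is not FASTA."""
    records, name, chunks = [], None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records.append(Record(name, "".join(chunks)))
            fields = line[1:].split()
            name = fields[0] if fields else ""
            chunks = []
        elif name is None:
            raise ValueError("sequence data before first '>' header")
        else:
            chunks.append(line)
    if name is None:
        raise ValueError("no FASTA header found")
    records.append(Record(name, "".join(chunks)))
    return records


def parse_raw(handle):
    """Treat the whole input as one bare sequence with no header."""
    seq = "".join(handle.read().split())
    # Assumption: a bare sequence contains letters only (no gaps or '*').
    if not seq or not seq.isalpha():
        raise ValueError("not a plain sequence string")
    return [Record("unknown", seq)]


# Order matters: strictest format first, bare string as the last resort.
PARSERS = [("fasta", parse_fasta), ("raw", parse_raw)]


def guess_and_parse(text, parsers=PARSERS):
    """Try each parser in turn; return (format_name, records) on first success."""
    for name, parser in parsers:
        try:
            records = parser(StringIO(text))
        except ValueError:
            continue
        if records:
            return name, records
    raise ValueError("no parser recognised the input")
```

Calling guess_and_parse("ACGT\nTTGA\n") falls through the FASTA parser and
returns the raw-string result, while input starting with a ">" header is
picked up as FASTA.  Ideally each entry in PARSERS would instead wrap a call
such as Bio.SeqIO.parse(handle, "fasta"), but we have not checked how those
parsers behave on input in the wrong format, so that part is an assumption.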

thanks,
Nat


