[Biopython-dev] [Bug 2382] Generic FASTA parser
bugzilla-daemon at portal.open-bio.org
Tue Oct 16 22:17:28 EDT 2007
http://bugzilla.open-bio.org/show_bug.cgi?id=2382
------- Comment #9 from mdehoon at ims.u-tokyo.ac.jp 2007-10-16 22:17 EST -------
If all these special FASTA files are coming from Roche Diagnostics, I'd suggest
creating a separate module rather than trying to put this in Bio.SeqIO. Bio.SeqIO
is one of the few modules in Biopython that is used by most users, so I'd like to
keep it as clean as possible. To avoid confusion for users who just want
to parse regular FASTA files, I think the module should not be called
Bio.Fasta. In addition, I doubt we'd get much code reuse from a generic
Bio.Fasta module beyond what is needed for the Roche files, since the only
thing these formats have in common is that they use ">" to separate records.
With a separate module to handle the Roche files, my preferred usage would be
something like this:
from Bio import SeqIO, GSFlex # Or whatever you'd like to call it
seqrecords = SeqIO.parse(open("mysequences.fa"), "fasta")
qualities = GSFlex.parse(open("myqualities.qual"), "quality")
for seqrecord, quality in zip(seqrecords, qualities):
    seqrecord.quality = quality
Note that "quality" is currently not a field of the SeqRecord class, but with
SeqRecord being a Python class, we can just add fields on the fly.
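As a minimal sketch of adding a field on the fly, the following uses a stand-in Record class and made-up quality values. Both are assumptions for illustration only, not the real SeqRecord or real parser output:

```python
class Record:
    """Stand-in for SeqRecord (hypothetical, for illustration only)."""

    def __init__(self, id, seq):
        self.id = id
        self.seq = seq


# Made-up example data; in practice these would come from the two parsers.
records = [Record("read1", "ACGT"), Record("read2", "GGCA")]
qualities = [[30, 31, 28, 40], [22, 25, 27, 33]]

for record, quality in zip(records, qualities):
    # "quality" is not declared in the class; Python creates the
    # attribute on the instance at assignment time.
    record.quality = quality
```

Note that zip stops at the shorter of its inputs, so a mismatch in the number of records between the two files would go unnoticed unless it is checked separately.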