[Biopython-dev] Creating a NCBIFastaIterator
p.j.a.cock at googlemail.com
Fri Oct 7 16:00:52 UTC 2011
On Fri, Oct 7, 2011 at 4:38 PM, Andrew Sczesnak
<andrew.sczesnak at med.nyu.edu> wrote:
> Adding my unsolicited opinion here, what do y'all think of this NCBIFasta
> parser being a more general "callback" parser, where a function passed to
> read() or write() translates some arbitrary delimited-text into ...
> This would be similar to key_function in SeqIO.to_dict() and would shift the
> responsibility of handling variation in formats to the user. Alternatively,
> a few functions to parse different styles of description lines could be
> included in the module.
Interesting idea. Although it doesn't fit that well with the current
(deliberately) simple high-level Bio.SeqIO.parse/read API,
that doesn't mean we can't do it (see Bio.Phylo.parse).
In this case I fail to see what benefit this gives over the current
situation, where the user can do this themselves with the
current FASTA parser,
e.g. with a function and a generator expression:
records = (do_ncbi_my_way(record) for record in SeqIO.parse(filename, "fasta"))
or more simply within a loop:
for record in SeqIO.parse(filename, "fasta"):
    # Do stuff with record
    pass
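To make the generator-expression approach concrete, here is a minimal sketch of what a user-written do_ncbi_my_way might look like. The function body is purely hypothetical (it just keeps the last non-empty "|"-separated field as the id), the helper name apply_callback is invented for illustration, and SimpleNamespace objects stand in for the SeqRecord objects that SeqIO.parse would yield, so that the sketch runs without a FASTA file.

```python
from types import SimpleNamespace


def do_ncbi_my_way(record):
    """Hypothetical user callback: keep the last non-empty '|' field as the id."""
    fields = [field for field in record.id.split("|") if field]
    if len(fields) > 1:
        record.id = fields[-1]
    return record


def apply_callback(records, callback):
    """Lazily apply a user-supplied callback to each record (hypothetical helper)."""
    return (callback(record) for record in records)


# Stand-ins for SeqRecord objects; only the .id attribute is used here.
raw_records = [
    SimpleNamespace(id="gi|12345|ref|NP_000001.1|"),
    SimpleNamespace(id="plain_id"),
]

converted = list(apply_callback(raw_records, do_ncbi_my_way))
for record in converted:
    print(record.id)
```

With real data the stand-in list would be replaced by SeqIO.parse(filename, "fasta"), which is exactly the generator expression shown above.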
Maybe it is down to personal preference of coding style?
I would much prefer a new "fasta-ncbi" parser in SeqIO
that handled all the documented NCBI FASTA identifiers.
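As a rough illustration of what such a parser would need to do with each identifier, the sketch below splits a pipe-delimited NCBI-style identifier into (database tag, fields) pairs. It is not an existing Biopython function: the name split_ncbi_identifier and the FIELD_COUNTS table are assumptions made for this example, and the table is deliberately partial, so it should be checked against the NCBI documentation before being relied on.

```python
# Number of fields expected after each database tag.
# Partial, illustrative table (an assumption for this sketch, not a complete list).
FIELD_COUNTS = {
    "gi": 1,
    "lcl": 1,
    "gb": 2,
    "emb": 2,
    "dbj": 2,
    "ref": 2,
    "sp": 2,
    "pdb": 2,
    "gnl": 2,
}


def split_ncbi_identifier(identifier):
    """Split 'tag|field|...' into a list of (tag, [fields]) pairs.

    Parsing stops at the first tag that is not in FIELD_COUNTS, so a plain
    identifier with no recognised tag gives an empty list.
    """
    parts = identifier.split("|")
    result = []
    index = 0
    while index < len(parts):
        tag = parts[index]
        count = FIELD_COUNTS.get(tag)
        if count is None:
            break  # unknown tag: leave the rest unparsed
        fields = parts[index + 1:index + 1 + count]
        result.append((tag, fields))
        index += 1 + count
    return result


parsed = split_ncbi_identifier("gi|12345|ref|NP_000001.1|")
unparsed = split_ncbi_identifier("plain_id")
print(parsed)
print(unparsed)
```

A trailing "|" leaves an empty string in the last field, which a real parser would probably want to treat as a missing value.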
I'm being negative here, but please don't let that deter you
from posting ideas. This is a public list and we/I welcome
constructive criticism and alternative ideas.