[Biopython-dev] Bio.Sequencing

Peter Cock p.j.a.cock at googlemail.com
Mon Jun 29 03:23:06 EDT 2009


On Sun, Jun 28, 2009 at 3:10 PM, Cymon Cox <cymon.cox at googlemail.com> wrote:
> Hi Peter,
>
> What is the long-term future of Bio.Sequencing? With the (very cool)
> QualityIO stuff now in SeqIO, the Phd module looks a bit out of place - is
> there any reason not to move both the Ace and Phd code to SeqIO, i.e.
> into the AceIO and PhdIO interfaces?

In the case of FASTQ and QUAL files, everything gets stored in
the SeqRecord, so I didn't see any reason to have something in
Bio.Sequencing (although perhaps things like mapping between
the PHRED and Solexa scores could live there, along with the
basic parser used internally giving string tuples - does this sound
worth doing?).
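
For illustration, the PHRED/Solexa mapping mentioned above can be
sketched from the two score definitions (PHRED is -10*log10(p), Solexa
is -10*log10(p/(1-p)), where p is the error probability). The function
names below are made up for this sketch and are not meant to describe
any existing Biopython API:

```python
import math


def phred_from_solexa(solexa_quality):
    """Convert a Solexa quality score to the equivalent PHRED score."""
    return 10 * math.log10(10 ** (solexa_quality / 10.0) + 1)


def solexa_from_phred(phred_quality):
    """Convert a PHRED score to the equivalent Solexa score.

    A PHRED score of zero means an error probability of one, which has
    no finite Solexa equivalent, so it is rejected here (an assumption
    of this sketch; a library might choose to clamp instead).
    """
    if phred_quality <= 0:
        raise ValueError("PHRED quality must be positive to map to Solexa")
    return 10 * math.log10(10 ** (phred_quality / 10.0) - 1)
```

The two scales agree closely for high-quality bases and diverge at the
low end, which is why a shared helper rather than ad hoc arithmetic
might be worth having.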

As you know, currently the SeqIO "ace" and "phd" formats are simply built
on top of Bio.Sequencing.Ace and Bio.Sequencing.Phd, and only
transform a subset of the data into a SeqRecord object. This also
describes the SwissProt parsing now - the general model is that we have
a SeqRecord interface (which may not cover all the details), and
underlying, more file-format-specific objects used to hold the data.
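
A minimal sketch of that two-layer model, using hypothetical stand-in
classes (FormatRecord, GenericRecord and to_generic are invented here
and are not the real Bio.Sequencing or SeqRecord classes):

```python
from dataclasses import dataclass, field


@dataclass
class FormatRecord:
    """Hypothetical format-specific record: keeps every field from the file."""
    name: str
    sequence: str
    qualities: list
    comments: dict = field(default_factory=dict)


@dataclass
class GenericRecord:
    """Hypothetical SeqRecord-like object: holds a common subset only."""
    id: str
    seq: str
    letter_annotations: dict = field(default_factory=dict)


def to_generic(record):
    """Copy the subset of a FormatRecord that the generic interface covers."""
    return GenericRecord(
        id=record.name,
        seq=record.sequence,
        letter_annotations={"phred_quality": list(record.qualities)},
    )


raw = FormatRecord("read1", "ACGT", [30, 31, 32, 33], {"CHEM": "term"})
generic = to_generic(raw)
```

Note that the format-specific comments are dropped in the conversion,
which is the "may not cover all the details" trade-off.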

> I ask because I've written a Phd writer class for the SeqIO interface
> and initially added it to PhdIO.

Do you want to file an enhancement bug, and then either upload
the code to bugzilla, or give a link to a github branch so we can
have a look?

If your writer takes SeqRecord objects, then I think it would make
sense for it to go in Bio.SeqIO.PhdIO (as I have done for GenBank,
although this is in part because I have some intentions to simplify
the Bio.GenBank code, and having another writer with another
API in there would make this more complicated).

It would also make sense to have a writer in Bio.Sequencing.Phd
taking its Record objects (and have Bio.SeqIO turn SeqRecord
objects into Phd Record objects, and call that). Perhaps this would
be a better idea as it is more flexible, but it would be more work,
and could be slower ;)
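
Roughly, that layered writer might look like the sketch below. All of
the names are hypothetical, and the output layout is invented for the
example; it is not the real PHD file format or the Bio.Sequencing.Phd
API:

```python
import io
from dataclasses import dataclass


@dataclass
class LowLevelRecord:
    """Hypothetical format-specific record (not the real Phd Record)."""
    name: str
    sites: list  # list of (base, quality) pairs


@dataclass
class SimpleSeqRecord:
    """Hypothetical stand-in for a SeqRecord with per-letter qualities."""
    id: str
    seq: str
    qualities: list


def write_low_level(record, handle):
    """Format-specific writer: takes the low-level record directly."""
    handle.write("BEGIN %s\n" % record.name)
    for base, quality in record.sites:
        handle.write("%s %i\n" % (base, quality))
    handle.write("END\n")


def write_seq_record(seq_record, handle):
    """Generic writer: convert to the low-level record, then delegate."""
    if len(seq_record.seq) != len(seq_record.qualities):
        raise ValueError("Need one quality score per base")
    low = LowLevelRecord(
        seq_record.id, list(zip(seq_record.seq, seq_record.qualities))
    )
    write_low_level(low, handle)


handle = io.StringIO()
write_seq_record(SimpleSeqRecord("read1", "ACG", [30, 31, 32]), handle)
output = handle.getvalue()
```

The intermediate conversion in write_seq_record is the extra work (and
possible slowdown) referred to above; in exchange, callers holding the
low-level records can use write_low_level directly.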

Peter

