[Biopython-dev] Merging Bio.SeqIO SFF support?

Peter biopython at maubp.freeserve.co.uk
Tue Mar 2 14:44:13 UTC 2010


On Tue, Mar 2, 2010 at 2:36 PM, Kevin Jacobs <jacobs at bioinformed.com>
<bioinformed at gmail.com> wrote:
> On Tue, Mar 2, 2010 at 8:01 AM, Peter <biopython at maubp.freeserve.co.uk> wrote:
>> Yes, I did wonder about using NumPy here but wanted to ensure that
>> the core of Biopython remains without an external dependency here.
>
> In addition to not creating many little objects, my leanings toward using
> NumPy are also due to the generality of tricks like the following to recode
> quality scores to Sanger ASCII-33 format:
>
>    sffqual  = np.array(rec.letter_annotations['phred_quality'], dtype=np.uint8)
>    sffqual += 33
>    sffqual  = sffqual.tostring()
>

Yeah - I had this kind of thing in mind for the qualities, both when
looking at the SFF files and earlier when doing the FASTQ and
QUAL stuff.

You can probably make that more efficient with one line:

sffqual = (np.array(rec.letter_annotations['phred_quality'],
                    dtype=np.uint8) + 33).tostring()

Not sure if it will make a measurable difference mind you ;)
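
For anyone wanting to try this outside a full SFF parse, here is a minimal
self-contained sketch of the same recoding. The function name and the sample
scores are made up for illustration (the list stands in for
rec.letter_annotations['phred_quality']), and it uses tobytes(), which is the
name newer NumPy releases use for tostring(). It assumes the scores fit the
printable Sanger range of 0 to 93.

```python
import numpy as np

SANGER_OFFSET = 33  # Sanger FASTQ stores PHRED score q as the character chr(q + 33)


def phred_to_sanger(quals):
    """Return the Sanger ASCII-33 encoding of a sequence of PHRED scores as bytes."""
    arr = np.array(quals, dtype=np.uint8)  # one compact array rather than many int objects
    arr += SANGER_OFFSET                   # in-place add, so no temporary array is created
    return arr.tobytes()                   # raw bytes, one per quality score


# Hypothetical scores standing in for rec.letter_annotations['phred_quality']
example_quals = [0, 10, 20, 30, 40]
encoded = phred_to_sanger(example_quals)
print(encoded)
```

Note that the three-step in-place form avoids the temporary array that the
one-liner builds, so which is faster is something to measure rather than assume.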

> That said, the alternatives aren't that slow and small integers are shared
> from a pre-allocated pool, so this is not as big a concern.

Indeed.
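
As a rough illustration of such an alternative, a NumPy-free version could
look like the sketch below. This is only an assumption about what a
plain-Python fallback might be, not what Bio.SeqIO actually does; the function
name is hypothetical and the bytes() call as written needs Python 3.

```python
def phred_to_sanger_plain(quals):
    """Encode PHRED scores as Sanger ASCII-33 bytes without NumPy."""
    # Each q + 33 is a small integer; CPython hands these out from a
    # pre-allocated pool, so the loop does not allocate a new int per score.
    return bytes(q + 33 for q in quals)


# Hypothetical scores standing in for rec.letter_annotations['phred_quality']
plain_encoded = phred_to_sanger_plain([0, 10, 20, 30, 40])
print(plain_encoded.decode("ascii"))
```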

Peter



