[BioPython] Reading Fasta-like Qual Files

David Michael Schruth dschruth at gmail.com
Fri Feb 27 07:26:15 UTC 2009


Hello,

This is my first post on the list.  I'm enjoying using Biopython but am
running into some snags when trying to incorporate quality information into
my analysis.  Specifically, I can't read in qual files (output from 454 and
SOLiD) the way I would like: the spaces between the two-digit integer Phred
scores are stripped, so the scores run together and can no longer be told
apart.  I've fixed this in my own copy of the code by removing the

.replace(" ","")

call at around line 58 of FastaIO.py (in the FastaIterator class).

Hopefully this doesn't have any adverse effects that I haven't
foreseen.  In the meantime, it would be nice to have some sort of more
permanent solution to this: some way to specify or otherwise
accommodate these FASTA-like qual files in FastaIO and Biopython in
general.  Supporting the FASTQ format would also be nice.
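
To make the request concrete, here is a minimal sketch of the kind of
reader I have in mind.  It is a standalone illustration, not Biopython code:
the name parse_qual is hypothetical, and it assumes each record is a ">"
header line followed by one or more lines of whitespace-separated integer
scores.

```python
import io


def parse_qual(handle):
    """Yield (record_id, scores) pairs from a FASTA-like qual file.

    Hypothetical helper for illustration; not part of Biopython.
    """
    record_id = None
    scores = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if record_id is not None:
                yield record_id, scores
            fields = line[1:].split(None, 1)
            record_id = fields[0] if fields else ""
            scores = []
        else:
            if record_id is None:
                raise ValueError("score line found before any '>' header")
            # Split on whitespace instead of deleting it, so "9 10"
            # stays [9, 10] rather than collapsing to "910".
            scores.extend(int(token) for token in line.split())
    if record_id is not None:
        yield record_id, scores


# Small in-memory example; a real file handle works the same way.
example = io.StringIO(">read1 length=6\n40 40 9 10\n12 7\n>read2\n5 30\n")
for name, quals in parse_qual(example):
    print(name, quals)
```

Splitting each line on whitespace keeps every score as a separate integer,
which is exactly the information that is lost once the spaces are removed.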

The only mention of biopython and quality I've run across is on the
Biopython-dev list:
http://portal.open-bio.org/pipermail/biopython-dev/2007-October/003131.html
The email is dated 2007, but I doubt that any progress on this front has
been made.

Perhaps there is a better way to search these lists that I don't know about,
but I've had a hard time finding any discussions about quality score
handling.  (Which brings up another idea: why don't we make this list and
the list archive part of Google Groups?)

Thanks,

Dave


