[Biopython] sff into fasta and qual then trim

Peter Cock p.j.a.cock at googlemail.com
Tue Oct 23 19:45:19 UTC 2012


On Tue, Oct 23, 2012 at 6:13 PM, Kiss, Csaba <csaba.kiss at lanl.gov> wrote:
>>Could be - disk IO will be a factor, but I suspect the quality trimming to be the slow part rather than the format conversion.
>
> I don't think it's IO or the trimming. Mothur seems to take forever to do the sffinfo process on Windows.
> Getting the 3 million sequences out was 3 hours.

That sounds a bit slow. Can you compare this to the Biopython SFF
conversion time (or to any of the other tools)?
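
Something along these lines could be used to time the Biopython
conversion for that comparison. This is only a sketch: the file names
are placeholders, and timed() and sff_to_fasta_qual() are hypothetical
helper names, not part of Biopython. It relies on Bio.SeqIO.convert()
with the "sff" input format (or "sff-trim" to apply the clipping
recorded in the SFF file).

```python
import time


def timed(func, *args, **kwargs):
    """Run func and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


def sff_to_fasta_qual(sff_path, fasta_path, qual_path, trim=False):
    """Convert an SFF file to FASTA and QUAL; return the two record counts.

    Hypothetical helper. Biopython is imported here so that timed()
    can be used on its own without Biopython installed.
    """
    from Bio import SeqIO

    # "sff-trim" applies the clipping stored in the SFF file itself.
    fmt = "sff-trim" if trim else "sff"
    n_fasta = SeqIO.convert(sff_path, fmt, fasta_path, "fasta")
    n_qual = SeqIO.convert(sff_path, fmt, qual_path, "qual")
    return n_fasta, n_qual


# Example usage (file names are placeholders):
# counts, seconds = timed(sff_to_fasta_qual,
#                         "reads.sff", "reads.fasta", "reads.qual")
# print(f"Converted {counts[0]} reads in {seconds:.1f} s")
```

Note that this reads the SFF file twice (once per output format), so
the measured time is an upper bound on a single-pass conversion.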

> The trimming took 10 minutes.
> The rest of the python code to fish out my sequences 1 minute.
>
> You see now why I would like to make it more efficient.
>
> Csaba

Is it possible to fish out your sequences first and then do the
trimming? If so, that should be more efficient, since only the
selected reads would need trimming.
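
As a rough sketch of that ordering (select first, trim second), and
assuming the reads of interest can be identified by their IDs:
select_then_trim() and trim_low_quality_tail() are hypothetical names,
the quality threshold is arbitrary, and the trimmer shown is only a
stand-in for whatever trimming is actually being used.

```python
def select_then_trim(records, wanted_ids, trim):
    """Yield trim(record) only for records whose .id is in wanted_ids.

    Records that are not wanted are skipped before any trimming work
    is done on them.
    """
    wanted = set(wanted_ids)
    for record in records:
        if record.id in wanted:
            yield trim(record)


def trim_low_quality_tail(record, min_q=20):
    """Illustrative trimmer: drop trailing bases with PHRED below min_q.

    Expects a Biopython SeqRecord with "phred_quality" per-letter
    annotations (as produced by parsing SFF or FASTQ).
    """
    quals = record.letter_annotations["phred_quality"]
    end = len(quals)
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    # Slicing a SeqRecord keeps the matching per-letter qualities.
    return record[:end]


# Example usage with Biopython (file names and IDs are placeholders):
# from Bio import SeqIO
# wanted = {"READ_ID_1", "READ_ID_2"}
# reads = SeqIO.parse("reads.sff", "sff")
# SeqIO.write(select_then_trim(reads, wanted, trim_low_quality_tail),
#             "subset.fastq", "fastq")
```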

Peter


