[Biopython] sff into fasta and qual then trim

Peter Cock p.j.a.cock at googlemail.com
Tue Oct 23 16:14:00 UTC 2012


On Tue, Oct 23, 2012 at 5:04 PM, Kiss, Csaba <csaba.kiss at lanl.gov> wrote:
> I am new to bio-python. I am trying to replace mothur with BioPython.
> I hope that biopython is faster than mothur. All I want to do is this:
>
> sffinfo(sff=sd11.fasta)
> trim.seqs(fasta=sd11.fasta, qfile=sd11.qual, minlength = 50, maxhomop=8, qwindowsize=50, qwindowaverage =22)
>
> Can someone help me to translate the two mothur statements
> above to biopython, please?
> It would be greatly appreciated.
> thanks

I don't know enough about mothur to give you an informed answer.

I would guess the first line just converts SFF to FASTA and QUAL, based
partly on the subject of your email. That at least is trivial in Biopython:

from Bio import SeqIO
SeqIO.convert("example.sff", "sff", "example.fasta", "fasta")
SeqIO.convert("example.sff", "sff", "example.qual", "qual")

Or, if you want the trimming recorded in the SFF file (its quality and
adapter clip positions) applied, which is generally sensible:

from Bio import SeqIO
SeqIO.convert("example.sff", "sff-trim", "example.fasta", "fasta")
SeqIO.convert("example.sff", "sff-trim", "example.qual", "qual")

Personally I prefer to work with a single FASTQ file rather than
a paired FASTA+QUAL (it is smaller on disc for one thing), so
maybe:

from Bio import SeqIO
SeqIO.convert("example.sff", "sff-trim", "example.fastq", "fastq")
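
For the trim.seqs step, here is a rough sketch rather than a translation.
It assumes, without having been checked against mothur, that minlength and
maxhomop discard whole reads, and that qwindowsize/qwindowaverage cut a read
at the start of the first window whose mean quality falls below the
threshold. The function names below are made up for illustration and are
not part of Biopython:

```python
from itertools import groupby


def longest_homopolymer(seq):
    """Length of the longest run of a single repeated base (0 if empty)."""
    runs = (sum(1 for _ in run) for _, run in groupby(str(seq).upper()))
    return max(runs, default=0)


def window_trim_length(quals, window=50, min_average=22):
    """Number of leading bases to keep.

    Slides a window one base at a time and cuts at the start of the first
    window whose mean quality is below min_average. Reads shorter than the
    window are returned uncut. (Assumed behaviour, not verified against
    mothur's qwindowsize/qwindowaverage.)
    """
    for start in range(len(quals) - window + 1):
        if sum(quals[start:start + window]) / window < min_average:
            return start
    return len(quals)


def trimmed_length(seq, quals, min_length=50, max_homop=8,
                   window=50, min_average=22):
    """Return how many bases to keep, or None if the read should be dropped."""
    keep = window_trim_length(quals, window, min_average)
    if keep < min_length:
        return None
    if longest_homopolymer(seq[:keep]) > max_homop:
        return None
    return keep


def trim_records(records, **options):
    """Yield quality-trimmed SeqRecord objects, skipping rejected reads."""
    for record in records:
        quals = record.letter_annotations["phred_quality"]
        keep = trimmed_length(str(record.seq), quals, **options)
        if keep is not None:
            # Slicing a SeqRecord also slices its per-letter qualities.
            yield record[:keep]
```

With Biopython this could be applied to the SFF file along these lines
(the output filename is just an example):

from Bio import SeqIO
records = SeqIO.parse("example.sff", "sff-trim")
SeqIO.write(trim_records(records), "example_trimmed.fastq", "fastq")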

Regards,

Peter


