[Biopython] sff into fasta and qual then trim
Peter Cock
p.j.a.cock at googlemail.com
Tue Oct 23 12:14:00 EDT 2012
On Tue, Oct 23, 2012 at 5:04 PM, Kiss, Csaba <csaba.kiss at lanl.gov> wrote:
> I am new to bio-python. I am trying to replace mothur with BioPython.
> I hope that biopython is faster than mothur. All I want to do is this:
>
> sffinfo(sff=sd11.fasta)
> trim.seqs(fasta=sd11.fasta, qfile=sd11.qual, minlength = 50, maxhomop=8, qwindowsize=50, qwindowaverage =22)
>
> Can someone help me to translate the two mothur statements
> above to biopython, please?
> It would be greatly appreciated.
> thanks
I don't know enough about mothur to give you an informed answer.
I would guess the first line is just SFF to FASTA and QUAL, based
partly on the subject line of your email. That at least is trivial in Biopython:
from Bio import SeqIO
SeqIO.convert("example.sff", "sff", "example.fasta", "fasta")
SeqIO.convert("example.sff", "sff", "example.qual", "qual")
Or, if you want the trimming in the SFF file applied, which is
generally sensible:
from Bio import SeqIO
SeqIO.convert("example.sff", "sff-trim", "example.fasta", "fasta")
SeqIO.convert("example.sff", "sff-trim", "example.qual", "qual")
Personally I prefer to work with a single FASTQ file rather than
a paired FASTA+QUAL (it is smaller on disc for one thing), so
maybe:
from Bio import SeqIO
SeqIO.convert("example.sff", "sff-trim", "example.fastq", "fastq")
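For the second mothur statement, here is a rough sketch of a read filter, not a verified equivalent of trim.seqs. It assumes minlength means "drop reads shorter than this" and maxhomop means "drop reads containing a longer single-letter run"; the quality window options (qwindowsize, qwindowaverage) are not covered. The helper names longest_homopolymer, keep_record and filter_sff are made up for illustration; only SeqIO.parse and SeqIO.write are Biopython calls.

```python
def longest_homopolymer(seq):
    """Return the length of the longest run of one repeated letter."""
    longest = run = 0
    previous = None
    for letter in str(seq).upper():
        run = run + 1 if letter == previous else 1
        previous = letter
        longest = max(longest, run)
    return longest


def keep_record(record, min_length=50, max_homop=8):
    """True if the read is long enough and has no overly long homopolymer.

    The thresholds mirror the numbers in the question; whether mothur
    applies them in exactly this way is an assumption.
    """
    seq = record.seq
    return len(seq) >= min_length and longest_homopolymer(seq) <= max_homop


def filter_sff(sff_path, fastq_path, min_length=50, max_homop=8):
    """Write the reads that pass keep_record to FASTQ; return how many."""
    # Imported here so the two helpers above can be tried without Biopython.
    from Bio import SeqIO

    reads = SeqIO.parse(sff_path, "sff-trim")
    kept = (r for r in reads if keep_record(r, min_length, max_homop))
    return SeqIO.write(kept, fastq_path, "fastq")
```

Calling filter_sff("example.sff", "example.fastq") would then stream through the SFF file one read at a time, so the whole file does not have to be held in memory.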
Regards,
Peter