[Biopython] still more questions about NGS sequence trimming

Antonio Fernandez-Guerra afernandez at ceab.csic.es
Thu Oct 25 16:32:42 UTC 2012


-- 
Antonio Fernàndez-Guerra

Center for Advanced Studies of Blanes (CEAB-CSIC)
Acces Cala St Francesc, 14
17300 Blanes, SPAIN
Tel +34 972 33 6101 Fax +34 972 33 7806
http://nodens.ceab.csic.es/ecogenomics/members/antoni-fernandez-guerra.html
e-mail: afernandez at ceab.csic.es

Peter Cock <p.j.a.cock at googlemail.com> wrote:

>On Thu, Oct 25, 2012 at 3:49 PM, Kiss, Csaba <csaba.kiss at lanl.gov> wrote:
>> Thanks, Peter. I am writing my quality functions. Another question about
>> trimming. As you mentioned, the quality of the ends tends to be lower than
>> in the middle. Could that be fixed just by using "sff-trim" when I create my
>> FASTQ file? If I don't do that I get sequences with lower-case and upper-case
>> letters. Are you suggesting further trimming beyond just "sff-trim"?
>
>In Bio.SeqIO, we use the file format names "sff" and "sff-trim" to mean
>the raw sequence data from the SFF file in full, or with the trimming
>values inside the SFF file applied. If you have used the Roche tools
>you'll see a similar option in their SFF extraction tool. This default
>trimming is decided by the Roche 454 instrument and does quite a
>good job at removing the adapters, barcodes and poor quality bits.
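A rough sketch of what the "sff-trim" clipping amounts to, in case it helps.
In untrimmed "sff" output the clipped flanks appear in lower case and the
retained region in upper case, which is where the mixed-case sequences come
from. The helper name trim_to_uppercase and the example read are made up for
illustration; this is not Biopython's own code. With Biopython itself, a call
along the lines of SeqIO.convert("reads.sff", "sff-trim", "reads.fastq", "fastq")
(file names hypothetical) should apply the clipping stored in the SFF file.

```python
def trim_to_uppercase(seq, quals):
    """Keep only the upper-case (unclipped) stretch of a mixed-case read.

    Assumption: the clipped flanks are lower case and the retained region
    is one contiguous upper-case run, as in untrimmed "sff" output.
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lengths differ")
    # Positions of all upper-case bases; the first and last bound the kept region.
    upper = [i for i, base in enumerate(seq) if base.isupper()]
    if not upper:
        return "", []
    start, end = upper[0], upper[-1] + 1
    return seq[start:end], quals[start:end]


# Made-up read: lower-case key/adapter on the left, clipped tail on the right.
seq = "tcagACGTACGTTGCAacgn"
quals = [30, 30, 30, 30, 35, 36, 37, 38, 38, 37,
         36, 35, 34, 33, 32, 31, 12, 10, 8, 2]
trimmed_seq, trimmed_quals = trim_to_uppercase(seq, quals)
print(trimmed_seq)
```

The quality list is sliced with the same indices as the sequence, so the two
stay in step, which matters when the result is written out as FASTQ.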
>
>I assume you were using Mothur to do further trimming based on a
>more stringent sliding window of quality scores?
>
>Peter
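
For the "more stringent sliding window" idea, a minimal sketch in plain Python
might look like the following. It is only an illustration of the general
technique, not Mothur's actual algorithm; the function name, the window size
and the threshold are all assumptions chosen for the example.

```python
def sliding_window_trim(seq, quals, window=5, min_mean=25):
    """Cut a read at the start of the first window whose mean quality
    drops below min_mean. Reads shorter than one window are returned
    unchanged. Window size and threshold are illustrative defaults.
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lengths differ")
    for start in range(len(quals) - window + 1):
        chunk = quals[start:start + window]
        if sum(chunk) / window < min_mean:
            # Everything from this window onwards is discarded.
            return seq[:start], quals[:start]
    return seq, quals


# Made-up read whose quality decays towards the 3' end.
seq = "ACGTACGTTGCA"
quals = [35, 36, 37, 38, 38, 37, 30, 24, 20, 15, 10, 8]
kept_seq, kept_quals = sliding_window_trim(seq, quals, window=4, min_mean=25)
print(kept_seq, kept_quals)
```

Applied after "sff-trim", something like this would shorten reads further
wherever the average quality over a few bases falls below the chosen cut-off.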