[Biopython] biopython question
Tychele Turner
tturne18 at jhmi.edu
Wed Apr 4 16:55:20 UTC 2012
Hi,
I have a question about one of Biopython's capabilities. I would like to trim primers off the start of reads in a FASTQ file, and I found wonderful documentation on how to do this on your website:

    from Bio import SeqIO

    def trim_primers(records, primer):
        """Removes perfect primer sequences at start of reads.

        This is a generator function, the records argument should
        be a list or iterator returning SeqRecord objects.
        """
        len_primer = len(primer)  # cache this for later
        for record in records:
            if record.seq.startswith(primer):
                # slicing a SeqRecord also trims its per-letter qualities
                yield record[len_primer:]
            else:
                yield record

    original_reads = SeqIO.parse("SRR020192.fastq", "fastq")
    trimmed_reads = trim_primers(original_reads, "GATGACGGTGT")
    count = SeqIO.write(trimmed_reads, "trimmed.fastq", "fastq")
    print("Saved %i reads" % count)

My question is: is there a way to loop through a primer file, so that instead of looking only for 'GATGACGGTGT', every primer in the file is checked and the matching one is removed from the start of its respective read?
The primer file is structured as follows, one primer per line:
GATGACGGTGT
GATGACGGTGA
GATGACGGCCT
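
For illustration, here is a minimal sketch of the kind of loop I have in mind; I have not run it on real data. It assumes the primer file holds one primer per line, as above. The names load_primers, trim_any_primer and trim_fastq are made up for this example and are not part of Biopython; only SeqIO.parse, SeqIO.write, Seq.startswith and SeqRecord slicing come from the library.

```python
def load_primers(path):
    """Read one primer per line, skipping blank lines."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


def trim_any_primer(records, primers):
    """Remove the first perfectly matching primer from the start of each read.

    Generator function; records is a list or iterator of SeqRecord objects.
    """
    # Try longer primers first so a short primer cannot shadow a longer one.
    primers = sorted(primers, key=len, reverse=True)
    for record in records:
        for primer in primers:
            if record.seq.startswith(primer):
                yield record[len(primer):]
                break
        else:
            # No primer matched, so pass the read through unchanged.
            yield record


def trim_fastq(in_path, primer_path, out_path):
    """Trim every read in a FASTQ file; return the number of reads written."""
    # Imported here so the two helpers above work without Biopython installed.
    from Bio import SeqIO

    primers = load_primers(primer_path)
    reads = SeqIO.parse(in_path, "fastq")
    return SeqIO.write(trim_any_primer(reads, primers), out_path, "fastq")
```

With this, a call such as trim_fastq("SRR020192.fastq", "primers.txt", "trimmed.fastq") would stand in for the three SeqIO lines in the documented example, where primers.txt is an assumed file name. At most one primer is removed per read, and reads that match no primer are written out unchanged.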
Any suggestions would be greatly appreciated. Thanks.
Tychele