[Biopython] biopython question

Tychele Turner tturne18 at jhmi.edu
Wed Apr 4 16:55:20 UTC 2012


Hi,

I have a question about one of Biopython's capabilities. I would like to trim primers off the start of the reads in a FASTQ file, and I found excellent documentation on your website showing how to do this:

from Bio import SeqIO
def trim_primers(records, primer):
    """Removes perfect primer sequences at start of reads.

    This is a generator function, the records argument should
    be a list or iterator returning SeqRecord objects.
    """
    len_primer = len(primer) #cache this for later
    for record in records:
        if record.seq.startswith(primer):
            yield record[len_primer:]
        else:
            yield record

original_reads = SeqIO.parse("SRR020192.fastq", "fastq")
trimmed_reads = trim_primers(original_reads, "GATGACGGTGT")
count = SeqIO.write(trimmed_reads, "trimmed.fastq", "fastq")
print("Saved %i reads" % count)




My question is: is there a way to loop through a primer file, so that instead of looking only for 'GATGACGGTGT', every primer in the file is checked and, if it matches, removed from the start of its respective read?

The primer file is structured as follows, one primer per line:
GATGACGGTGT
GATGACGGTGA
GATGACGGCCT
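
To make the request concrete, here is a minimal sketch of what a multi-primer version of the generator might look like. It is only an illustration under some assumptions: the names load_primers, trim_any_primer and trim_fastq are hypothetical (they are not part of Biopython), the primer file is assumed to hold one primer per line, only perfect matches at the start of a read are removed, and at most one primer is trimmed per read.

```python
def load_primers(handle):
    """Read one primer per line from an open text handle, skipping blank lines.

    Hypothetical helper, not part of Biopython.
    """
    primers = [line.strip() for line in handle if line.strip()]
    # Drop duplicates (keeping first-seen order), then try longer primers
    # first so that the longest perfect match is the one that gets trimmed.
    return sorted(dict.fromkeys(primers), key=len, reverse=True)


def trim_any_primer(records, primers):
    """Remove the first perfectly matching primer from the start of each read.

    This is a generator function; records should be a list or iterator
    of SeqRecord objects, and primers a list of strings.
    """
    for record in records:
        for primer in primers:
            if record.seq.startswith(primer):
                yield record[len(primer):]
                break
        else:
            # No primer matched, so pass the read through unchanged.
            yield record


def trim_fastq(fastq_in, primer_file, fastq_out):
    """Trim every read in fastq_in and return the number of reads written."""
    # Imported here so the two helpers above can be used without Biopython.
    from Bio import SeqIO

    with open(primer_file) as handle:
        primers = load_primers(handle)
    reads = SeqIO.parse(fastq_in, "fastq")
    return SeqIO.write(trim_any_primer(reads, primers), fastq_out, "fastq")
```

With the primers above saved in a file (say primers.txt, a made-up name), calling trim_fastq("SRR020192.fastq", "primers.txt", "trimmed.fastq") would presumably do what I am after, returning the number of reads saved.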

Any suggestions would be greatly appreciated. Thanks.

Tychele




