[Biopython] replace header

Peter Cock p.j.a.cock at googlemail.com
Thu May 31 18:10:11 UTC 2012


On Thu, May 31, 2012 at 6:55 PM, Lenna Peterson <arklenna at gmail.com> wrote:
>> The key point about using SeqIO.write(...) once to do a whole
>> file is that this requires an iterator-based approach. For example,
>> using a generator expression and a function acting on a single
>> record:
>>
>> from Bio import SeqIO
>>
>> def modify_record(record):
>>    #Do something sensible to the headers here:
>>    record.id = "modified"
>>    return record
>> #This is a generator expression:
>> modified = (modify_record(r) for r in SeqIO.parse("solid_1.fastq", "fastq"))
>> count = SeqIO.write(modified, "newsolid_1.fastq", "fastq")
>> print("Modified %i records" % count)
>>
>> Equivalently using a generator function which does the
>> looping itself:
>>
>> def modify_records(records):
>>    for record in records:
>>        #Do something sensible to the headers here:
>>        record.id = "modified"
>>        yield record
>> records = SeqIO.parse("solid_1.fastq", "fastq")
>> count = SeqIO.write(modify_records(records), "newsolid_1.fastq", "fastq")
>> print("Modified %i records" % count)
>
>
> The generator function is nice, too. I presume this only works because
> SeqIO.write knows how to write from an iterator?
>
> Lenna

Bio.SeqIO.write is *designed* to take a Python iterator (or any other
iterable) of SeqRecord objects. That can be a generator function,
generator expression, a custom class which supports iteration, or even
a simple list or tuple of SeqRecord objects all in memory.

As a special-case convenience, it also accepts a single SeqRecord.
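
To make that concrete, here is a minimal sketch, assuming a reasonably
recent Biopython on Python 3. The record IDs, the sequences, and the
helper names make_records and write_to_string are made up for
illustration, and it writes FASTA to an in-memory handle rather than to
a FASTQ file so that it needs no input data:

```python
from io import StringIO

from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord


def make_records():
    """Return two small in-memory records (made-up example data)."""
    return [
        SeqRecord(Seq("ACGT"), id="read1", description=""),
        SeqRecord(Seq("GGCC"), id="read2", description=""),
    ]


def write_to_string(records):
    """Write records as FASTA to a text buffer; return (count, text)."""
    handle = StringIO()
    count = SeqIO.write(records, handle, "fasta")
    return count, handle.getvalue()


# A simple list of SeqRecord objects, all in memory:
list_count, list_text = write_to_string(make_records())

# A generator expression, consumed one record at a time:
gen_count, gen_text = write_to_string(r for r in make_records())

# A single SeqRecord, the special case:
single_count, single_text = write_to_string(make_records()[0])

print("Wrote %i, %i and %i records" % (list_count, gen_count, single_count))
```

In each case the return value is the number of records written, so the
list and the generator should give the same count and the same output
text.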

Peter



