[Biopython] changing record attributes while iterating
Peter Cock
p.j.a.cock at googlemail.com
Tue Oct 4 08:24:08 UTC 2011
On Tue, Oct 4, 2011 at 9:05 AM, Bala subramanian
<bala.biophysics at gmail.com> wrote:
> Friends,
> I have a fasta file. I need to modify the record id by adding a suffix to
> it. So I used SeqRecord (the code attached below). It is working fine, but I
> would like to know if there is a simpler way to do that, i.e. if I can
> change the record attributes while iterating through the fasta file with
> SeqIO.parse itself. I tried something like the following, but I couldn't get
> what I wanted.
>
> new_list=[]
> for record in SeqIO.parse(open(argv[1], "rU"), "fasta"):
>     record.id=record.id + '_suffix'
>     new_list.append(record)
The above looks fine, although depending on the rest of your script
a big list might be a bad idea (it holds every record in memory at
once) and an iterator-based approach may be preferable. If, as in the
rest of your example, you just need to do this for output, perhaps:
#!/usr/bin/env python
from Bio import SeqIO
from sys import argv

def rename(record):
    """Modifies record in place AND returns it."""
    record.id += '_suffix'
    return record

#This is a generator expression:
records = (rename(r) for r in SeqIO.parse(argv[1], "fasta"))
output_filename = raw_input('Enter the output file:')
SeqIO.write(records, output_filename, "fasta")
The alternative you showed was wasteful, creating lots of new
objects to no benefit.
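
To illustrate why the generator approach avoids building a big list, here is a minimal sketch that needs no FASTA file. Note that `FakeRecord` and `add_suffix` are made-up names for illustration only: `FakeRecord` is a hypothetical stand-in for the SeqRecord objects that SeqIO.parse yields, and neither name is part of Biopython. The point is just that nothing is renamed until something consumes the generator, one record at a time.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class FakeRecord:
    """Hypothetical stand-in for a SeqRecord; only the id matters here."""
    id: str


def add_suffix(records: Iterable[FakeRecord],
               suffix: str = "_suffix") -> Iterator[FakeRecord]:
    """Lazily rename each record in place and yield the same object."""
    for record in records:
        record.id += suffix
        yield record


originals = [FakeRecord("seq1"), FakeRecord("seq2")]
renamed = add_suffix(originals)

# Nothing has run yet: the generator body only executes when consumed.
ids_before = [r.id for r in originals]

# Consuming the generator (as a writer function would) does the renaming.
ids_after = [r.id for r in renamed]
```

Because each record is modified in place and handed straight on, no second copy of the data is ever created, which should be the same behaviour you get when the generator expression above is passed to SeqIO.write.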
Peter