[Biopython] losing information

Liam Thompson dejmail at gmail.com
Thu Oct 29 04:53:32 UTC 2009


Hi everyone,

I'm running a simple script to remove GenBank records that I have
identified as undesirable from a GB file. The only problem is that
when the script is run, all the annotation info (CDS etc.) for the
entries is lost; only the sequence and ID are kept.
I was wondering if there is an option I am missing, or if I am using
an incorrect variable type somewhere. I just can't seem to get all
the info written.

from Bio import SeqIO

outhandle = open("HBV_seqs.gb", "w")
inhandle = open("all_hbv_seqs_reannotated.gb", "r")
newrecords = []
badlist = list(open("deletionrecords.txt", "r"))
badrecord = []

# Collect the names of the records to drop, one per line.
# strip() removes the line ending without truncating a last line that has none.
for items in badlist:
    badrecord.append(items.strip())

# Keep every record whose name is not in the deletion list.
for record in SeqIO.parse(inhandle, "genbank"):
    if record.name not in badrecord:
        newrecords.append(record)

print("writing records...")
SeqIO.write(newrecords, outhandle, "genbank")
print("writing done")
inhandle.close()
outhandle.close()
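
One way to narrow this down might be to check whether the features are
already missing right after parsing, or only disappear once the records
are written out. Below is a minimal sketch, assuming each parsed record
exposes `name` and `features` attributes the way Biopython SeqRecord
objects do; `count_features` is a hypothetical helper name, not part of
Biopython.

```python
def count_features(records):
    """Return a list of (name, number_of_features) pairs, one per record.

    Assumption: every record has a `name` attribute and a `features`
    list, as Biopython SeqRecord objects do.
    """
    return [(record.name, len(record.features)) for record in records]
```

Calling it on the records parsed from the input file, and again on the
records parsed back from the output file, should show whether the CDS
features are dropped while reading or while writing.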


I would appreciate any pointers.

Thanks
Liam


