[Biopython] processing genbank file

Sheila the angel from.d.putto at gmail.com
Thu Jun 16 11:43:06 UTC 2011


Hi to all,
From a GenBank file I want to extract certain information. Here is my code:

#---------------------------------------------------------------------------------------------------------
from Bio import SeqIO

handle = open('NP_954888.1.gb')
for gb_record in SeqIO.parse(handle, 'gb'):
    # Walk every feature of the record and keep only the CDS ones
    for gb_feature in gb_record.features:
        if gb_feature.type == 'CDS':
            gene = gb_feature.qualifiers['gene'][0]
            db_xref = gb_feature.qualifiers['db_xref']
            print(gene, db_xref)
handle.close()

# Record-level annotation, no feature loop needed
print(gb_record.annotations['organism'])

#====================================================

Is there a simpler way to print information such as the gene name, GeneID,
etc., or do I have to use this loop method? :( For example, to print the
organism name I only need gb_record.annotations['organism'], but to print
the 'gene' ID I need the for loop!

Another problem is that db_xref=gb_feature.qualifiers['db_xref'] gives me
all /db_xref entries in the CDS feature, while I want only
/db_xref="GeneID:309165" (or only the GeneID). How can I do that?
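
A minimal sketch of the kind of filtering meant here, assuming each /db_xref
value is a plain string of the form "Database:identifier" (as in
"GeneID:309165"). The helper name get_xref and the non-GeneID entries in the
example list are hypothetical, made up for illustration; the list stands in
for gb_feature.qualifiers['db_xref'].

```python
def get_xref(db_xrefs, database="GeneID"):
    """Return the identifier for `database` from a list of
    'Database:identifier' strings, or None if there is no such entry."""
    prefix = database + ":"
    for xref in db_xrefs:
        if xref.startswith(prefix):
            # Strip the 'Database:' prefix and keep only the identifier
            return xref[len(prefix):]
    return None


# Stand-in for gb_feature.qualifiers['db_xref']; only the GeneID entry
# comes from the file above, the other two are made-up placeholders.
example_xrefs = ["EXAMPLE_DB:12345", "GeneID:309165", "OTHER_DB:ABC"]

print(get_xref(example_xrefs))
```

Inside the CDS loop this would presumably be called as
get_xref(gb_feature.qualifiers['db_xref']), which still needs the loop but
returns only the GeneID instead of the whole list.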

Thanks in advance

--
Cheers
Sheila


