[BioPython] Sorry, one more time: extract data from a large .gbk file

Hans Meier biopyte at yahoo.de
Mon Jan 2 11:33:49 EST 2006


Dear friends,
 
 I apologize for bothering you once more with this,
 but maybe we can clear it up now.
 All I want to do is extract data from a whole genome .gbk file on my disk. 
 The file has about 5000(!) entries like the one shown below.
 All I want to do is:
 
 Give me the protein sequence (= /translation) (or whatever)
 of gene (= /gene) soandso.
 
 Speed matters.
 
 Though I believe I'm not a total dummy in programming,
 and I tried several approaches taken from the web,
 I could not program this so that the request finishes
 within a reasonable time or without crashing my box
 (P3, 700 MHz, 256 MB).
 
 Since this is an important question for me but 
 I don't want to bother you with this any further,
 maybe someone could just post a code snippet
 showing how to accomplish this trivial(?) task?
 
 
 Thanks a lot for all your work and your help, Harald
 
 
 ###### a typical .gbk entry ###########
  gene            94650..96008
                      /gene="murF"
                      /locus_tag="b0086"
                      /note="synonyms: mra, EG10622, b0086"
                      /db_xref="GeneID:944813"
 CDS             94650..96008
                      /gene="murF"
                      /locus_tag="b0086"
                      /EC_number="6.3.2.15"
                      /function="enzyme; Murein sacculus, peptidoglycan"
                      /note="go_component: cytoplasm [goid 0005737];
                      go_process: peptidoglycan biosynthesis [goid 0009252];
                      go_process: peptidoglycan metabolism [goid 0000270]"
                      /codon_start=1
                      /transl_table=11
                      /product="D-alanine:D-alanine-adding enzyme"
                      /protein_id="NP_414628.1"
                      /db_xref="ASAP:313"
                      /db_xref="GI:16128079"
                      /db_xref="GeneID:944813"
                      /translation="MISVTLSQLTDILNGELQGADITLDAVTTDTRKLTPGCLFVALK
                      GERFDAHDFADQAKAGGAGALLVSRPLDIDLPQLIVKDTRLAFGELAAWVRQQVPARV
                      VALTGSSGKTSVKEMTAAILSQCGNTLYTAGNLNNDIGVPMTLLRLTPEYDYAVIELG
                      ANHQGEIAWTVSLTRPEAALVNNLAAAHLEGFGSLAGVAKAKGEIFSGLPENGIAIMN
                      ADNNDWLNWQSVIGSRKVWRFSPNAANSDFTATNIHVTSHGTEFTLQTPTGSVDVLLP
                      LPGRHNIANALAAAALSMSVGATLDAIKAGLANLKAVPGRLFPIQLAENQLLLDDSYN
                      ANVGSMTAAVQVLAEMPGYRVLVVGDMAELGAESEACHVQVGEAAKAAGIDRVLSVGK
                      QSHAISTASGVGEHFADKTALITRLKLLIAEQQVITILVKGSRSAAMEEVVRALQENG
                      TC"
 ########end of the example#####################
 

		