[BioPython] GenBank records

Jeffrey Chang jchang at smi.stanford.edu
Tue Feb 25 16:25:28 EST 2003


Yes, Rob Knight just reported this bug.  The default NCBIDictionary 
only works for nucleotide sequences.  For protein sequences, you need 
to pass the database and format parameters explicitly.  Try:
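from Bio import GenBank

# The default settings retrieve nucleotide records.
nucleotide_ncbi_dict = GenBank.NCBIDictionary()
# Protein records need the protein database and the "gp" format.
protein_ncbi_dict = GenBank.NCBIDictionary(database="protein", format="gp")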

Then use protein_ncbi_dict to retrieve the protein sequences.
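
As a rough sketch of how the lookup loop could be made more tolerant of 
ids that fail, something like the following might help.  The names 
read_ids and fetch_records are made up for illustration and are not part 
of Biopython, and a plain dict with invented contents stands in for the 
NCBI dictionary so that the example runs without network access.  It 
only assumes that a failed lookup raises KeyError, as in the traceback 
quoted below.

```python
def read_ids(text):
    """Split the contents of an id file into a list, dropping blank lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def fetch_records(lookup, ids):
    """Look up each id in a dictionary-like object.

    Returns (records, missing): records maps id -> record, and missing
    lists the ids for which the lookup raised KeyError.
    """
    records = {}
    missing = []
    for gi in ids:
        try:
            records[gi] = lookup[gi]
        except KeyError:
            missing.append(gi)
    return records, missing


# Hypothetical stand-in for the NCBI dictionary (invented data).
fake_protein_dict = {"111": "record for 111", "222": "record for 222"}

# The blank line and the unknown id "333" are deliberate.
ids = read_ids("111\n222\n\n333\n")
records, missing = fetch_records(fake_protein_dict, ids)
```

In a real script you would presumably pass protein_ncbi_dict in place of 
fake_protein_dict.  Dropping blank lines matters because splitting a file 
that ends in a newline on '\n' leaves an empty string at the end of the 
list, and that empty id would then be looked up as well.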

Jeff



On Tuesday, February 25, 2003, at 12:22 PM, JINLING HUANG wrote:

> Hi, everyone,
>
> I am trying to retrieve GenBank records of protein sequences with an
> old script I wrote before. The script is like the following:
>
> from Bio import GenBank
> import sys
>
> file = sys.argv[1]           #file of gi
> fp1 = open(file, 'r+')
> ids = fp1.read()
>
> lids = ids.split('\n')
> recNum = len(lids)
>
> ncbi_dict = GenBank.NCBIDictionary()
>
> for i in range(0, recNum):
>     gb_record = ncbi_dict[lids[i]]
>     print gb_record
>
>
> The script works well for records of nucleotide sequences, but does not
> work for records of protein sequences.  It consistently gives this error
> message:
>
> Traceback (most recent call last):
>   File "getGBRecords.py", line 24, in ?
>     gb_record = ncbi_dict[lids[i]]
>   File "/bio/python2.2/lib/python2.2/site-packages/Bio/GenBank/__init__.py", line 1535, in __getitem__
>     raise KeyError, x
> KeyError: ERROR, possibly because id not available?
>
>
> Does anyone know why?  The script was written last summer.  Do I need to
> update my biopython to make use of some new features?
>
> Thanks and best wishes,
>
> Jinling