[BioPython] NCBIDictionary and genome database

Michiel Jan Laurens de Hoon mdehoon at c2b2.columbia.edu
Thu Jan 25 23:43:10 UTC 2007


Hi Tiago,

I updated Biopython in CVS with your code, placed where I think it is
supposed to go. Could you check the new code to make sure it still
works? You will need to download these two files from CVS:

Bio/GenBank/__init__.py (revision 1.65)
Bio/dbdefs/genbank.py (revision 1.6)

With these two files, the following should work:

 >>> from Bio import GenBank
 >>> parser = GenBank.FeatureParser()
 >>> ncbi_dict = GenBank.NCBIDictionary('genome', 'genbank', parser=parser)
 >>> res = GenBank.search_for('txid8292[orgn]', 'genome')
 >>> gb_entry = ncbi_dict[res[0]]
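
The idea behind the change can be sketched in a few self-contained lines.
This is only an illustration, not the code in Bio/GenBank/__init__.py: the
class name ReadOnlyNCBIDict, the fetch callback and fake_fetch are
hypothetical, and nothing here contacts NCBI.

```python
class ReadOnlyNCBIDict:
    """Hypothetical stand-in for a read-only dictionary over an NCBI database."""

    # 'genome' is accepted alongside the two databases quoted in this thread.
    VALID_DATABASES = ['nucleotide', 'protein', 'genome']

    def __init__(self, database, fetch):
        # Reject database names that are not in the list of valid ones.
        if database not in self.VALID_DATABASES:
            raise ValueError("Invalid database %r, should be one of %r"
                             % (database, self.VALID_DATABASES))
        self.database = database
        self._fetch = fetch  # callable(database, record_id) -> record

    def __getitem__(self, record_id):
        # Look-ups are delegated to the fetch callback.
        return self._fetch(self.database, record_id)

    def __setitem__(self, record_id, value):
        # Read-only: assignment is always refused.
        raise TypeError("this dictionary is read-only")


def fake_fetch(database, record_id):
    # Placeholder for a real download; returns a dummy string instead.
    return "record %s from %s" % (record_id, database)


genome_dict = ReadOnlyNCBIDict('genome', fake_fetch)
entry = genome_dict['12345']
```

With 'genome' in the list of valid databases, the constructor accepts it
directly, so overriding the db attribute by hand should no longer be needed.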

--Michiel.

Tiago Antão wrote:
> Hi,
> 
> I am trying to download complete genomes, not nuclear but
> mitochondrial (~17000 bp each).
> For instance:
> 
> parser = GenBank.FeatureParser()
> ncbi_dict = GenBank.NCBIDictionary('nucleotide', 'genbank', parser=parser)
> ncbi_dict.db = genome_genbank_eutils
> res = GenBank.search_for('txid8292[orgn]', 'genome')
> gb_entry = ncbi_dict[res[0]]
> 
> In this case I am searching for all amphibian genomes (query: txid8292[orgn]).
> Or, using the web:
> http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=8292&lvl=0
> and then choose "Genome Sequences" on the right (73):
> http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Genome&cmd=Search&dopt=DocSum&term=txid8292[Organism:exp] 
> 
> 
> 
> On 1/25/07, Michiel Jan Laurens de Hoon <mdehoon at c2b2.columbia.edu> wrote:
>> Hi Tiago,
>>
>> Which genbank record are you trying to download?
>> Just so I can replicate the problem and try your workaround.
>>
>> --Michiel
>>
>> Tiago Antão wrote:
>> > Hi!
>> >
>> > Just a question regarding accessing the NCBI genome database from NCBIDictionary:
>> > In the code there is:
>> > class NCBIDictionary:
>> >     """Access GenBank using a read-only dictionary interface.
>> >     """
>> >     VALID_DATABASES = ['nucleotide', 'protein']
>> > That is, genome is not a valid one.
>> > Is there a reason for that?
>> >
>> > BTW, I have the following workaround (which might be good or bad...):
>> >
>> > from Bio import GenBank
>> > from Bio.config.DBRegistry import EUtilsDB, DBGroup
>> > from Bio.dbdefs.genbank import ncbi_failures
>> > from Bio import db
>> >
>> > genome_genbank_eutils = EUtilsDB(
>> >         name = "genome-genbank-eutils",
>> >         doc = "Retrieve genome GenBank sequences from NCBI using EUtils",
>> >         delay = 5.0,
>> >         db = "genome",
>> >         rettype = "gb",
>> >         failure_cases = ncbi_failures
>> >         )
>> >
>> >
>> > ncbi_dict = GenBank.NCBIDictionary('nucleotide', 'genbank')
>> > ncbi_dict.db = genome_genbank_eutils
>> >
>> > Regards,
>> > Tiago
>>
>>
>> -- 
>> Michiel de Hoon
>> Center for Computational Biology and Bioinformatics
>> Columbia University
>> 1130 St Nicholas Avenue
>> New York, NY 10032
>>
> 
> 


-- 
Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1130 St Nicholas Avenue
New York, NY 10032


