[BioPython] NCBIDictionary and genome database
Michiel Jan Laurens de Hoon
mdehoon at c2b2.columbia.edu
Thu Jan 25 23:43:10 UTC 2007
Hi Tiago,
I updated Biopython in CVS, putting your code where I think it
belongs. Could you check the new code to make sure it still works?
You would have to download these two files from CVS:
Bio/GenBank/__init__.py (revision 1.65)
Bio/dbdefs/genbank.py (revision 1.6)
With these two files, the following should work:
>>> from Bio import GenBank
>>> parser = GenBank.FeatureParser()
>>> ncbi_dict = GenBank.NCBIDictionary('genome', 'genbank', parser=parser)
>>> res = GenBank.search_for('txid8292[orgn]', 'genome')
>>> gb_entry = ncbi_dict[res[0]]
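
For anyone who has not looked at how the dictionary class gates its
databases, here is a rough, self-contained sketch of the idea. It is
not the Biopython code: ReadOnlyRecordDict and fake_fetch are
hypothetical names made up for illustration, the fetch step is a
stand-in for the real network call, and it simply assumes that
supporting the genome database amounts to listing 'genome' among the
valid databases.

```python
class ReadOnlyRecordDict:
    """Hypothetical stand-in for a read-only, dictionary-style database wrapper."""

    # Assumption: 'genome' is accepted alongside the two original entries.
    VALID_DATABASES = ["nucleotide", "protein", "genome"]

    def __init__(self, database, fetch):
        # Reject unknown database names up front, before any lookup happens.
        if database not in self.VALID_DATABASES:
            raise ValueError(
                "Invalid database %r, should be one of %s"
                % (database, self.VALID_DATABASES)
            )
        self.database = database
        self._fetch = fetch  # callable(database, record_id) -> record

    def __getitem__(self, record_id):
        # Lookups are delegated to the injected fetch function.
        return self._fetch(self.database, record_id)

    def __setitem__(self, record_id, value):
        # The interface is read-only, so assignment is refused.
        raise TypeError("this dictionary is read-only")


def fake_fetch(database, record_id):
    # Placeholder for a network request; returns a dummy string, not a record.
    return "%s:%s" % (database, record_id)


genome_dict = ReadOnlyRecordDict("genome", fake_fetch)
entry = genome_dict["12345"]  # "12345" is a made-up identifier
```

Passing the fetch function in as an argument keeps the sketch runnable
without network access; the real class talks to NCBI instead.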
--Michiel.
Tiago Antão wrote:
> Hi,
>
> I am trying to download complete genomes, not nuclear but
> mitochondrial (~17000 bp each).
> For instance:
>
> parser = GenBank.FeatureParser()
> ncbi_dict = GenBank.NCBIDictionary('nucleotide', 'genbank', parser=parser)
> ncbi_dict.db = genome_genbank_eutils
> res = GenBank.search_for('txid8292[orgn]', 'genome')
> gb_entry = ncbi_dict[res[0]]
>
> In this case I am searching for all amphibian genomes (query: txid8292[orgn]).
> Or, using the web:
> http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=8292&lvl=0
> and choose "Genome Sequences" on the right (73):
> http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Genome&cmd=Search&dopt=DocSum&term=txid8292[Organism:exp]
>
>
>
> On 1/25/07, Michiel Jan Laurens de Hoon <mdehoon at c2b2.columbia.edu> wrote:
>> Hi Tiago,
>>
>> Which genbank record are you trying to download?
>> Just so I can replicate the problem and try your workaround.
>>
>> --Michiel
>>
>> Tiago Antão wrote:
>> > Hi!
>> >
>> > Just a question regarding accessing NCBI genome database from
>> > NCBIDictionary:
>> > In the code there is:
>> > class NCBIDictionary:
>> >     """Access GenBank using a read-only dictionary interface.
>> >     """
>> >     VALID_DATABASES = ['nucleotide', 'protein']
>> > That is, 'genome' is not a valid database.
>> > Is there a reason for that?
>> >
>> > BTW, I have the following workaround (which might be good or bad...):
>> >
>> > from Bio import GenBank
>> > from Bio.config.DBRegistry import EUtilsDB, DBGroup
>> > from Bio.dbdefs.genbank import ncbi_failures
>> > from Bio import db
>> >
>> > genome_genbank_eutils = EUtilsDB(
>> >     name = "genome-genbank-eutils",
>> >     doc = "Retrieve genome GenBank sequences from NCBI using EUtils",
>> >     delay = 5.0,
>> >     db = "genome",
>> >     rettype = "gb",
>> >     failure_cases = ncbi_failures
>> > )
>> >
>> >
>> > ncbi_dict = GenBank.NCBIDictionary('nucleotide', 'genbank')
>> > ncbi_dict.db = genome_genbank_eutils
>> >
>> > Regards,
>> > Tiago
>>
>>
>> --
>> Michiel de Hoon
>> Center for Computational Biology and Bioinformatics
>> Columbia University
>> 1130 St Nicholas Avenue
>> New York, NY 10032
>>
>
>
--
Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1130 St Nicholas Avenue
New York, NY 10032