[Bioperl-l] Genbank query problem

Jason Stajich jason.stajich at gmail.com
Sun Sep 30 21:09:30 EDT 2012


Are they organized in the bioprojects at least?

I've been working on something related: dumping genomes based on what is in the BioProject part of NCBI.

It isn't documented yet since it's still in development, but you can try these three scripts. You need to give each one a place to write to with the -b option, and you'll want to change the query for the first script with the -q option.

- fix the query to the taxon you want bioprojects from:
https://github.com/hyphaltip/mobedac-fungi/blob/master/scripts/download_eutils_bioproject.pl
- then run this to download the sequences
https://github.com/hyphaltip/mobedac-fungi/blob/master/scripts/download_sequences_from_bioproject_staging.pl
- then run this to get the assemblies that for some reason aren't available as nucleotide IDs in the bioprojects file but can be gleaned from the GenBank file; this may not be needed for your MT genome project anyway.
https://github.com/hyphaltip/mobedac-fungi/blob/master/scripts/download_sequences_from_bioproject_cleanup_missing.pl
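
For orientation, here is a rough sketch of what the first step amounts to: an esearch against the bioproject database for a given taxon. This is not the code from the scripts above. The sub names (bioproject_term, search_bioprojects), the retmax value and the placeholder email address are made up for illustration, and I am assuming the txid...[Organism:exp] term syntax from the original query is also accepted by the bioproject database.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Build an Entrez search term for all BioProjects under a taxon.
# Hypothetical helper; the term syntax is an assumption (see above).
sub bioproject_term {
    my ($taxid) = @_;
    die "taxon id must be numeric\n"
        unless defined $taxid && $taxid =~ /^\d+$/;
    return sprintf 'txid%s[Organism:exp]', $taxid;
}

# Run esearch against the bioproject database and return the matching UIDs.
sub search_bioprojects {
    my ($taxid, $email) = @_;
    require Bio::DB::EUtilities;    # loaded lazily, only when a search is run
    my $factory = Bio::DB::EUtilities->new(
        -eutil  => 'esearch',
        -db     => 'bioproject',
        -term   => bioproject_term($taxid),
        -email  => $email,
        -retmax => 500,             # arbitrary cap for this sketch
    );
    return $factory->get_ids;
}

# Only contact NCBI when run as a script with a taxon id argument.
if (!caller && @ARGV) {
    my ($taxid, $email) = @ARGV;
    $email = 'you@example.org' unless defined $email;   # placeholder address
    print "$_\n" for search_bioprojects($taxid, $email);
}
```

Saved under any file name, it would be run with a taxon id and your email address as arguments, and prints one BioProject UID per line. The UIDs still have to be linked to nucleotide records and downloaded, which is what the second and third scripts deal with.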

Jason

On Sep 29, 2012, at 10:30 AM, Federico Abascal <fedeabascal at yahoo.es> wrote:

> Dear colleagues,
> 
> I have a script (mitobank.pl) that is used by some people. It is meant to retrieve mitochondrial genomes for a given taxonomic id. The problem arose some months ago, when the NCBI reorganized the way genomes are queried and the script stopped working. I have tried modifying the query string, with no success.
> 
> What the script asked for was like:
> 
> 
> use Bio::DB::GenBank;
> use Bio::DB::Query::GenBank;
> 
> my $seq;
> my $gb = Bio::DB::GenBank->new;
> # The query must be a quoted string; it is passed to Entrez as written.
> my $query = Bio::DB::Query::GenBank->new
> (-query   => '(txid314147[Organism:exp] AND mitochondrial[title] AND genome[ti] NOT plasmid[title] NOT chromosome NOT chloroplast) OR (txid314147[Organism:exp] AND mitochondrion[title] AND genome[ti] NOT plasmid[title] NOT chromosome NOT chloroplast)',
>  -db      => 'genome');
> 
> It used to return the list of genomes available for that taxonomic id. However, the NCBI now returns a different kind of results.
> I tried to modify the script and query the "nucleotide" database, but this does not work properly.
> 
> Could anyone help me, please?
> 
> Thanks in advance,
> Federico
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org



