[Bioperl-l] Eutilities and no DocSums returned from NCBI assembly database

Fields, Christopher J cjfields at illinois.edu
Mon Dec 10 15:59:03 UTC 2012


Nikki, 

This is because a handful of the databases have apparently switched their docsum output completely to the DB-specific DocSum schemata (v2), which have not yet been implemented in Bio::EUtilities.  Parsing these correctly requires quite a bit of revision, since the schema differs per database, so I don't have a timeline for when this will be available; it will likely be implemented incrementally over time.

See here for the announcement:

    http://www.ncbi.nlm.nih.gov/books/NBK25499/#chapter4.Release_Notes

In the meantime, you can get the raw XML output for these by replacing the loop for $factory2 with:

    print $factory2->get_Response->content;
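
If you want something closer to the old flattened listing, here is a rough sketch of how that raw XML could be walked by hand. This is not part of Bio::EUtilities. It assumes XML::LibXML is installed and that the v2 output wraps each record in a DocumentSummary element carrying a uid attribute. The helper name flatten_docsums, the sample XML, and its field names are made up for illustration, so check them against what the server actually returns.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: turn raw esummary v2 XML into a list of
# { uid => ..., fields => { name => text, ... } } records.
sub flatten_docsums {
    my ($xml) = @_;
    my $doc = XML::LibXML->load_xml(string => $xml);
    my @records;
    for my $ds ($doc->findnodes('//DocumentSummary')) {
        my %fields;
        # Leaf elements only (elements with no child elements).
        for my $leaf ($ds->findnodes('.//*[not(*)]')) {
            my $text = $leaf->textContent;
            # Not all elements have content, so skip the empty ones.
            next unless defined $text && length $text;
            $fields{ $leaf->nodeName } = $text;
        }
        push @records, { uid => $ds->getAttribute('uid'), fields => \%fields };
    }
    return @records;
}

# Made-up sample input; real element names differ per database.
my $sample_xml = <<'XML';
<eSummaryResult>
  <DocumentSummarySet status="OK">
    <DocumentSummary uid="1001">
      <ExampleName>first record</ExampleName>
      <ExampleEmpty/>
    </DocumentSummary>
    <DocumentSummary uid="1002">
      <ExampleName>second record</ExampleName>
      <ExampleGroup><ExampleNested>inner value</ExampleNested></ExampleGroup>
    </DocumentSummary>
  </DocumentSummarySet>
</eSummaryResult>
XML

for my $rec (flatten_docsums($sample_xml)) {
    print "ID: $rec->{uid}\n";
    printf("%-20s:%s\n", $_, $rec->{fields}{$_}) for sort keys %{ $rec->{fields} };
    print "\n";
}
```

In the actual script you would pass $factory2->get_Response->content to the helper instead of the sample string. Note that repeated element names within one record overwrite each other in this sketch, which may or may not matter for a given database.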

chris


On Dec 10, 2012, at 2:07 AM, Nikki2 <nikkie.vanbers at gmail.com> wrote:

> Hi,
> 
> I'm using 'Bio::DB::EUtilities' in order to retrieve all the names from
> 'Tracheophyta' that are in NCBI's assembly database. However, there are no
> DocSums returned for the uids that match the query. When I try the same
> thing using the genome database it works fine.
> 
> The script that I used to do the query is at the bottom of this message. The
> output I get when running the script is:
> 
> Count = 84
> 
> --------------------- WARNING ---------------------
> MSG: No returned docsums.
> ---------------------------------------------------
> 
> I checked the @ids array and it contains the 84 uids.
> 
> My questions are as follows:
> 
> 1) Is it possible to get DocSums for uids from the NCBI assembly database,
> and if yes, how?
> 2) If not, does anyone have any suggestions on how to change my script to get
> the species names that match the uids that are returned?
> 
> Thanks a lot!
> 
> Nikki
> 
> 
> ##############################################
> 
> #!/bin/perl -w
> 
> use Bio::DB::EUtilities;
> 
> my $factory = Bio::DB::EUtilities->new(-eutil  => 'esearch',
>                                        -db     => 'genome',
>                                        -email  => 'my_email at gmail.com',
>                                        -term   => 'Tracheophyta[organism]',
>                                        -retmax => 5000);
> 
> print "Count = ",$factory->get_count,"\n";
> my @ids = $factory->get_ids;
> 
> my $factory2 = Bio::DB::EUtilities->new(-eutil  => 'esummary',
>                                         -email  => 'my_email at gmail.com',
>                                         -db     => 'genome',
>                                         -id     => \@ids,
>                                         -retmax => 5000);
> 
> while (my $ds = $factory2->next_DocSum) {
>    print "ID: ",$ds->get_id,"\n";
>    # flattened mode, iterates through all Item objects
>    while (my $item = $ds->next_Item('flattened'))  {
>        # not all Items have content, so need to check...
>        printf("%-20s:%s\n",$item->get_name,$item->get_content)
>            if $item->get_content;
>    }
>    print "\n";
> }
> 
> 
> -- 
> View this message in context: http://old.nabble.com/Eutilities-and-no-DocSums-returned-from-NCBI-assembly-database-tp34761946p34761946.html
> Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.
> 




