[Bioperl-l] Bio::DB::Query::GenBank checks

Chris Fields cjfields at uiuc.edu
Tue May 16 22:20:29 UTC 2006



> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Bernd Web
> Sent: Tuesday, May 16, 2006 6:38 AM
> To: bioperl-l at lists.open-bio.org
> Subject: [Bioperl-l] Bio::DB::Query::GenBank checks
> 
> Hi all,
> 
> I was using Bio::DB::Query::GenBank to obtain only IDs from Entrez and
> found some issues and differences (bugs?) in behaviour wrt the pod.
> Do these look familiar ?
> 
> Some example code:
> my $query = Bio::DB::Query::GenBank->new
>        (-query   =>'Lassa Virus[ORGN]',
>         -reldate => '30',
>         -db      => 'protein',
>         -ids => [195052,2981014,11127914],
>         -maxids => 30 );
> 
> my $gb = Bio::DB::GenBank->new(-format => 'fasta');
> my $seqio = $gb->get_Stream_by_query($query);
> while (my $seq = $seqio->next_seq) {
>        print $seq->desc,"\n"; }
> 
> The module states that if we provide -ids that:
>        If you provide an array reference of IDs in -ids, the query will be
>        ignored and the list of IDs will be used when the query is passed to a
>        Bio::DB::GenBank object's get_Stream_by_query() method.
> 
> In the above case it is actually the query ('Lassa Virus[ORGN]') that is
> passed, not the IDs. Also, $query->query shows the original query. Am I doing
> something wrong, or is the pod not reflecting the current behaviour of this
> module?
> 
> I was also surprised that if the internet connection is down, no warning is
> thrown for $query->query or $query->count at all. Only the
> get_Stream_by_query call above will warn us if the site is unreachable
> (500 Internal Server Error).

I believe this comes down to the difference between the two objects and the
way they retrieve request data: Bio::DB::GenBank and Bio::DB::Query::GenBank
use different methods to retrieve IDs. Bio::DB::GenBank's get_Stream_by_query
method just makes it a bit easier to retrieve a list of UIDs directly,
instead of saving them as an array and then reposting them using
get_Stream_by_id.  Not foolproof, but it works okay.
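
For reference, the manual route mentioned above (save the IDs, then repost
them) might look roughly like the sketch below. stream_by_saved_ids is a
made-up helper name, not something in BioPerl, and the database and query
string in the usage comment are only examples.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl). $query is expected to behave like
# a Bio::DB::Query::GenBank object and $db like a Bio::DB::GenBank object.
sub stream_by_saved_ids {
    my ($query, $db) = @_;

    # Step 1: run the search and keep the UIDs in a plain array.
    my @ids = $query->ids;
    return unless @ids;    # nothing to fetch

    # Step 2: repost the saved UIDs explicitly.
    return $db->get_Stream_by_id(\@ids);
}

# Intended use (needs BioPerl and network access):
#   my $query = Bio::DB::Query::GenBank->new(-db    => 'nucleotide',
#                                            -query => 'Lassa Virus[ORGN]');
#   my $gb    = Bio::DB::GenBank->new(-format => 'fasta');
#   my $seqio = stream_by_saved_ids($query, $gb)
#       or die "query returned no IDs\n";
#   while (my $seq = $seqio->next_seq) { print $seq->desc, "\n" }
```

Doing it by hand like this at least gives you a point where you can inspect
the ID list before anything is fetched.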

> $query->ids or $query->count will not throw a warning and
> @ids=$query->ids will just be an empty array. (I realize $query->count
> is not initialized, so I am using this now to check for success, but a
> warning from WebDBSeqI would be more appropriate I think).

WebDBSeqI would be the place to make general warnings (it is supposed to be an
interface for any web seq DB), but not eutils-specific warnings.
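
Until the modules warn on their own, a caller-side check along these lines
may be enough. This is only a sketch: checked_ids is a hypothetical helper,
not part of BioPerl, and it cannot tell a wrong -db apart from a query that
simply has no hits.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): fetch IDs from a query object
# and warn when the lookup dies or comes back empty.
sub checked_ids {
    my ($query) = @_;

    # Trap exceptions in case the lookup throws instead of returning nothing.
    my @ids = eval { $query->ids };
    if (my $err = $@) {
        warn "ID lookup failed: $err";
        return;
    }

    # An empty list is ambiguous, so say so rather than staying silent.
    warn "Query returned no IDs (no hits, wrong -db, or site unreachable)\n"
        unless @ids;

    return @ids;
}

# Intended use (needs BioPerl and network access):
#   my $query = Bio::DB::Query::GenBank->new(-db  => 'nucleotide',
#                                            -ids => [195052, 2981014, 11127914]);
#   my @ids = checked_ids($query);
```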

> Last, the example from the pod is not working, but no warnings are raised:
>           # initialize the list yourself
>           my $query = Bio::DB::Query::GenBank->new(-ids=>[195052,2981014,11127914]);
> 
> $query->count returns zero w/o any warning. Of course this query did
> not specify a DB. Only if we specify -db=>'nucleotide' $query->count
> is 3.
> However, why is there no warning if we set -db=>'protein' or if we did not
> set this?
>
>
> On the NCBI website, searching the Protein DB for 195052 returns:
>       See Details. No items found.
>       The following term(s) refer to a different DB:195052
> 
> But this is not reflected via Bio::DB::Query::GenBank.
> 
> Can I check for this situation in the code apart from checking on
> $query->count == 0 ? Or would it indeed be better to check for these
> situations in the module?
> 
> Regards,
> Bernd

I can probably play around with adding a few things tomorrow and clean up
the POD somewhat.  I'm planning a rewrite for EUtilities-based searches but
that's a ways off still...  Can't promise much; I'm pretty busy til next
week.

Chris



