[Bioperl-l] get_stream_by_batch with an invalid ac

Mick Watson michaelwatson@paradigm-therapeutics.co.uk
Wed, 07 Aug 2002 13:13:24 +0100


Hi

This isn't a bioperl problem; it's an issue with the NCBI.  All bioperl 
does is provide the ability to query the NCBI, so bioperl's behaviour 
will be exactly the same as the NCBI's.

If you go to http://www.ncbi.nlm.nih.gov/, enter "AC013798 AC013798 
AC021953 ZZ99999" into the query box at the top and press Go, you 
will see that the query returns no results.  If you remove ZZ99999 
from the list, the NCBI returns the results.

So bioperl is simply giving you exactly what the NCBI gives you, i.e. 
a batch query system that doesn't accept invalid accessions.  It is the 
NCBI who need to improve their query interface.
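
In the meantime, one way round it is to fetch the accessions one at a 
time, so that a bad one only loses you that one record.  A minimal 
sketch: fetch_each is a made-up helper (not a bioperl method), and it 
assumes the fetch either dies or returns undef for an accession the 
NCBI doesn't know, which I haven't checked against every bioperl 
version.  It will also be slower than a batch fetch, since it makes one 
request per accession.

```perl
#!/usr/bin/perl -w
use strict;

# fetch_each is a hypothetical helper, not part of bioperl.
# $fetch is any code ref that takes one accession and returns a
# sequence object; it may die or return undef when nothing is found.
sub fetch_each {
    my ($fetch, @accs) = @_;
    my (@found, @missing);
    foreach my $acc (@accs) {
        # eval traps a die from the fetch, leaving $seq undefined
        my $seq = eval { $fetch->($acc) };
        if (defined $seq) {
            push @found, $seq;
        } else {
            push @missing, $acc;
        }
    }
    return (\@found, \@missing);
}

# Intended use with GenBank (needs bioperl and network access):
#
#   use Bio::DB::GenBank;
#   my $gb = Bio::DB::GenBank->new;
#   my ($found, $missing) = fetch_each(
#       sub { $gb->get_Seq_by_acc($_[0]) },
#       'AC013798', 'AC021953', 'ZZ99999');
#   print $_->display_id, "\n" foreach @$found;
#   print "not found: $_\n"    foreach @$missing;
```

Because the helper only ever hands back defined objects, calling 
display_id on its results should not hit the "undefined value" error.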

Thanks
Mick

michael wrote:

>
>	As I understand it if I want to fetch a lot of sequences
>by accession number I should use get_stream_by_batch.  However if I
>have an invalid accession number in the 'fetch list' I can't seem to
>access any of the returned data.
>
>	If I run the script below with two (valid) accession numbers
>(['AC013798', 'AC021953']) I get:-
>
>1 gi|7382144|gb|AC013798.4|AC013798
>2 gi|7283183|gb|AC021953.3|AC021953
>Can't call method "display_id" on an undefined value at b.pl line 18,
><GEN1> line 2.
>
>	Which is more or less what I'd expect.  Using (['AC013798',
>'AC021953', 'ZZ99999']) I get:-
>
>Can't call method "display_id" on an undefined value at b.pl line 11.
>
>	Thinking that the null value from the ZZ99999 is at the start of
>the results array (or something), I tried a sacrificial call on display_id
>(by commenting out the print "1 " line), but it fails at the next call:
>
>Can't call method "display_id" on an undefined value at b.pl line 14.
>
>	Am I missing something?
>
>
>#!/usr/bin/perl -w
>use strict;
>use Bio::DB::GenBank;
>
>my $gb = new Bio::DB::GenBank;
>my $seqio = $gb->get_Stream_by_acc(['AC013798', 'AC021953', 'ZZ99999']);
>
>my $clone =  $seqio->next_seq;
>print "1 ", $clone->display_id,"\n";
>
>$clone =  $seqio->next_seq;
>print "2 ", $clone->display_id,"\n";
>
>$clone =  $seqio->next_seq;
>print "3 ", $clone->display_id,"\n";
>
>
>
>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>Michael John Lush PhD		 	 Tel:44-20-7679-5027
>Nomenclature Bioinformatics Support 	 Fax:44-20-7387-3496
>HUGO Gene Nomenclature Committee Email:  nome@galton.ucl.ac.uk
>The Galton Laboratory
>University College London, UK
>URL: http://www.gene.ucl.ac.uk/nomenclature/
>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~