[Bioperl-l] Problem retrieving peptides using GenPept

Brian Osborne brian_osborne at cognia.com
Fri Oct 10 08:54:36 EDT 2003


Palle,

This is not a problem with your code or your connection; the cause is the
"bond" operator in the entry you're trying to retrieve. The Genbank and GenPept
parsers were written to respect the formal specifications for the
Genbank/EMBL/DDBJ feature tables (see
http://www.ncbi.nlm.nih.gov/projects/collab/FT/ for the details). The bond
operator is not in the standard, yet Genbank, for some reason, has allowed
it to appear, so Bioperl is complaining. One possible work-around is to put
eval{} around the call to next_seq() in your get_query(), so that an entry
that fails to parse does not end the whole loop.
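
The eval{} work-around could be sketched roughly as follows. This is only a
minimal sketch, not tested against the entry in question: the helper name
read_stream_tolerantly, its callback argument and the failure cap of 10 are
all made up for illustration and are not part of Bioperl. It also assumes
that the stream can carry on with the next entry after a failed next_seq(),
which you would need to check.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bioperl): read every record from any
# object that offers next_seq(), trapping exceptions one record at a time.
sub read_stream_tolerantly {
    my ($seqio, $on_seq, $max_failures) = @_;
    $max_failures = 10 unless defined $max_failures;   # arbitrary safety cap
    my ($ok, $failed) = (0, 0);

    while (1) {
        my $seq;
        # eval{} traps an exception thrown while reading this record only;
        # the trailing 1 lets us tell "died" apart from "returned undef".
        my $parsed = eval { $seq = $seqio->next_seq(); 1 };
        if (!$parsed) {
            warn "Skipping a record that failed to parse: $@";
            last if ++$failed >= $max_failures;   # do not loop forever
            next;
        }
        last unless defined $seq;                 # normal end of stream
        $on_seq->($seq);
        $ok++;
    }
    return ($ok, $failed);
}
```

In get_query() the while loop could then become something like
read_stream_tolerantly($seqio, sub { print $_[0]->length(), "\n" });
with $seqio coming from get_Stream_by_query() as before.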

Brian O.

-----Original Message-----
From: bioperl-l-bounces at portal.open-bio.org
[mailto:bioperl-l-bounces at portal.open-bio.org]On Behalf Of Palle Villesen
Sent: Friday, October 10, 2003 5:10 AM
To: bioperl-l at bioperl.org
Subject: [Bioperl-l] Problem retrieving peptides using GenPept

Hi,

I have the following code (below), but I get the following error and I'm
stuck. I have tried various queries and I always get <300 seqs back
before the exception occurs. I'm not sure if it is a speed problem, i.e.
retrieving the data stream quickly enough, but I have tried
-retrievaltype=>'tempfile' as well, which didn't help.

I couldn't find anything in the docs, hence this mail. Sorry to disturb.

Greetings,
P.
--

   Palle Villesen, Ph.D.
   BiRC, Build. 090, University of Aarhus
   DK - 8000 Aarhus C, Denmark

   Http://www.daimi.au.dk/~biopv - +45 61708600
---------------------------------------------------------------------



OUTPUT:

Getting accession number for the following query:
BSE

Hits: 118
1 Seq length 176
2 Seq length 253
3 Seq length 176
4 Seq length 178
5 Seq length 96
6 Seq length 179
7 Seq length 179
8 Seq length 253
9 Seq length 360

-------------------- WARNING ---------------------
MSG: exception while parsing location line [bond(179,214)] in reading
EMBL/GenBank/SwissProt, ignoring feature Bond (seqid=Q60506):
------------- EXCEPTION  -------------
MSG: operator "bond" unrecognized by parser
STACK Bio::Factory::FTLocationFactory::from_string
/home/serine/palle/bioperl-live/Bio/Factory/FTLocationFactory.pm:160
STACK (eval) /home/serine/palle/bioperl-live/Bio/SeqIO/FTHelper.pm:124
STACK Bio::SeqIO::FTHelper::_generic_seqfeature
/home/serine/palle/bioperl-live/Bio/SeqIO/FTHelper.pm:123
STACK Bio::SeqIO::genbank::next_seq
/home/serine/palle/bioperl-live/Bio/SeqIO/genbank.pm:396
STACK main::get_query get_RVproteins_from_GB.pl:30
STACK (eval) get_RVproteins_from_GB.pl:8
STACK toplevel get_RVproteins_from_GB.pl:7

--------------------------------------

---------------------------------------------------
Caught exception
Done


Program:

#!/usr/local/bin/perl
my $query_string = "BSE";

print "Getting the following query:$query_string\n";

eval {
  get_query($query_string);
};

if ($@) {
  print "Caught exception\n";
}
print "Done\n\n";
exit();

sub get_query {
  use Bio::DB::GenPept;
  use Bio::DB::Query::GenBank;
  my ($query_string) = @_;
  my $query = Bio::DB::Query::GenBank->new(-db=>'protein',
                                           -query=>$query_string
                                          );
  my $gb = Bio::DB::GenPept->new ();
  print "Hits: ".$query->count."\n";
  my @ids = $query->ids;
  my $seqio = $gb->get_Stream_by_query($query);
  my $current = 1;
  while (my $seq = $seqio->next_seq()) {
    print $current++." Seq length ".$seq->length()."\n";
  }
  return 1;
}







