[Bioperl-l] accessing EMBL database

Sandipan Chowdhury sandipan.chowdhury at physiology.wisc.edu
Thu Nov 19 06:49:45 UTC 2009


Hi,
 
I have three questions, all related to retrieving sequences from online databases.
 
(1) I have been trying to download a protein sequence from the EMBL database and write it to a text file as a string. I am using the following code:
 
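use strict;
use warnings;
use Bio::DB::EMBL;

# Fetch the entry by accession and write its sequence to s.txt as a plain string
open(my $out, '>', 's.txt') or die "Cannot open s.txt: $!";
my $em_obj  = Bio::DB::EMBL->new;
my $seq_obj = $em_obj->get_Seq_by_acc("CAB95729");
my $s_str   = $seq_obj->seq;
print $out "$s_str\n";
close $out;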
 
The script does not work and gives the following message:
"MSG: EMBL stream with no ID. Not embl in my book
STACK: Error::throw
STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm: 368
STACK: Bio::SeqIO::embl::next_seq C:/Perl/site/lib/Bio\SeqIO\embl.pm: 203
STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm: 194
STACK: trial2.pl"
 
I am not sure what this means. A similar version of the script works for the Swissprot, GenBank and RefSeq databases, but not for EMBL. Is there a way around this so that I can download the EMBL sequence?
 
(2) Also, is there any way I can download sequences from DDBJ (the DNA Data Bank of Japan)?
 
(3) Can GI numbers be used to retrieve the sequences? If so, how?
 
Answers to these questions would be greatly appreciated. I am very new to Perl/BioPerl and not really familiar with the advanced programming features, so I would need your help to find my way out of this situation.
 
Many Thanks
Sandipan
 



