[Bioperl-l] Questions from a Bioperl beginner?

Sat Jul 10 10:27:29 EDT 2004

Jian-

Welcome to the wonderful world of BioPerl where the documentation is 
thin and the code is complex.  Actually it's not that bad, and you have 
almost cleared all the hurdles to getting your first BioPerl code up and 
running.  After you get your first few scripts running, you'll 
find you reuse code from them over and over, and it becomes much smoother.

I'm not sure why your example from the tutorial didn't work.  That 
particular piece of code you tried uses a more basic (and I think 
older) way of retrieving sequence from the database, and it may well be 
broken as there probably aren't very many people using that method 
anymore.  Try the following piece of code which worked fine for me just 
now.  It's more complicated, but it will take you farther in 
understanding how to retrieve sequence the right way, and how to get the 
information stored in that sequence back out so you can use it.

Barry

-------------------------------------------------------------------------------

#!/usr/bin/perl

use strict;
use warnings;
use Bio::SeqIO;
use Bio::DB::GenBank; #use Bio::DB::GenPept or Bio::DB::RefSeq if needed

#Get some sequence IDs either like below, or read in from a file.  Note that
#this sample script works with the accession numbers below (at least at the
#time it was written).  If you add different accession numbers, and you get
#errors, you may be calling for something that the sequence doesn't have.
#You'll have to add your own error trapping code to handle that.
my @ids = ('U59228', 'AB039327', 'BC035972');

#Create the GenBank database object to read from the database.
my $gb = Bio::DB::GenBank->new();

#Create a sequence stream to pass the sequences from the database to the
#program.
my $seqio = $gb->get_Stream_by_id(\@ids);

#Loop over all of the sequences that you requested.
while (my $seq = $seqio->next_seq) {

  #Here is how you call methods directly on the RichSeq object.  Replace
  #'display_name' with any other method in Table 2. that can be called on
  #either the RichSeq object directly, or the PrimarySeq object from which
  #it inherits.
  print $seq->display_name,"\n";

  #Here is how to access the classification data from the species object.
  my $species = $seq->species;
  print $species->common_name,"\n";
  my @class = $species->classification;
  print "@class\n";

  #Here is a general way to call things that are stored as a
  #Bio::SeqFeature::Generic object.  Replace 'source' with any other of the
  #"major" headings in the feature table (e.g. gene, CDS, etc.) and replace
  #'mol_type' with any of the tags found under that heading (organism,
  #locus_tag, gene, etc.)
  my @source_feats = grep { $_->primary_tag eq 'source' }
                     $seq->get_SeqFeatures();
  my $source_feat = shift @source_feats;
  my @mol_type = $source_feat->get_tag_values('mol_type');
  print "@mol_type\n";

  #Here is a general way to call things that are stored as some type of
  #Bio::Annotation object.  This includes reference information and comments.
  #Replace 'reference' with 'comment' to get the comment, and replace
  #$ref->authors with $ref->title (or location, medline, etc.) to get other
  #reference categories.
  my $ann = $seq->annotation();
  my @references = ($ann->get_Annotations('reference'));
  my $ref = shift @references;
  if (defined $ref) {
    my $authors = $ref->authors;
    print "$authors\n";
  }
  print "\n";
}
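
The comments at the top of the script say you will need your own error trapping once you try other accession numbers. Here is a minimal sketch of one way to do that; it is not the only way. The helper names safe_tag_values and try_or are made up for this example, and Local::FakeFeature is a hypothetical stand-in so the sketch runs without BioPerl or a network connection. The only BioPerl calls assumed are has_tag and get_tag_values on a feature object, where get_tag_values throws an exception if the tag is absent.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the values of $tag on $feat, or an empty list if the feature is
# undefined or does not carry that tag.  Checking has_tag first avoids the
# exception that get_tag_values throws for a missing tag.
sub safe_tag_values {
    my ($feat, $tag) = @_;
    return () unless defined $feat && $feat->has_tag($tag);
    return $feat->get_tag_values($tag);
}

# Run a code reference and return its (scalar) result, or $default if the
# call dies or returns undef.  Handy for chains such as
# $seq->species->common_name on records with no species information.
sub try_or {
    my ($code, $default) = @_;
    my $value = eval { $code->() };
    return defined $value ? $value : $default;
}

# Hypothetical stand-in for a feature object (NOT part of BioPerl); it only
# mimics has_tag and get_tag_values so the sketch is runnable on its own.
package Local::FakeFeature;
sub new     { my ($class, %tags) = @_; return bless { %tags }, $class }
sub has_tag { my ($self, $tag) = @_; return exists $self->{$tag} }
sub get_tag_values {
    my ($self, $tag) = @_;
    die "no such tag: $tag\n" unless exists $self->{$tag};
    return @{ $self->{$tag} };
}

package main;

my $feat     = Local::FakeFeature->new(mol_type => ['mRNA']);
my @mol_type = safe_tag_values($feat, 'mol_type');   # tag present
my @strain   = safe_tag_values($feat, 'strain');     # tag absent
my @none     = safe_tag_values(undef, 'mol_type');   # no feature at all

my $species;                                         # pretend it is missing
my $name = try_or(sub { $species->common_name }, 'unknown');

print "mol_type: @mol_type\n";
print "strain: ", (@strain ? "@strain" : 'not annotated'), "\n";
print "common name: $name\n";
```

In the script above, the same idea would mean calling safe_tag_values($source_feat, 'mol_type') instead of $source_feat->get_tag_values('mol_type'), and wrapping the species calls in try_or, so that one record with missing annotation does not stop the whole loop.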

jsun at biologicaltargets.com wrote:

>Dear Sir or Madam;
>  I tried to run some small bioperl program after I successfully installed
>Perl and Bioperl in my computer. While I get some problems and need to
>ask for your kind help. I run a pl file as attached below which I copied
>from bptutorial file:
>**************************************************************
>use Bio::Perl;
>use strict;
>use warnings;
>
>my $seq_object = get_sequence('swissprot',"ROA1_HUMAN");
>
>  # uses the default database - nr in this case
>my $blast_result = blast_sequence($seq_object);
>
>write_blast(">roa1.blast",$blast_result);
>*****************************************************************
>
>Since I didn't make any changes to the source code, it should run fine but
>it failed on my computer. And the error message is:
>
>.....
>Submitted Blast for [ROA1_HUMAN]
>----------------WARNING----------------
>MSG: UNKNOWN
>
><P><!
>QBlastInfoBegin
>--><p><BODY BGCOLOR="#FFFFFF">
><hr><front color="red">ERROR: Results for RID
>1089388321-32330-213160811820 not found</font><hr>
>-----------------------------------
>
>So what's the problem here? and I also tried the
>Bio::Tools::Run::RemoteBlast function, it shows the same error. How can I
>solve this problem? And is there any troubleshooting documents
>that I can use if I get any further problem during my testing?
>
>Your help are the most appreciated.
>Thanks a lot
>Jian Sun
>

-- 
Barry Moore
Dept. of Human Genetics
University of Utah
Salt Lake City, UT