[Bioperl-l] RemoteBlast problem

Jason Stajich jason.stajich at duke.edu
Fri Oct 28 16:22:08 EDT 2005


Fixes to handle both changes in BLAST report format and CGI-server at  
NCBI.  All this and more in bioperl 1.5.1

You can install it locally without messing with your fink (leave 1.4  
installed)

Just download bioperl 1.5.1 from bioperl - uncompress and move to  
somewhere ( like $HOME/src/bioperl-1.5.1 or whatever you want) and  
updated your PERL5LIB variable and prepend this path to it:

So add this in your .bash_profile or .bashrc depending on how you are  
setup

export PERL5LIB=$HOME/src/bioperl-1.5.1:$PERL5LIB

(after the test -r /sw/bin/init.sh && . /sw/bin/init.sh line)

When fink supports 1.5.1 you can just remove this line and you'll be  
back to normal.

-jason

On Oct 28, 2005, at 3:55 PM, Thomas J Keller wrote:

> Greetings,
> I'm using perl 5.8.6 and the fink installation of bioperl 1.4.5 on  
> an Apple G5 running OS X 10.4.2
> running my script with a simple fasta dna sequence file:
> $ bp_remote_blast2.pl -p blastx -d nr -i test.fa
>
> Here's the error:
>
> ------------- EXCEPTION  -------------
> MSG: no data for midline Query  78    
> HRRPSFSACRCVLSASSVFPSRLGNNYITAAGAQVLAEGLRGNTSLQFLG  227
> STACK Bio::SearchIO::blast::next_result /sw/lib/perl5/5.8.6/Bio/ 
> SearchIO/blast.pm:1151
> STACK toplevel /Users/kellert/Sandbox/Perlscripts/ 
> bp_remote_blast2.pl:90
>
>
> It looks like Bio::SearchIO::blast
> is choking on the result from Bio::Tools::Run::RemoteBlast
>
> I lifted this from Jason's exampleL bp_remote_blast but modified it  
> to use SearchIO method instead of BPLite:
>
> use warnings;
> use strict;
> use vars qw($USAGE);
>
> use Bio::Tools::Run::RemoteBlast;
> use Bio::SeqIO;
> use Bio::SearchIO;
> use Getopt::Long;
>
> $USAGE = "remote_blast.pl [-h] [-p prog] [-d db] [-e expect]  [-f  
> seqformat] -i seqfile\n";
>
> my ($prog, $db, $expect );
>
> my ($sequencefile,$sequenceformat,$help) = (undef, 'fasta',undef);
>
> &GetOptions('prog|p=s'               => \$prog,
>         'db|d=s'                 => \$db,
>         'expect|e=s'             => \$expect,
>         'input|i=s'              => \$sequencefile,
>         'format|f=s'             => \$sequenceformat,
>         'help|h'                 => \$help,
>         );
>
> if( $help ) {
>     exec('perldoc', $0);
>     die;
> }
>
> if( !defined $prog ) {
>     die($USAGE . "\n\tMust specify a valid program name ([t]blast 
> [pxn])\n");
> }
> if( !defined $db ) {
>     die($USAGE . "\n\tMust specify a db (e.g. \'nr\') to search\n");
> }
> if( !defined $sequencefile ) {
>     die($USAGE . "\n\tMust specify a path to a sequence file.\n");
> }
>
> my $factory = new Bio::Tools::Run::RemoteBlast ('-prog'      => $prog,
>                              '-data'      => $db,
>                              '-expect'    => $expect,
>                              'readmethod' => 'SearchIO',         
> #use SearchIO to parse
>                              );
>
> # submit_blast can only currenly handle fasta format files so I'll
> # preprocess outside of the module but I'd rather be sure here
>
> my $input;
> if( $sequenceformat !~ /fasta/ ) {
>     my @seqs;
>     my $seqio = new Bio::SeqIO('-format' => $sequenceformat,
>                    '-file'   => $sequencefile );
>     while( my $seq = $seqio->next_seq() ) {
>     push @seqs, $seq;
>     }
>     $input = \@seqs;
> } else {
>     $input = $sequencefile;
> }
> my $r = $factory->submit_blast($input);
>
> my $v = 1;
> ## set to 0 to turn off Sanity check messages
> print STDERR  "retrieving blasts for $input ...\n" if( $v > 0);
>
> while ( my @rids = $factory->each_rid ) {
>     foreach my $rid ( @rids ) {
>         my $rc = $factory->retrieve_blast($rid);
>         if( !ref($rc) ) {
>             if( $rc < 0 ) {
>                 $factory->remove_rid($rid);
>             }
>             print STDERR "." if ( $v > 0 );
>             sleep 5;
>             } else {
>                 my $result = $rc->next_result();
>                 next unless ($result);
>                  print "\nQuery Name: ", $result->query_name(), "\n";
>                 #save the output
>                 my $filename = $result->query_name()."\.out";
>                 print STDERR  "Saving result to $filename.\n" if 
> ( $v > 0);
>                 $factory->save_output($filename);
>                 $factory->remove_rid($rid);
>                  print "\nQuery Name: ", $result->query_name(), "\n";
>                 while ( my $hit = $result->next_hit ) {
>                     next unless ( $v > 0);
>                             print "\thit name is ", $hit->name, "\n";
>                             while( my $hsp = $hit->next_hsp ) {
>                                   print "\t\tscore is ", $hsp- 
> >score, "\n";
>                               }
>                         }
>                   }
>             }
>           }
>
> I'm guessing either NCBI has changed it blast format and bioperl  
> 1.4 no longer works, or I'm missing something that should be obvious.
>
> Help much apprecieated.
>
> Tom K
>
>
> Thomas J. Keller, Ph.D.
> Director, MMI Core Facility
> Oregon Health & Science University
> 3181 SW Sam Jackson Park Rd.
> Portland, OR, USA,   97239
>
> http://www.ohsu.edu/research/core
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12




More information about the Bioperl-l mailing list