[Bioperl-l] Trouble using RemoteBlast.pm

Nagesh nagesh.chakka at anu.edu.au
Wed Jan 18 20:37:28 EST 2006


Thanks very much to all, especially to Barry and Hubert, for their time
in answering my query. Here are some updates on my problem.

I have performed some diagnostic tests and am writing my observations
below.

First of all, the problem in the code was that it did not wait for the
results to be ready before writing them to the output file. So I wanted
to check whether the condition "if( !ref($rc) )" is ever satisfied, and
I printed out the $rc value, which was something like
"Bio::SearchIO::blast=HASH(0x9010370)". When I looked at the Bioperl
documentation for RemoteBlast.pm, I understood that
"$rc = $factory->retrieve_blast($rid);" should return either 0 or 1.
I am not able to tell whether what I am getting is right.
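
To make the check above concrete, here is a minimal sketch of the polling
logic on its own, with no network access. MockFactory, MockResult and
wait_for_result are hypothetical names made up for this illustration; they
are not part of Bioperl. The sketch only assumes what the script itself
assumes: a non-reference value means "not ready" (negative meaning an
error), and a reference means a result object came back.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the remote service, used only for illustration.
# It answers 0 ("not ready") a fixed number of times, then returns an object.
package MockFactory;
sub new {
    my ($class, $polls_needed) = @_;
    return bless { polls_left => $polls_needed }, $class;
}
sub retrieve_blast {
    my ($self, $rid) = @_;
    return 0 if $self->{polls_left}-- > 0;        # still searching
    return bless { rid => $rid }, 'MockResult';   # finished: an object
}

package main;

# Poll until a reference comes back, an error code is seen, or we give up.
sub wait_for_result {
    my ($factory, $rid, $max_polls, $delay) = @_;
    for my $attempt (1 .. $max_polls) {
        my $rc = $factory->retrieve_blast($rid);
        return $rc if ref($rc);     # an object: treated as "ready"
        return     if $rc < 0;      # negative value: treated as an error
        sleep $delay if $delay;     # wait before asking again
    }
    return;                         # gave up after $max_polls attempts
}

my $result = wait_for_result(MockFactory->new(2), 'RID-EXAMPLE', 5, 0);
print ref($result) ? "ready: " . ref($result) . "\n" : "not ready\n";
```

With this convention, a printed value such as "Bio::SearchIO::blast=HASH(...)"
is a reference, so the "else" branch of the script is the one that runs.
Capping the number of polls keeps the loop from running forever if the
result never arrives.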

Secondly, I manually forced the script to wait between submit_blast,
retrieve_blast and save_output by using sleep with values ranging from
30 to 600. None of them were successful in saving the output.

When sleep (600) is between submit_blast and retrieve_blast, the
following is printed to standard output (shown below is part of the
output), with the output file still empty.

<P><table>
<tr><td>Request ID</td><td> <b>1137626804-16566-100302560340.BLASTQ4</b></td></tr>
<tr><td>Status</td><td>Searching</td></tr>
<tr><td>Submitted at</td><td>Wed Jan 18 18:26:44 2006</td></tr>
<tr><td>Current time</td><td>Wed Jan 18 18:36:46 2006</td></tr>
<tr><td>Time since submission</td>
<td>00:10:01</td>
</tr><P></table>
<p><hr>This page will be automatically updated in <b>10</b> seconds
until search is done<BR>

When sleep (600) is between retrieve_blast and save_output, the
following is printed with nothing written to output file.

<P><table>
<tr><td>Request ID</td><td> <b>1137632221-28820-85178967709.BLASTQ1</b></td></tr>
<tr><td>Status</td><td>Searching</td></tr>
<tr><td>Submitted at</td><td>Wed Jan 18 19:57:01 2006</td></tr>
<tr><td>Current time</td><td>Wed Jan 18 19:57:03 2006</td></tr>
<tr><td>Time since submission</td>
<td>00:00:01</td>
</tr><P></table>
<p><hr>This page will be automatically updated in <b>10</b> seconds
until search is done<BR>

Please note the difference in time since submission.
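
Both pages above carry a "Status" cell that still says "Searching", so one
way to tell a status page apart from a finished report is to look for that
cell before saving anything. The snippet below is only a sketch of that
idea: page_status is a hypothetical helper written for this message, not a
Bioperl function, and the pattern is based solely on the HTML shown above,
so it may not match other page layouts.

```perl
use strict;
use warnings;

# Return the text of the Status cell from a page like the ones shown above,
# or undef if no such cell is present (for example, in a real BLAST report).
sub page_status {
    my ($text) = @_;
    return $1 if $text =~ m{<td>\s*Status\s*</td>\s*<td>\s*([^<]*?)\s*</td>}i;
    return;
}

# Shortened sample page; the request ID here is a placeholder.
my $page = <<'HTML';
<tr><td>Request ID</td><td> <b>EXAMPLE-RID</b></td></tr>
<tr><td>Status</td><td>Searching</td></tr>
HTML

my $status = page_status($page);
print defined $status ? "status: $status\n" : "no status cell found\n";
```

If the helper returns "Searching" for the text that came back, the search
has not finished and the script should wait and ask again rather than save
the output.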

Lastly, I printed out the request ID and manually paused the script by
using <STDIN> between submit_blast and retrieve_blast. The idea was to
check the status of the job online through the NCBI website. When the
results were ready, I let the script proceed and was able to save the
desired results to the file. I am puzzled by this observation, as I do
not understand why manually formatting the results online helps in
getting the results.

I am basically a molecular biologist trying hard to solve this
computational stuff, so some of these issues might seem trivial to you
computer whizzes :)

Barry suggested that I use the Perl debugger, which I will try.

Thanks for your attention.

Below is the code which was being tested. 

########################################################################

use strict;
use warnings;
use Bio::Tools::Run::RemoteBlast;

print "$Bio::Root::Version::VERSION\n";
my $prog = 'blastp';
my $db   = 'swissprot';
my $e_val= '1e-10';

my @params = ( '-prog' => $prog,
       '-data' => $db,
       '-expect' => $e_val,
       '-readmethod' => 'SearchIO' );

my $factory = Bio::Tools::Run::RemoteBlast->new(@params);

#change a parameter
$Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'} = 'Homo sapiens [ORGN]';

#remove a parameter
delete $Bio::Tools::Run::RemoteBlast::HEADER{'FILTER'};

my $v = 1;
#$v is just to turn on and off the messages

my $r = $factory->submit_blast('blastInput.txt');

print STDERR "waiting..." if( $v > 0 );
while ( my @rids = $factory->each_rid )
{
        foreach my $rid ( @rids )
        {
                print "RID $rid\n";

                #<STDIN>;
                #sleep 600;
                my $rc = $factory->retrieve_blast($rid);

                print "RC $rc\n";
                if( !ref($rc) )
                {
                        if( $rc < 0 )
                        {
                                $factory->remove_rid($rid);
                        }
                        print STDERR "." if ( $v > 0 );
                        sleep 5;
                }
                else
                {
                        sleep 600;
                        $factory->save_output('temp.out');
                        my $checkinput = $factory->file;
                        open(my $fh,"<$checkinput") or die $!;
                        while(<$fh>)
                        {
                                print;
                        }
                        close $fh;
                        $factory->remove_rid($rid);
                }
        }
}

########################################################################


On Tue, 2006-01-17 at 16:03 -0700, Barry Moore wrote:
> Nagesh,
> 
> Attached is an input file, script and output.  These work for me, and I
> think they are the same that you are using.  Have a look and see if you
> can find any differences that might be causing your problem.  Other than
> that I don't know what to tell you.  If you are familiar with the perl
> debugger (and if you're not, now's probably a good time to become
> familiar with it), you should step through your script and be sure that
> all of your objects are getting defined when they are supposed to be.
> That can often help narrow down the problem.
> 
> Barry
> 
> > -----Original Message-----
> > From: Nagesh Chakka [mailto:nagesh.chakka at anu.edu.au]
> > Sent: Tuesday, January 17, 2006 1:57 PM
> > To: Barry Moore
> > Cc: Hubert Prielinger; bioperl-l at bioperl.org
> > Subject: Re: [Bioperl-l] Trouble using RemoteBlast.pm
> > 
> > Hi Barry,
> > With the help of Hubert, I further modified the script but still have
> > the same problem. From the point of submitting the blast query, the
> > script does not wait until the blast results are ready for retrieval:
> > submission is immediately followed by retrieving and saving the
> > output. Since the results will not be ready this fast (within about a
> > second), the output created is blank. I am able to retrieve the
> > results online using the RID, which I am making the script print.
> > So my main problem is making the program wait after submitting the
> > query.
> > My input file has a single FASTA sequence, which I have pasted below.
> > It's interesting to note that the script works on your system. Is it
> > creating an output file with the blast report?
> > Thanks very much for your attention.
> > Regards
> > Nagesh
> > 
> > blastInput.txt
> > >MusDpl
> > MKNRLGTWWVAILCMLLASHLSTVKARGIKHRFKWNRKVLPSSGGQITEARVAENRPGAFIKQGRKLDIDFGAEGNRYYA
> > ANYWQFPDGIYYEGCSEANVTKEMLVTSCVNATQAANQAEFSREKQDSKLHQRVLWRLIKEICSAKHCDFWLERGAAL
> > RVAVDQPAMVCLLGFVWFIVK
> > 
> > On Wednesday 18 January 2006 05:34, Barry Moore wrote:
> > > Nagesh-
> > >
> > > > Did you get this figured out?  Your script works as is on my system.
> > > > You say temp.out is empty?  What does your input sequence
> > > > (blastInput.txt) look like?
> > >
> > > Barry
> > >
> > > > -----Original Message-----
> > > > From: bioperl-l-bounces at portal.open-bio.org [mailto:bioperl-l-
> > > > bounces at portal.open-bio.org] On Behalf Of Hubert Prielinger
> > > > Sent: Monday, January 16, 2006 2:54 PM
> > > > To: Nagesh Chakka; bioperl-l at portal.open-bio.org
> > > > Subject: Re: [Bioperl-l] Trouble using RemoteBlast.pm
> > > >
> > > > Nagesh Chakka wrote:
> > > > >Hi All,
> > > > >I was trying to set up a system to perform a remote blast on a
> > > > >regular basis. I thought this could be best achieved by using a
> > > > >BioPerl module and came across RemoteBlast.pm.
> > > > >I had modified the sample script "bp_remote_blast.pl", which takes
> > > > >a file containing a single FASTA sequence as input. I also wanted
> > > > >the blast report to be saved in a file for later use, and modified
> > > > >the code as follows.
> > > > >I am using the latest version of Bioperl (1.5) on a Fedora platform.
> > > > >
> > > > >#######################################################################
> > > > >
> > > > >print "$Bio::Root::Version::VERSION\n";
> > > > >use Bio::Tools::Run::RemoteBlast;
> > > > >use strict;
> > > > >my $prog = 'blastp';
> > > > >my $db   = 'swissprot';
> > > > >my $e_val= '1e-10';
> > > > >
> > > > >my @params = ( '-prog' => $prog,
> > > > >       '-data' => $db,
> > > > >       '-expect' => $e_val,
> > > > >       '-readmethod' => 'SearchIO' );
> > > > >
> > > > >my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
> > > > >
> > > > >#change a parameter
> > > > >$Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'} = 'Homo sapiens [ORGN]';
> > > > >
> > > > >#remove a parameter
> > > > >delete $Bio::Tools::Run::RemoteBlast::HEADER{'FILTER'};
> > > > >
> > > > >my $v = 1;
> > > > >#$v is just to turn on and off the messages
> > > > >
> > > > >my $r = $factory->submit_blast('blastInput.txt');
> > > > >
> > > > >print STDERR "waiting..." if( $v > 0 );
> > > > >while ( my @rids = $factory->each_rid )
> > > > >{
> > > > >        foreach my $rid ( @rids )
> > > > >        {
> > > > >                my $rc = $factory->retrieve_blast($rid);
> > > > >                if( !ref($rc) )
> > > > >                {
> > > > >                        if( $rc < 0 )
> > > > >                        {
> > > > >                                $factory->remove_rid($rid);
> > > > >                        }
> > > > >                        print STDERR "." if ( $v > 0 );
> > > > >                        sleep 5;
> > > > >                }
> > > > >                else
> > > > >                {
> > > > >                        print "RID $rid\n";
> > > > >                        $factory->save_output('temp.out');
> > > > >                        $factory->remove_rid($rid);
> > > > >                }
> > > > >        }
> > > > >}
> > > > >
> > > > >#######################################################################
> > > > >
> > > > >This script prints the RID and terminates immediately. Obviously the
> > > > >output file created is empty, as the program did not wait for the
> > > > >blast results for the RID.
> > > > >Is there something I am doing wrong, and what can I do to make the
> > > > >program wait until the results are ready to be printed to the
> > > > >output file? I could not get much information from the
> > > > >documentation and have no prior experience with Bioperl.
> > > > >Thanks very much for your attention.
> > > > >Regards
> > > > >Nageshbi
> > > > >_______________________________________________
> > > > >Bioperl-l mailing list
> > > > >Bioperl-l at portal.open-bio.org
> > > > >http://portal.open-bio.org/mailman/listinfo/bioperl-l
> > > >
> > > > hi nagesh,
> > > > try this, should work, I had the same problem:
> > > >
> > > > .......................
> > > > .......................
> > > >
> > > > else
> > > >                 {
> > > >                         print "RID $rid\n";
> > > >                         $factory->save_output('temp.out');
> > > >
> > > >                         my $checkinput = $factory->file;
> > > >                         open(my $fh,"<$checkinput") or die $!;
> > > >                         while(<$fh>){
> > > >                                 print;
> > > >                         }
> > > >                         close $fh;
> > > >
> > > >                         $factory->remove_rid($rid);
> > > >                 }
> > > >         }
> > > > }
> > > >
> > > > regards
> > > > Hubert
> > > >
> > > > PS: are you using the composition based statistics parameter with
> > > > your blast search?
> > > > if yes, is it working?
> > > >


