[Bioperl-l] different results with remote-blast skript

Smithies, Russell Russell.Smithies at agresearch.co.nz
Sun Jul 5 17:00:16 EDT 2009


I'd guess it's a difference in the parameters used.
Interesting that both have the number of letters in the db as "-1,125,070,205", I assume that's a bug  :-)

Stats from your remote_blast:

'stats' => {
             'S1' => '42',
             'S1_bits' => '20.8',
             'lambda' => '0.309',
             'entropy' => '0.345',
             'kappa_gapped' => '0.0410',
             'T' => '11',
             'kappa' => '0.122',
             'X3_bits' => '24.7',
             'X1' => '16',
             'lambda_gapped' => '0.267',
             'X2' => '38',
             'S2' => '74',
             'seqs_better_than_cutoff' => '0',
             'posted_date' => 'Jul 4, 2009  4:41 AM',
             'Hits_to_DB' => '60102303',
             'dbletters' => '-1125070205',
             'A' => '40',
             'num_successful_extensions' => '2004',
             'num_extensions' => '1436892',
             'X1_bits' => '7.1',
             'X3' => '64',
             'entropy_gapped' => '0.140',
             'dbentries' => '9252258',
             'X2_bits' => '14.6',
             'S2_bits' => '33.1'
           }


Stats from a blast done on the NCBI webpage:

  Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
excluding environmental samples from WGS projects
    Posted date:  Jul 4, 2009  4:41 AM
  Number of letters in database: -1,125,070,205
  Number of sequences in database:  9,252,258

Lambda     K      H
   0.309    0.124    0.340 
Gapped
Lambda     K      H
   0.267   0.0410    0.140 
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 9252258
Number of Hits to DB: 86493230
Number of extensions: 3101413
Number of successful extensions: 9001
Number of sequences better than 100: 65
Number of HSP's better than 100 without gapping: 0
Number of HSP's gapped: 9000
Number of HSP's successfully gapped: 66
Length of query: 150
Length of database: 3169897087
Length adjustment: 113
Effective length of query: 37
Effective length of database: 2124391933
Effective search space: 78602501521
Effective search space used: 78602501521
T: 11
A: 40
X1: 16 (7.1 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (20.8 bits)
S2: 65 (29.6 bits)


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Jonas Schaer
> Sent: Sunday, 28 June 2009 10:15 p.m.
> To: BioPerl List
> Subject: [Bioperl-l] different results with remote-blast skript
> 
> Hi again :)
> please, I only have this little question:
> why do I get different results with my remote::blast perl skript then on the
> ncbi blast homepage?
> I am using blastp, the query is an amino-sequence (different results with any
> sequence, differences not only in number of hits but even in e-values, scores
> etc...), the database is 'nr'.
> PLEASE help me,
> thank you in advance,
> Jonas
> 
> ps: my skript:
> ##############################################################################
> ##
> use Bio::Seq::SeqFactory;
>   use Bio::Tools::Run::RemoteBlast;
>   use strict;
>   my @blast_report;
>   my $prog = 'blastp';
>   my $db   = 'nr';
>   my $e_val= '1e-10';
>   #my $e_val= '10';
>   my @params = ( '-prog' => $prog,
>          '-data' => $db,
>          '-expect' => $e_val,
>          '-readmethod' => 'SearchIO' );
>   my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
>    $Bio::Tools::Run::RemoteBlast::HEADER{'GAPCOSTS'} = '11 1';
>    $Bio::Tools::Run::RemoteBlast::HEADER{'MAX_NUM_SEQ'} = '100';
>  $Bio::Tools::Run::RemoteBlast::HEADER{'EXPECT'} = '10';
> $Bio::Tools::Run::RemoteBlast::HEADER{'COMPOSITION_BASED_STATISTICS'} = '1';
> 
>   my
> $blast_seq='MGSSSVGTYHLLLVLMGAGGEQQAVQAGAEVASTEQVDGSGMAANSRGSTSGSEQPPRDSDLGLLR
> SLLDVAGVDRTALEVKLLALAEAGAEMPPAQDSQATAAGVVATLTSVYRQQVARAWHERDDNAFRQAHQNTAMATGPD
> PDDEYE';
>   #$v is just to turn on and off the messages
>   my $v = 1;
>   my $seqbuilder = Bio::Seq::SeqFactory->new('-type' => 'Bio::PrimarySeq');
>   my $seq = $seqbuilder->create(-seq =>$blast_seq, -display_id =>
> "$blast_seq");
>   my $filename='temp2.out';
>   my $r = $factory->submit_blast($seq);
>   print STDERR "waiting..." if( $v > 0 );
>     while ( my @rids = $factory->each_rid )
>     {
>         foreach my $rid ( @rids )
>         {
>             my $rc = $factory->retrieve_blast($rid);
>             if( !ref($rc) )
>             {
>                 if( $rc < 0 )
>                 {
>                     $factory->remove_rid($rid);
>                 }
>                 print STDERR "." if ( $v > 0 );
>             }
>                 else
>                 {
>                     my $result = $rc->next_result();
>                     $factory->save_output($filename);
>                     $factory->remove_rid($rid);
>                     print "\nQuery Name: ", $result->query_name(), "\n";
>                     while ( my $hit = $result->next_hit )
>                     {
>                         next unless ( $v > 0);
>                         print "\thit name is ", $hit->name, "\n";
>                         while( my $hsp = $hit->next_hsp )
>                         {
>                             print "\t\tscore is ", $hsp->score, "\n";
>                         }
>                     }
>                 }
>         }
> 
> 
>     }
> @blast_report = get_file_data ($filename);
> return @blast_report;
> ##############################################################################
> ####
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================



More information about the Bioperl-l mailing list