[Bioperl-l] zebrafish est database

Jason Stajich jason@cgt.mc.duke.edu
Tue, 4 Jun 2002 11:23:32 -0400 (EDT)

We're just using the NCBI web interface.

So we have set it up to easily blast the dbs listed on that page, but it
takes a little more digging to incorporate a genome db that is not there.
After some digging at the Dr (Danio rerio, i.e. zebrafish) blast site I
found a couple of variables you'll need to set to tell blast to use the
correct db.

Looking at the form HTML from the Dr blast site
you can find some special variables which tell the CGI to use the Dr
specific dbs:
<input type="hidden" name="DB_DIR_PREFIX" value="dr_genome">
<input type="hidden" name="DATABASE" value="dr_mrna">
<input type="hidden" name="CMD" value="PUT">
<input type="hidden" name="AUTO_FORMAT" value="on">
<input type="hidden" name="WWW_BLAST_TYPE" value="dr_genes">
so the DB_DIR_PREFIX needs to go in, and I'm going to guess from the above
that the est db name is dr_est (mRNA was the selected value from the
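
If you need to do the same digging for another organism's form, the hidden
fields can be pulled out of the saved page source mechanically. The following
is only a minimal sketch: hidden_fields() is a hypothetical helper (not part
of bioperl), the HTML is the snippet quoted above, and the regex assumes
double-quoted attributes, so a real HTML parser would be more robust.

```perl
#!/usr/bin/perl -w
use strict;

# Form snippet copied from the message above; in practice you would
# read the saved page source from a file instead.
my $html = <<'HTML';
<input type="hidden" name="DB_DIR_PREFIX" value="dr_genome">
<input type="hidden" name="DATABASE" value="dr_mrna">
<input type="hidden" name="CMD" value="PUT">
<input type="hidden" name="AUTO_FORMAT" value="on">
<input type="hidden" name="WWW_BLAST_TYPE" value="dr_genes">
HTML

# hidden_fields() is a hypothetical helper, not part of bioperl.
# It assumes double-quoted attributes on each <input> tag.
sub hidden_fields {
    my ($text) = @_;
    my %field;
    while ( $text =~ /<input\b([^>]*)>/gi ) {
        my $attrs = $1;
        # only hidden inputs are of interest here
        next unless $attrs =~ /\btype="hidden"/i;
        my ($name)  = $attrs =~ /\bname="([^"]*)"/i;
        my ($value) = $attrs =~ /\bvalue="([^"]*)"/i;
        $field{$name} = $value if defined $name && defined $value;
    }
    return %field;
}

# Print the name/value pairs the CGI expects, one per line
my %field = hidden_fields($html);
print "$_ = $field{$_}\n" for sort keys %field;
```

Any field that is not already among the parameters RemoteBlast sends is a
candidate for the extra header entries used in the script further down.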

<rant> Isn't reverse engineering fun - wouldn't this all be easier if
there was a nice programmable API to easily submit blast jobs rather than
having to deduce it from HTML? In the same breath I would also suggest
being a good citizen and downloading the est db and blasting on your own
hardware if you are doing this on a regular basis - NCBI isn't the world's
BLAST farm.  If you've got the machines in your lab, do it locally
whenever possible.  NCBI provides blast software for most operating
systems and we've tried to provide wrappers in bioperl to make this easier
(although it may not work ideally on Windows just yet - but I'm sure someone
will step up to the plate and rework the code). </rant>

So here is updated code to let you blast against Dr.  The >>> lines are
the ones that are special to the Dr blast; otherwise this should look very
familiar from the SYNOPSIS of the RemoteBlast code.

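#!/usr/bin/perl -w
use strict;
use Bio::Tools::Run::RemoteBlast;
use Bio::SeqIO;   # needed for the Bio::SeqIO->new call below
my $v = 1;        # verbosity: > 0 prints progress dots and hit details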
my $prog = 'blastn';
>>> my $db   = 'dr_est';
my $e_val= '1e-10';

my @params = ( '-prog' => $prog,
	       '-data' => $db,
	       '-expect' => $e_val );
>>> $Bio::Tools::Run::RemoteBlast::HEADER{'DB_DIR_PREFIX'} = 'dr_genome';
>>> $Bio::Tools::Run::RemoteBlast::HEADER{'WWW_BLAST_TYPE'} = 'dr_genes';

my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
$v = 1;
# ab077698 is a test human mRNA I was using, obviously whatever
# you want to use here is best
my $str = Bio::SeqIO->new(-file=>'ab077698.fa' , '-format' => 'fasta' );
my $input = $str->next_seq();

#  Blast a sequence against a database:
my $r = $factory->submit_blast($input);
print STDERR "waiting..." if( $v > 0 );
while ( my @rids = $factory->each_rid ) {
    foreach my $rid ( @rids ) {
        my $rc = $factory->retrieve_blast($rid);
        if( !ref($rc) ) {
            # a negative return code means the request failed: drop the RID
            if( $rc < 0 ) {
                $factory->remove_rid($rid);
            }
            print STDERR "." if ( $v > 0 );
            sleep 5;
        } else {
            # results are in: stop polling this RID
            $factory->remove_rid($rid);
            my $result = $rc->next_result;
            print "db is ", $result->database_name(), "\n";
            my $count = 0;
            while( my $hit = $result->next_hit ) {
                $count++;
                next unless ( $v > 0);
                print "hit name is ", $hit->name, "\n";
                while( my $hsp = $hit->next_hsp ) {
                    print "score is ", $hsp->score, "\n";
                }
            }
        }
    }
}
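
The >>> lines rely on %HEADER being a package-level hash that the module
consults when it builds the form it posts. The toy below only illustrates
that pattern; Toy::FormClient and form_fields() are made up for the sketch
and are not the RemoteBlast implementation, so treat the details as
assumptions.

```perl
#!/usr/bin/perl -w
use strict;

# Toy::FormClient is a made-up stand-in used only for this sketch;
# it is not the RemoteBlast implementation.
package Toy::FormClient;

# Package-level defaults, visible to callers as %Toy::FormClient::HEADER.
our %HEADER = ( 'CMD' => 'Put', 'DATABASE' => '', 'PROGRAM' => '' );

sub new {
    my ( $class, %args ) = @_;
    my $self = { %args };
    return bless $self, $class;
}

# Copy the package defaults at request time, then layer the
# per-object values on top of them.
sub form_fields {
    my ($self) = @_;
    my %form = %HEADER;
    $form{'DATABASE'} = $self->{'-data'} if defined $self->{'-data'};
    $form{'PROGRAM'}  = $self->{'-prog'} if defined $self->{'-prog'};
    return %form;
}

package main;

# Same pattern as the >>> lines: extra keys set on the package hash ...
$Toy::FormClient::HEADER{'DB_DIR_PREFIX'}  = 'dr_genome';
$Toy::FormClient::HEADER{'WWW_BLAST_TYPE'} = 'dr_genes';

my $client = Toy::FormClient->new( '-prog' => 'blastn', '-data' => 'dr_est' );
my %form   = $client->form_fields;

# ... show up in every form built afterwards.
print "$_=$form{$_}\n" for sort keys %form;
```

One consequence of this design is that the extra keys are global: every
object created afterwards in the same process sends them too, so a script
that mixes Dr and non-Dr searches would need to delete them again in between.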

On Sun, 2 Jun 2002, james priest wrote:

> Does ::RemoteBlast have the capability to use any of the genomic blast
> databases at ncbi? I've been trying to blast against zebrafish ESTs by using
> any database name that I can find to no avail. Any suggestions?
> james
> --
> James Priest
> Eddy Rubin's Lab
> Lawrence Berkeley National Lab
> 1 Cyclotron Rd, MS 84-171
> Berkeley, CA 94720
> jpriest@uclink.berkeley.edu
> 510-486-7498
> 510-486-4229 Fax

Jason Stajich
Duke University
jason at cgt.mc.duke.edu