[Bioperl-l] Remote Blast - Blast Human Genome

Chris Fields cjfields at uiuc.edu
Mon Jul 17 21:25:54 UTC 2006


Okay, I think I know a little more now about what's going on with NCBI's
BLAST interface.  It looks like any NCBI BLAST query must use the default URL
and must set up the proper GET/PUT commands to retrieve everything correctly.

Here's the API description for it all:

http://www.ncbi.nlm.nih.gov/BLAST/Doc/urlapi.html
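
As a rough illustration of the Put/Get exchange that page describes, here is a
minimal sketch that only builds the two request URLs and does not contact NCBI.
The base URL is the Blast.cgi address quoted later in this thread; the helper
names (url_encode, build_query, put_url, get_url) and the 'EXAMPLE-RID' value
are made up for illustration, and the parameter set shown is only a small
subset of what the URL API accepts.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Base URL as quoted in this thread; every helper name below is hypothetical.
my $BLAST_URL = 'http://www.ncbi.nlm.nih.gov/BLAST/Blast.cgi';

# Percent-encode everything except unreserved characters (ASCII input assumed).
sub url_encode {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9\-_.~])/sprintf('%%%02X', ord($1))/ge;
    return $text;
}

# Turn an ordered list of key/value pairs into a query string.
sub build_query {
    my @pairs = @_;
    my @parts;
    while (my ($key, $value) = splice(@pairs, 0, 2)) {
        push @parts, url_encode($key) . '=' . url_encode($value);
    }
    return join('&', @parts);
}

# Step 1: a Put request submits the search; the reply carries a request ID (RID).
sub put_url {
    my (%arg) = @_;
    return $BLAST_URL . '?' . build_query(
        CMD      => 'Put',
        PROGRAM  => $arg{program},
        DATABASE => $arg{database},
        QUERY    => $arg{query},
    );
}

# Step 2: a Get request asks for the results belonging to that RID.
sub get_url {
    my ($rid) = @_;
    return $BLAST_URL . '?' . build_query(
        CMD         => 'Get',
        RID         => $rid,
        FORMAT_TYPE => 'Text',
    );
}

print put_url(program => 'blastn', database => 'nr', query => 'ACGTACGT'), "\n";
print get_url('EXAMPLE-RID'), "\n";
```

RemoteBlast does the equivalent of both steps for you (submit_blast, then
retrieve_blast), so this is only meant to show which part of the request the
database name and other form values end up in.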

You could try setting the database to 'snp' or something along those lines
instead of 'nr'; or you could see what the name of the database is when you
use the web form and try setting it to that.  According to this page, this
should be possible:

http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=helpsnpfaq.section.SearchdbSNP_test._Search_dbSNP_Using_B

The Entrez Query limit was a recommendation for limiting your search to a
set of sequences for human, for instance.  
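
To make that concrete, a minimal sketch of setting such a limit through the
package-level HEADER hash follows.  The entrez_limit helper is hypothetical,
and the sketch assumes that RemoteBlast passes every HEADER entry along with
the submitted request; I have not checked that against the current code.
Note that the limit only narrows which sequences of the chosen database are
searched; it does not pick the database.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: join Entrez terms so that all of them must match.
sub entrez_limit {
    my @terms = @_;
    return join(' AND ', @terms);
}

my $limit = entrez_limit('Homo sapiens[Organism]');

# HEADER is the package hash already used in this thread.
# Assumption: its entries are sent as form values when the search is submitted.
$Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'} = $limit;

print 'ENTREZ_QUERY = ',
    $Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'}, "\n";
```

In a real script the assignment would go after the factory is created and
before submit_blast is called.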

I'll try looking into it a bit more, but I'm pretty busy.  If you find
anything out, you should probably post it here.

Chris

> Hi Chris,
> 
> 1. I have tried changing the database to snp or dbSNP, but neither works.
> It seems that, depending on which type of blast you use (i.e., Genome Blast,
> Blast SNP, normal blast such as blastn, etc.), you see a different listing
> of databases available for queries. Since you mention that the Blast page I
> see was generated by Genome, where could I go to see a complete listing of
> databases I can query? Or do you know offhand which database to search if I
> only want dbSNP hits?
> 
> 2. You also mention that I can limit the search by using Entrez terms. Do
> you mean something like:
> $Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'} = 'abc';
> where 'abc' is the name of the subject you want the results restricted to?
> For example, if you set it to 'Homo sapiens[Organism]', then only human
> sequences would be in the hit lists.
> If this is what you mean, what would I change it to in order to see only
> hits from dbSNP?
> 
> Thanks for the ongoing help,
> 
> Rohan
> 
> Quoting Chris Fields <cjfields at uiuc.edu>:
> 
> > I added a method to RemoteBlast in bioperl-live (CVS) if you want to play
> > with changing the URL.  I have been thinking about doing this for a bit
> > now but I already see problems.
> >
> > Here's the issue: the BLAST page you see is NOT the NCBI BLAST page (note
> > the differences in the URL) but a user-friendly request page, generated
> > on the fly by Genome, to submit BLAST requests for the relevant database.
> > So changing the URL will not work (even by adding extra parameters); you
> > only get the original HTML web page.
> >
> > You could try changing the database or limiting the search using an
> > Entrez term (which you should be able to include in the request, probably
> > by adding it to the HEADER).
> >
> > Chris
> >
> > > -----Original Message-----
> > > From: bioperl-l-bounces at lists.open-bio.org
> > > [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of
> > > vrramnar at student.cs.uwaterloo.ca
> > > Sent: Thursday, July 13, 2006 5:39 PM
> > > To: bioperl-l at lists.open-bio.org
> > > Subject: [Bioperl-l] Remote Blast - Blast Human Genome
> > >
> > >
> > > Hello Again,
> > >
> > > I have another question regarding Remote blast but this time using
> > > Genome Blast.
> > >
> > > Here is the link:
> > >
> > >
> > > http://www.ncbi.nlm.nih.gov/genome/seq/BlastGen/BlastGen.cgi?taxid=9606
> > >
> > > which again uses the main Blast web site:
> > >
> > > http://www.ncbi.nlm.nih.gov/BLAST/Blast.cgi
> > >
> > > Again I am not sure what to add or what HEADER information to change
> > > within my script.
> > >
> > > Here is my program, which was the same as the last email:
> > >
> > > #!/usr/bin/perl -w
> > >
> > > use Bio::Perl;
> > > use Bio::Tools::Run::RemoteBlast;
> > >
> > > my $prog = "blastn";
> > > my $db = "refseq_genomic";
> > > my $e_val = 0.01;
> > >
> > > my @params = (	'-prog' => $prog,
> > > 		'-data' => $db,
> > > 		'-expect' => $e_val);
> > >
> > > my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
> > > $Bio::Tools::Run::RemoteBlast::HEADER{'WWW_BLAST_TYPE'} = '????';  # <----- what do I put here?
> > > #$Bio::Tools::Run::RemoteBlast::HEADER{'?????'} = '????';  # <--- Do I need to add any other values to the form inputs?
> > >
> > > $factory->submit_blast("blast.in");
> > > $v = 1;
> > >
> > > while (my @rids = $factory->each_rid)
> > > {  foreach my $rid ( @rids )
> > >    {  my $rc = $factory->retrieve_blast($rid);
> > >       if( !ref($rc) )
> > >       {  if( $rc < 0 )
> > >          {  $factory->remove_rid($rid);
> > >          }
> > >          print STDERR "." if ( $v > 0 );
> > >          sleep 5;
> > >       }
> > >       else
> > >       {  my $result = $rc->next_result();
> > >          my $filename = $result->query_name()."\.out";
> > >          $factory->save_output($filename);
> > >          $factory->remove_rid($rid);
> > >          print "\nQuery Name: ", $result->query_name(), "\n";
> > >       }
> > >    }
> > > }
> > >
> > >
> > > Both of my questions are very similar, in that I know how to use remote
> > > blast but am not sure what to change to access the specific blast I want.
> > >
> > > Again, any help would be very appreciated!!
> > >
> > > Rohan
> > >
> > >
> > >
> > > ----------------------------------------
> > > This mail sent through www.mywaterloo.ca
> > > _______________________________________________
> > > Bioperl-l mailing list
> > > Bioperl-l at lists.open-bio.org
> > > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >
> 
> 
> 
> 



