[Bioperl-l] Remote Blast - Blast Human Genome

Chris Fields cjfields at uiuc.edu
Thu Jul 20 23:02:08 UTC 2006


Nice to know!  I'll add this to the wiki.

Chris
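
For anyone reading this in the archive: a rough sketch of where the
database name from Malcolm's message below would sit in a URL API "Put"
request. It only builds the query string and does not contact NCBI. The
parameter names (CMD, PROGRAM, DATABASE, EXPECT, QUERY) are taken from
the URL API document linked further down; the helper names uri_encode
and build_put_query are made up for this illustration, the query
sequence is a placeholder, and the database name is as reported in 2006,
so it may no longer be valid.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Percent-encode a value for use in a URL query string.
# Everything except RFC 3986 unreserved characters is escaped.
# (Hypothetical helper; assumes plain ASCII input.)
sub uri_encode {
    my ($value) = @_;
    $value =~ s/([^A-Za-z0-9\-_.~])/sprintf("%%%02X", ord($1))/ge;
    return $value;
}

# Build the query string for a URL API "Put" (submit) request.
# Takes an ordered list of key/value pairs so the output order is stable.
# (Hypothetical helper, not part of BioPerl.)
sub build_put_query {
    my @pairs = @_;
    my @parts;
    while (my ($key, $value) = splice(@pairs, 0, 2)) {
        push @parts, $key . '=' . uri_encode($value);
    }
    return join('&', @parts);
}

my $query = build_put_query(
    CMD      => 'Put',
    PROGRAM  => 'blastn',
    DATABASE => 'snp/human/human_snp',   # name reported in this thread (2006)
    EXPECT   => '0.01',
    QUERY    => 'ACGTACGTACGTACGTACGT',  # placeholder sequence, not real data
);

print "$query\n";
```

Note that the slashes in the database name get escaped as %2F. With
RemoteBlast itself, the same name would presumably just be passed as the
'-data' parameter, as in Rohan's script quoted below.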

On Jul 20, 2006, at 5:40 PM, Cook, Malcolm wrote:

> Rohan,
>
> 'snp/human/human_snp' is the database name you need to use to blast
> against the human SNP database at NCBI.
>
> See the following document for the full list (the link was provided
> to me via personal correspondence with the NCBI helpdesk).  Very useful...
>
> Hmm, looking again, there now appear to be two versions:
>
> http://www.ncbi.nlm.nih.gov/staff/tao/URLAPI/blastdblist.html (last
> updated 2/7/2006)
> http://www.ncbi.nlm.nih.gov/staff/tao/URLAPI/remote_accessible_blastdblist.html
> (last updated 5/29/2006)
>
> Neither is linked to by any other document on the internet (google
> sez), including anywhere else at NCBI.  Go figure.  They should be,
> IMHO, since this info is collected nowhere else.
>
> Of course it may be out of date, but it has always got me through.
>
> Good luck
>
> Malcolm Cook - mec at stowers-institute.org - 816-926-4449
> Database Applications Manager - Bioinformatics
> Stowers Institute for Medical Research - Kansas City, MO  USA
>
>
>
>> -----Original Message-----
>> From: bioperl-l-bounces at lists.open-bio.org
>> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Chris  
>> Fields
>> Sent: Monday, July 17, 2006 4:26 PM
>> To: vrramnar at student.cs.uwaterloo.ca; bioperl-l at lists.open-bio.org
>> Subject: Re: [Bioperl-l] Remote Blast - Blast Human Genome
>>
>> Okay, I think I know a little more now about what's going on with
>> NCBI's BLAST interface.  It looks like any NCBI BLAST query must use
>> the default URL, and so must set up the proper GET/PUT commands to
>> retrieve everything correctly.
>>
>> Here's the API description for it all:
>>
>> http://www.ncbi.nlm.nih.gov/BLAST/Doc/urlapi.html
>>
>> You could try setting the database to 'snp' or something along those
>> lines instead of 'nr'; or you could see what the name of the database
>> is when you use the web form and try setting it to that.  According
>> to this page, this should be possible:
>>
>> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=helpsnpfaq.section.SearchdbSNP_test._Search_dbSNP_Using_B
>>
>> The Entrez Query limit was a recommendation for limiting your search
>> to a set of sequences, human ones for instance.
>>
>> I'll try looking into it a bit more, but I'm pretty busy.  If you find
>> anything out, you should probably post it here.
>>
>> Chris
>>
>>> Hi Chris,
>>>
>>> 1. I have tried changing the database to snp or dbSNP, but neither
>>> works.  It seems that depending on which type of blast you use (i.e.,
>>> Genome Blast, Blast SNP, normal blast such as blastn, etc.) you see a
>>> different listing of databases available for queries.  Since you
>>> mention that the Blast page I see was generated by Genome, where
>>> could I go to see a complete listing of databases I can query?  Or do
>>> you know offhand which database to search if I only want dbSNP hits?
>>>
>>> 2. You also mention that I can limit the search by using Entrez
>>> terms.  Do you mean something like:
>>> $Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'} = 'abc';
>>> where 'abc' names the subject whose results you want to see?  For
>>> example, if you set it to 'Homo sapiens[Organism]', then only human
>>> sequences would be in the hit lists.  If this is what you mean, what
>>> would I change it to in order to see only hits from dbSNP?
>>>
>>> Thanks for the ongoing help,
>>>
>>> Rohan
>>>
>>> Quoting Chris Fields <cjfields at uiuc.edu>:
>>>
>>>> I added a method to RemoteBlast in bioperl-live (CVS) if you want to
>>>> play with changing the URL.  I have been thinking about doing this
>>>> for a bit now, but I already see problems.
>>>>
>>>> Here's the issue: the BLAST page you see is NOT the NCBI BLAST page
>>>> (note the differences in the URL) but a user-friendly request page,
>>>> generated on the fly by Genome, to submit BLAST requests for the
>>>> relevant database.  So changing the URL will not work (even by
>>>> adding extra parameters); you only get the original HTML web page.
>>>>
>>>> You could try changing the database or limiting the search using an
>>>> Entrez term (which you should be able to include in the request,
>>>> probably by adding it to the HEADER).
>>>>
>>>> Chris
>>>>
>>>>> -----Original Message-----
>>>>> From: bioperl-l-bounces at lists.open-bio.org
>>>>> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of
>>>>> vrramnar at student.cs.uwaterloo.ca
>>>>> Sent: Thursday, July 13, 2006 5:39 PM
>>>>> To: bioperl-l at lists.open-bio.org
>>>>> Subject: [Bioperl-l] Remote Blast - Blast Human Genome
>>>>>
>>>>>
>>>>> Hello Again,
>>>>>
>>>>> I have another question regarding Remote Blast, but this time using
>>>>> Genome Blast.
>>>>>
>>>>> Here is the link:
>>>>>
>>>>> http://www.ncbi.nlm.nih.gov/genome/seq/BlastGen/BlastGen.cgi?taxid=9606
>>>>>
>>>>> which again uses the main Blast web site:
>>>>>
>>>>> http://www.ncbi.nlm.nih.gov/BLAST/Blast.cgi
>>>>>
>>>>> Again, I am not sure what to add or what HEADER information to change
>>>>> within my script.
>>>>>
>>>>> Here is my program, which was the same as the last email:
>>>>>
>>>>> #!/usr/bin/perl -w
>>>>>
>>>>> use Bio::Perl;
>>>>> use Bio::Tools::Run::RemoteBlast;
>>>>>
>>>>> my $prog = "blastn";
>>>>> my $db = "refseq_genomic";
>>>>> my $e_val = 0.01;
>>>>>
>>>>> my @params = ( '-prog'   => $prog,
>>>>>                '-data'   => $db,
>>>>>                '-expect' => $e_val );
>>>>>
>>>>> # Build the remote BLAST factory
>>>>> my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
>>>>> # <--- what do I put here?
>>>>> $Bio::Tools::Run::RemoteBlast::HEADER{'WWW_BLAST_TYPE'} = '????';
>>>>> # <--- Do I need to add any other values to the form inputs?
>>>>> #$Bio::Tools::Run::RemoteBlast::HEADER{'?????'} = '????';
>>>>>
>>>>> $factory->submit_blast("blast.in");
>>>>> my $v = 1;   # verbosity flag: print progress dots while waiting
>>>>>
>>>>> while (my @rids = $factory->each_rid)
>>>>> {  foreach my $rid ( @rids )
>>>>>    {  my $rc = $factory->retrieve_blast($rid);
>>>>>       if( !ref($rc) )
>>>>>       {  if( $rc < 0 )
>>>>>          {  $factory->remove_rid($rid);
>>>>>          }
>>>>>          print STDERR "." if ( $v > 0 );
>>>>>          sleep 5;
>>>>>       }
>>>>>       else
>>>>>       {  my $result = $rc->next_result();
>>>>>          my $filename = $result->query_name()."\.out";
>>>>>          $factory->save_output($filename);
>>>>>          $factory->remove_rid($rid);
>>>>>          print "\nQuery Name: ", $result->query_name(), "\n";
>>>>>       }
>>>>>    }
>>>>> }
>>>>>
>>>>>
>>>>> Both of my questions are very similar, in that I know how to use
>>>>> remote blast but am not sure what to change to access the specific
>>>>> blast I want.
>>>>>
>>>>> Again, any help would be very appreciated!!
>>>>>
>>>>> Rohan
>>>>>
>>>>>
>>>>>
>>>>> ----------------------------------------
>>>>> This mail sent through www.mywaterloo.ca
>>>>> _______________________________________________
>>>>> Bioperl-l mailing list
>>>>> Bioperl-l at lists.open-bio.org
>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>
>>>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign





