[Bioperl-l] recovering blast query_name

Wiepert, Mathieu Wiepert.Mathieu@mayo.edu
Wed, 20 Nov 2002 15:57:14 -0600


Sorry for all the posts :-| The latest RemoteBlast has moved to bioperl-run, not bioperl-live:

http://doc.bioperl.org/bioperl-run/
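
To check whether a saved report actually carries the query name (the "Query=" line in the header quoted below), something like this minimal sketch in plain Perl might help. It does not use BioPerl at all; query_from_report is a made-up helper name, and it assumes the name and description sit on a single "Query=" line, which may not hold when a long description wraps onto the next line.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): pull the query name and
# description out of the "Query=" line of a saved plain-text BLAST report.
# Returns an empty list when no "Query=" line is present.
sub query_from_report {
    my ($report) = @_;

    # Capture the text after "Query=" on that line, trimming spaces and tabs.
    my ($line) = $report =~ /^Query=[ \t]*(.*?)[ \t]*$/m
        or return;

    # The first whitespace-delimited token is taken as the name,
    # everything after it as the description.
    my ($name, $desc) = split /\s+/, $line, 2;
    return ($name, defined $desc ? $desc : '');
}

# Header lines copied from the report quoted in this thread.
my $report = <<'END';
BLASTN 2.2.4 [Aug-26-2002]

RID: 1033569396-029169-20578
Query= U20499_EXON_1A 2848-2960 of U20499
         (113 letters)
END

my ($name, $desc) = query_from_report($report);
print defined $name
    ? "name: $name\ndesc: $desc\n"
    : "no Query= line found\n";
```

If this prints "no Query= line found" for a report saved from the remote blast, the name never made it into the submission, so the parser has nothing to recover.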


-Mat

Mathieu Wiepert
Medical Informatics Research
Mayo Foundation
(507) 266-2317 Fax (507)-284-0360
wiepert.mathieu@mayo.edu

> -----Original Message-----
> From: Wiepert, Mathieu 
> Sent: Wednesday, November 20, 2002 3:51 PM
> To: 'Lewis Lukens'; bioperl-l@bioperl.org
> Subject: RE: [Bioperl-l] recovering blast query_name
> 
> 
> Hi,
> 
> I made a few assumptions in the previous answer, sorry. You 
> need bioperl-live to get that to work; I don't think it is in 
> the 1.0.2 distro.  
> 
> Additionally, I only tested with fasta files. I assume that 
> anything else will still work, as long as the sequence has a 
> description.  The query name is built up like this:
> 
> 	$header{'QUERY'} = ">"
> 		. (defined $seq->display_id() ? $seq->display_id() : "")
> 		. " "
> 		. (defined $seq->desc() ? $seq->desc() : "")
> 		. "\n" . $seq->seq();
> 
> so, the sequences have to have a display id and description 
> to get a query name?
> 
> 
> My previous example was only slightly off; I left out the 
> description. 
> 
> >U20499_EXON_1A 2848-2960 of U20499
> acactggaccttcaaaaccctcagggcagagagcagccctacactccctacaccacaccc
> atactcagcccctgcaggcaaggagagaacaggtcaggttcccgagagctcag
> 
> results in query name of
> U20499_EXON_1A 2848-2960 of U20499
> 
> parsed from the header of this blast result (saved from the 
> remote blast)
> 
> BLASTN 2.2.4 [Aug-26-2002]
> 
> 
> Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
> Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
> "Gapped BLAST and PSI-BLAST: a new generation of protein database search
> programs",  Nucleic Acids Res. 25:3389-3402.
> RID: 1033569396-029169-20578
> Query= U20499_EXON_1A 2848-2960 of U20499
>          (113 letters)
> 
> Database: All GenBank+EMBL+DDBJ+PDB sequences (but no EST, STS, GSS,
> or phase 0, 1 or 2 HTGS sequences) 
>            1,406,693 sequences; 6,799,009,920 total letters
> 
> Check the actual blast results and make sure they have the 
> query name in them. If they don't, then we have a problem...
> 
> Here is the more current documentation 
> http://doc.bioperl.org/bioperl-live/Bio/Tools/Run/RemoteBlast.html
> 
> -Mat
> 
> 
> > -----Original Message-----
> > From: Lewis Lukens [mailto:llukens@uoguelph.ca]
> > Sent: Wednesday, November 20, 2002 2:49 PM
> > To: bioperl-l@bioperl.org
> > Subject: [Bioperl-l] recovering blast query_name
> > 
> > 
> > Hello,
> > 
> > Sorry for a basic question... I have been trying to use the 
> > Bio::Tools::Run::RemoteBlast module to blast a single file with many 
> > fasta formatted sequences against ncbi nt and parse the blast reports. 
> > Almost everything is working well.  I get all the hit and hsp 
> > features for all the hits.  I can recover the query sequence, but I 
> > can't seem to recover the query sequence names.  How does one do this?
> > 
> > I used almost the exact code from the RemoteBlast synopsis:
> > http://doc.bioperl.org/releases/bioperl-1.0.2/Bio/Tools/Run/RemoteBlast.html
> > 
> > In this code, this expression works:
> > print "db is ", $result->database_name(), "\n";
> > 
> > but these expressions return empty fields:
> >      my $name = $result->query_name();
> >      my $desc = $result->query_description();
> >      my $acc  = $result->query_accession();
> > 
> > I have been using SearchIO to parse blast output files and never had 
> > this problem before.  Any ideas?
> > 
> > Thanks much,
> > Lewis
> > -- 
> > Lewis Lukens
> > Assistant Professor
> > Department of Plant Agriculture
> > Univ. of Guelph, Guelph, Ontario. N1G 2W1
> > 
> > Tel: (519) 824- 4120 ext 2304