[Bioperl-l] bioperl 1.4 SearchIO doesn't work parsing Blast output

Chris Fields cjfields at uiuc.edu
Thu Feb 9 16:16:28 UTC 2006


> -----Original Message-----
> From: Jason Stajich [mailto:jason.stajich at duke.edu] 
> Sent: Thursday, February 09, 2006 9:13 AM
> To: Hubert Prielinger
> Cc: Chris Fields; bioperl-l at bioperl.org
> Subject: Re: [Bioperl-l] bioperl 1.4 SearchIO doesn't work 
> parsing Blast output
> 
> On Feb 8, 2006, at 4:41 PM, Hubert Prielinger wrote:
> > hi chris,
> > thanks, I have upgraded to version 1.5.1 but it still isn't working.
> > Do you have any other idea? The problem I have is that I have to parse
> > a lot of text files....
> > or shall I look for another option to parse those files...
> >
> > regards
> > Hubert
> 
> 
> The code from Bioperl 1.5.1 works fine for me for blast 
> 2.2.13 reports but unless you post your blast report we can't 
> really determine the problem.
> 
> If you are still getting the same error like this I am not 
> convinced you have upgraded to 1.5.1, which includes a fix for 
> the fact that NCBI changed the HSP result format to remove 
> the ':' from the Query/Sbjct prefixes.  We fixed this as soon 
> as it was apparent, sometime in September.
> 
> >>> MSG: no data for midline Query  1   WWWKWRW  7
> >>> STACK Bio::SearchIO::blast::next_result
> >>> /usr/lib/perl5/site_perl/5.8.6/Bio/SearchIO/blast.pm:1151
> >>> STACK toplevel
> >>> /home/Hubert/installed/eclipse/workspace/Database_Search/Blast.pl:21
> 
> If you are just getting no results but also no warnings wrt 
> parsing, are you sure your logic is correct?
> 
> If you remove your filters do you see all the HSPS?
> 
> 
> while (my $result = $search->next_result) {
>     print $result->query_name, "\n";
>     # iterate over each hit on the query sequence
>     while (my $hit = $result->next_hit) {
>         print $hit->name, "\n";
>         # iterate over each HSP in the hit
>         while (my $hsp = $hit->next_hsp) {
>             print $hsp->evalue, " ", $hsp->length('sbjct'), " ",
>                 $hsp->hit_string, "\n";
>         }
>     }
> }

I tested some of the BLAST results that Hubert sent Roger and me with a
similar script to the above.  I removed the file parsing logic and it seemed
to work just fine.  It may very well be a logic issue or that he hasn't
installed the latest fix.
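
For reference, the format change Jason describes (the ':' dropped from the
Query/Sbjct prefixes) can be illustrated with a small standalone sketch.  This
is not the code in Bio::SearchIO::blast; parse_alignment_line is a
hypothetical helper, and the old-style sample line is an assumed example of
the pre-change layout:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): split one alignment line
# into its parts.  The optional ':' lets the same pattern accept both
# the older "Query: 1 ..." and the newer "Query  1 ..." prefixes.
sub parse_alignment_line {
    my ($line) = @_;
    if ($line =~ /^(Query|Sbjct):?\s+(\d+)\s+(\S+)\s+(\d+)\s*$/) {
        return { type => $1, start => $2, seq => $3, end => $4 };
    }
    return;    # not an alignment line
}

# First line: assumed old-style layout; second: the line from the error above
for my $line ('Query: 1   WWWKWRW 7', 'Query  1   WWWKWRW  7') {
    my $parts = parse_alignment_line($line);
    print "$parts->{type} $parts->{start}-$parts->{end} $parts->{seq}\n";
}
```

A parser that insists on the ':' would skip the second line entirely, which is
consistent with the "no data for midline" message on newer reports.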
    
It's a funny thing, though.  When I tried using blastcl3 (v. 2.2.13), even
though the returned output was from nr, the top of the blast output showed
that it was v2.2.12:  

BLASTP 2.2.12 [Aug-07-2005]

I double-checked my local version and it's definitely v.2.2.13:
-------------------------------------
C:\Perl\Scripts>blastcl3 -

blastcl3 2.2.13   arguments:...
-------------------------------------

If you use RemoteBlast using the same settings, the version in the header
looks like this:

BLASTP 2.2.13 [Nov-27-2005]

I'm wondering if all the blast executables (blast and netblast) from NCBI
have text output like v.2.2.12, while the wwwblast outputs a new format
(2.2.13).  I'll ask blast-help at NCBI about this.
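
To compare what a batch of reports claims in its header, something like the
following rough sketch could be used.  It assumes the header line has the
"PROGRAM version [date]" shape shown above; blast_header_version is a
hypothetical helper, not a BioPerl method:

```perl
use strict;
use warnings;

# Hypothetical helper: pull program, version and date out of the first
# line of a text BLAST report, assuming "PROGRAM x.y.z [date]".
sub blast_header_version {
    my ($line) = @_;
    return unless $line =~ /^(T?BLAST[NPX])\s+(\d+(?:\.\d+)+)\s+\[([^\]]+)\]/;
    return ($1, $2, $3);
}

# The two header lines quoted in this message
for my $header ('BLASTP 2.2.12 [Aug-07-2005]', 'BLASTP 2.2.13 [Nov-27-2005]') {
    my ($program, $version, $date) = blast_header_version($header);
    print "$program reports version $version ($date)\n";
}
```

Running this over the first line of each report would show which header
version each source (blastcl3, RemoteBlast, local blastall) actually emits.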

> 
> To clarify some stuff -
> Chris I don't necessarily think the XML is best way forward 
> for BLAST reports generated locally, it isn't as detailed as 
> the Text format and it is what most people expect to be able 
> to scroll through and parse -- it is also harder for the 
> format to change dramatically if you have a static binary on 
> your machine =).  I think for remoteblast the XML format 
> should be the way forward but I expect Bioperl to maintain 
> support of any plain text BLAST report format that people use 
> on a regular basis.
> 

Does XML lack some specific info that text output has?  Didn't know that.  I
believe that XML should be the default in RemoteBlast since it will not break,
but I agree with you about text output.  I also agree that it will need
somebody to maintain it constantly, much like RemoteBlast.

> -jason
> >
> >
> > Chris Fields wrote:
> >
> >> My guess is you're running into text parsing problems in 
> >> Bio::SearchIO::blast.  Upgrade to the latest developer version
> >> (1.5.1) or
> >> bioperl-live (CVS), then see the bug below.
> >>
> >> http://bugzilla.bioperl.org/show_bug.cgi?id=1934
> >>
> >> I think the first problem you ran into is solved in bioperl 1.5.1, 
> >> the last problem (more recent, not related to the first) has been 
> >> fixed but hasn't been committed to bioperl-live yet.  The fixed 
> >> SearchIO::blast is available in the link above, but realize it hasn't 
> >> been committed yet and may change.
> >>
> >> Christopher Fields
> >> Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry 
> >> University of Illinois Urbana-Champaign
> >>
> >>
> >>
> >>> -----Original Message-----
> >>> From: bioperl-l-bounces at lists.open-bio.org
> >>> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Hubert 
> >>> Prielinger
> >>> Sent: Wednesday, February 08, 2006 2:52 PM
> >>> To: bioperl-l at bioperl.org
> >>> Subject: [Bioperl-l] bioperl 1.4 SearchIO doesn't work parsing Blast 
> >>> output
> >>>
> >>> Hi,
> >>> If I want to parse a Blast Output (Version 2.2.12) with 
> >>> Bio::SearchIO, I get the following error message:
> >>>
> >>> MSG: no data for midline Query  1   WWWKWRW  7
> >>> STACK Bio::SearchIO::blast::next_result
> >>> /usr/lib/perl5/site_perl/5.8.6/Bio/SearchIO/blast.pm:1151
> >>> STACK toplevel
> >>> /home/Hubert/installed/eclipse/workspace/Database_Search/Blast.pl:21
> >>>
> >>> is that a bug......
> >>>
> >>> If I want to parse Blast Output (version 2.2.13), I don't get 
> >>> anything.....
> >>> I'm using bioperl 1.4
> >>>
> >>> before, I have installed bioperl 1.4, it worked fine parsing Blast 
> >>> Output (version 2.2.12), but I don't remember which bioperl version 
> >>> I had installed
> >>>
> >>> thanks in advance
> >>>
> >>> Hubert
> >>>
> >>>
> >>>
> >>> _______________________________________________
> >>> Bioperl-l mailing list
> >>> Bioperl-l at lists.open-bio.org
> >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >>>
> >>>
> >>
> >>
> >>
> >>
> >
> 
> --
> Jason Stajich
> Duke University
> http://www.duke.edu/~jes12
> 

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign  



