[Bioperl-l] Re: [Bioperl-guts-l] Notification: incoming/1216 (fwd)

BHurwitz@twt.com BHurwitz@twt.com
Fri, 21 Jun 2002 08:36:13 -0500


Hi Jason,

Thank you so much for your reply.  I took a closer look at the output from
blastn and megablast and the only difference is that the order of three
lines are switched around at the top of the hit result.  The information is
the same but the order is reversed.  It seems to me that the output from
blastall, whether it be blast or megablast should be the same format.  I
just wrote to NCBI and asked them if they were planning on changing the
format of output from megablast to match blastn.  Depending on their reply,
I can write a new parser for megablast switching the order, but I hope they
can fix it on their end since it seems kind of silly to have slightly
different output format from the same program (blastall).  Below is an
example of the lines that are switched in the output.

MEGABLAST OUTPUT (top):

Database: /usr/local/bin/blast/db/chr22.fa
           1 sequences; 47,748,585 total letters

Searching.done
Query= 340113
         (45 letters)


BLASTN OUTPUT (top):

Query= 340113
         (45 letters)

Database: /usr/local/bin/blast/db/chr22.fa
           1 sequences; 47,748,585 total letters

Searching.done


PS.  I just started using bioperl and so far I love it!  One question, I
noticed that the "--n" option in blastall (for megablast) is not supported
in StandAloneBlast.pm, are you planning on supporting this?  Do you know
where I can request features to be added?

Thanks!
Bonnie Hurwitz



                                                                                                                                              
                    Jason Stajich                                                                                                             
                    <jason@cgt.mc        To:     Bioperl <bioperl-l@bioperl.org>                                                              
                    .duke.edu>           cc:     <BHurwitz@twt.com>                                                                           
                                         Subject:     [Bioperl-guts-l] Notification: incoming/1216 (fwd)                                      
                    06/20/2002                                                                                                                
                    09:19 PM                                                                                                                  
                                                                                                                                              
                                                                                                                                              




We have no megablast parser - active development effort for search result
parsers should be focused on new Bio::SearchIO plugins - I'm not sure if
the format is different enough to warrant a new module or if we can adapt
the Bio::SearchIO::blast one to read megablast too.

At any rate - there is currently no parser but we'll help you make it
bioperl friendly if you or someone else want to write it.

-jason

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

---------- Forwarded message ----------
Date: Wed, 19 Jun 2002 10:34:56 -0400
From: bioperl-bugs@bioperl.org
To: bioperl-guts-l@bioperl.org
Subject: [Bioperl-guts-l] Notification: incoming/1216

JitterBug notification

new message incoming/1216

Message summary for PR#1216
           From: BHurwitz@twt.com
           Subject: BPlite - parsing megablast data
           Date: Wed, 19 Jun 2002 10:40:25 -0500
           0 replies           0 followups

====> ORIGINAL MESSAGE FOLLOWS <====

>From BHurwitz@twt.com Wed Jun 19 10:34:55 2002
Received: from venus.twt.com (one-six-five.invadercreator.com
[208.137.93.165])
           by pw600a.bioperl.org (8.12.2/8.12.2) with ESMTP id
g5JEYsal018823
           for <bioperl-bugs@bio.perl.org>; Wed, 19 Jun 2002 10:34:55 -0400
Subject: BPlite - parsing megablast data
To: "bioperl-bugs@bio.perl.org" <bioperl-bugs@bio.perl.org>
X-Mailer: Lotus Notes Release 5.0.5  September 22, 2000
Message-ID: <OF1D94E9A5.79CE0D2C-ON86256BDD.00550E9D@twt.com>
From: BHurwitz@twt.com
Date: Wed, 19 Jun 2002 10:40:25 -0500
X-MIMETrack: Serialize by Router on TWTMSNM1/TWT(Release 5.0.8 |June 18,
2001) at 06/19/2002
 10:40:40 AM
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii

Hello,

I am not able to parse megablast data with the BPlite BLAST parser.  I
suspect that megablast output was not factored into the parser when it was
written since my parsing script works fine for regular blast data.  Will
megablast parsing capabilities be available in BPlite?

* error message when running BPlite on megablast output.

$ ./blast_parser.pl megablast.out megablast.parsed
-------------------- WARNING ---------------------
MSG: Possible error (1) while parsing BLAST report!
---------------------------------------------------
Use of uninitialized value in substitution (s///) at
/usr/lib/perl5/site_perl/5.6.1/Bio/Tools/BPlite.pm line 336, <GEN0> line
131.
Use of uninitialized value in substitution (s///) at
/usr/lib/perl5/site_perl/5.6.1/Bio/Tools/BPlite.pm line 337, <GEN0> line
131.
Use of uninitialized value in substitution (s///) at
/usr/lib/perl5/site_perl/5.6.1/Bio/Tools/BPlite.pm line 338, <GEN0> line
131.
Use of uninitialized value in pattern match (m//) at
/usr/lib/perl5/site_perl/5.6.1/Bio/Tools/BPlite.pm line 340, <GEN0> line
131.


* example code


#!/usr/bin/perl -w

use strict;
use Bio::Tools::BPlite;

###############################################
# blast_parser.pl
# written by B. Hurwitz
# 6/19/02
# This program parses blast output.
################################################

if (@ARGV != 2) { die "Usage: blast_parser.pl blast.in blast.parsed\n"; }
my $blast_in = $ARGV[0];
my $outfile = $ARGV[1];

open (OUT, ">$outfile") || die "Cannot open $outfile \n";

# main program
my $report = new Bio::Tools::BPlite(-file => $blast_in);
{
print OUT "> ", $report->query, "\t", $report->database, "\n";
   while (my $subject = $report->nextSbjct()) {
      while (my $hsp = $subject->nextHSP()) {
         print OUT join ("\t",
                     $hsp->score, #blast score
                     $hsp->bits, #number of bits in blast score
                     $hsp->percent, #percent identity
                     $hsp->P, #evalue
                     $hsp->length, #length of the hsp
                     $hsp->query->start, #starting bp of the query
                     $hsp->query->end, #ending bp of the query
                     $hsp->hit->start, #starting bp of the hsp
                     $hsp->hit->end, #ending bp of the hsp
                     $hsp->hit->seqname), "\n"; # name of the hsp
      } #end of while hsp
   } #end of while subject
# the following line takes you to the next report in the stream/file
# it will return 0 if that report is empty,
# but that is valid for an empty blast report.
# Returns -1 for EOF.
   last if ($report->_parseHeader == -1);
   redo;
}



_______________________________________________
Bioperl-guts-l mailing list
Bioperl-guts-l@bioperl.org
http://bioperl.org/mailman/listinfo/bioperl-guts-l