[Bioperl-l] parsing only the summary part of a blast report

Ian Korf ikorf@sapiens.wustl.edu
Thu, 12 Oct 2000 13:11:26 -0500 (CDT)


I don't use HTML formatted blast reports, so I've never tried it. I'm not
sure what it would take to make it compatible since I've never seen HTML
formatted blast output (which I imagine could come in many flavors - it's
bad enough to have multiple blast versions). I'm also not sure what to do
about the two development branches. Part of the reason it's called BPlite
is that it is supposed to be lightweight, which bioperl isn't. So I'm a
little resistant to making the bioperl version the only version. So the
short answer is that I have no plans for a version that supports HTML,
sorry. As an alternative, you could write a script that strips the
specific HTML formatting and pipe that to BPlite.
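
A minimal sketch of such a filter, assuming the HTML report only wraps the
plain-text report in tags (links, <pre>, <b>, and so on) and uses a handful
of common entities. The script name, the strip_html helper, and the entity
list are hypothetical, not part of BPlite, and the tag regex is deliberately
crude: it removes anything between "<" and the next ">".

```perl
#!/usr/bin/perl
# strip_blast_html.pl (hypothetical name): crude HTML-to-text filter.
# Not part of BPlite; a sketch that assumes simple, well-formed markup.
use strict;
use warnings;

# Only a few common entities are decoded; extend this table as needed.
my %entity = (lt => '<', gt => '>', amp => '&', quot => '"', nbsp => ' ');

sub strip_html {
    my ($text) = @_;
    # Remove tags such as <a href=...>, </a>, <b>, <pre>.
    $text =~ s/<[^>]*>//g;
    # Decode entities in a single pass so "&amp;lt;" becomes "&lt;", not "<".
    $text =~ s/&(lt|gt|amp|quot|nbsp);/$entity{$1}/g;
    return $text;
}

# Run as a filter only when executed directly (not when loaded by other code).
unless (caller) {
    local $/;                 # slurp mode: a tag may span several lines
    my $html = <STDIN>;
    print strip_html($html) if defined $html;
}
```

Usage would look something like the following, where report.html and
parse_with_bplite.pl stand in for your own file and BPlite script:

    perl strip_blast_html.pl < report.html | perl parse_with_bplite.pl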

-Ian

On Thu, 12 Oct 2000, Zhao, David  [PRI] wrote:

> Does your version of BPlite parse HTML formatted blast report?
> 
> > -----Original Message-----
> > From:	Ian Korf [SMTP:ikorf@sapiens.wustl.edu]
> > Sent:	Wednesday, October 11, 2000 8:40 AM
> > To:	Bioperl
> > Subject:	Re: [Bioperl-l] parsing only the summary part of a blast
> > report
> > 
> > The latest version of BPlite also reads concatenated blast reports. So you
> > can do the following:
> > 
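> > use BPlite;   # the standalone BPlite module
> > 
> > # FILEHANDLE must already be open on the concatenated reports
> > my $multi = new BPlite::Multi(\*FILEHANDLE);
> > while (my $blast = $multi->nextBlast) {      # one report at a time
> >     while (my $sbjct = $blast->nextSbjct) {  # each hit in that report
> >         print "$sbjct\n";
> >     }
> > }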
> > 
> > Unfortunately, BPlite has two development branches, the bioperl one and my
> > own (sorry about that). But the new code should be trivial to migrate.
> > 
> > -Ian
> > 
> > On Wed, 11 Oct 2000, Jason Stajich wrote:
> > 
> > > See the perldoc for Bio::Tools::BPlite for the complete API.
> > > 
> > > This script will parse and print all of the hits in your report.
> > > 
> > > #!/usr/local/bin/perl -w
> > > 
> > > use Bio::Tools::BPlite;
> > > 
> > > my $report = new Bio::Tools::BPlite(-fh=>\*STDIN);
> > > print $report->query(), ",", $report->database(), "\n";
> > > while ( my $sbjct = $report->nextSbjct ) {
> > >     print "name is ", $sbjct->name(), "\n";
> > > }
> > > 
> > > Alternatively you can use Bio::Tools::Blast, but you will find it slower
> > > and more memory intensive.  It does support more features so it depends
> > > on what your needs are.
> > > 
> > > On Wed, 11 Oct 2000, Hilmar Lapp wrote:
> > > 
> > > > > "Zhao, David [PRI]" wrote:
> > > > > 
> > > > > Hi there,
> > > > > How can I parse the summary part of a blast report using bioperl
> > > > > modules? such as:
> > > > > 
> > > > 
> > > > I'm almost sure you can't using Blast.pm. Maybe you can with BPlite
> > > > (development trunk only). I know that Blast.pm takes a significant
> > > > time to parse long reports (i.e., with hundreds of alignments), but
> > > > we haven't checked yet whether BPlite is significantly faster in such
> > > > cases. I guess you're asking because you're concerned about the time
> > > > lost in parsing the alignments, although you need only the summary
> > > > data which are already present in the hit list.
> > > > 
> > > > BTW you can pass a significance threshold to Blast.pm, and although
> > > > I'm not sure, I think it won't parse those alignments beyond the
> > > > significance threshold.
> > > 
> > > [Regarding Bio::Tools::Blast]
> > > 
> > > I'm pretty sure the significance threshold only applies when you are
> > > running Blast, not when parsing a report.  In fact you should not put a
> > > signif=>$value in your parameter hash if you are just parsing a report
> > > file.
> > > 
> > > > 
> > > > 	Hilmar
> > > > -- 
> > > > -----------------------------------------------------------------
> > > > Hilmar Lapp                                email: hlapp@gmx.net
> > > > NFI Vienna, IFD/Bioinformatics             phone: +43 1 86634 631
> > > > A-1235 Vienna                                fax: +43 1 86634 727
> > > > -----------------------------------------------------------------
> > > > _______________________________________________
> > > > Bioperl-l mailing list
> > > > Bioperl-l@bioperl.org
> > > > http://bioperl.org/mailman/listinfo/bioperl-l
> > > > 
> > > 
> > > Jason Stajich
> > > jason@chg.mc.duke.edu
> > > http://galton.mc.duke.edu/~jason/
> > > (919)684-1806 (office) 
> > > (919)684-2275 (fax) 
> > > Center for Human Genetics - Duke University Medical Center
> > > http://wwwchg.mc.duke.edu/ 
> > > 
>