[Bioperl-l] Parsing BLAST 2.2.14 output

Chris Fields cjfields at uiuc.edu
Thu Jun 15 21:06:13 UTC 2006


Bio::SearchIO can't handle HTML output directly; you have to strip the tags
first, and we can no longer guarantee that even that will work (I haven't
tried it).  The FAQ explains how:

http://www.bioperl.org/wiki/FAQ

I would avoid HTML parsing altogether.  According to NCBI, the only output
that is sure to keep working is XML, which Bio::SearchIO::blastxml can
parse.  You can also try tabular format, which Bio::SearchIO::blasttable
can parse.
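
To illustrate why tabular output is less fragile than the text report, here
is a minimal sketch of reading it by hand.  It is not BioPerl code and not
how Bio::SearchIO::blasttable is implemented; it is plain Python, standard
library only.  It assumes the usual 12-column, tab-separated layout
(query id, subject id, % identity, alignment length, mismatches, gap opens,
query start/end, subject start/end, e-value, bit score), with optional
comment lines starting with '#'.  The names TabularHit and
parse_blast_tabular are made up for this example.

```python
from typing import Iterable, Iterator, NamedTuple


class TabularHit(NamedTuple):
    """One row of 12-column tabular BLAST output (hypothetical helper type)."""
    query_id: str
    subject_id: str
    percent_identity: float
    alignment_length: int
    mismatches: int
    gap_opens: int
    query_start: int
    query_end: int
    subject_start: int
    subject_end: int
    evalue: float
    bit_score: float


def parse_blast_tabular(lines: Iterable[str]) -> Iterator[TabularHit]:
    """Yield one TabularHit per data line, skipping blanks and '#' comments."""
    for line_number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Assumption: exactly 12 columns; anything else is reported, not guessed at.
        if len(fields) != 12:
            raise ValueError(
                f"line {line_number}: expected 12 tab-separated fields, "
                f"got {len(fields)}"
            )
        yield TabularHit(
            fields[0],
            fields[1],
            float(fields[2]),
            *(int(value) for value in fields[3:10]),
            float(fields[10]),
            float(fields[11]),
        )
```

Because every hit is one line with a fixed number of columns, cosmetic
changes to the report layout between BLAST releases do not affect this kind
of parser, which is the main argument for tabular (or XML) over text or HTML.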

However, like Sendu, I can parse BLASTP 2.2.14 output (saved directly from
NCBI) using bioperl-live.  Bio::Tools::Run::RemoteBlast also seems to work
with BLASTP (and I believe that is still set up to parse text output using
SearchIO).  Could you give us an example of the type of BLAST you were
running, the sequence you used, and the error you got?  Program-specific
output may be causing the problem.  The last time text parsing broke, the
cause was a change specific to BLASTN/TBLASTX output, or something along
those lines.

Chris

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Susan J. Miller
> Sent: Thursday, June 15, 2006 12:43 PM
> To: bioperl-l at lists.open-bio.org
> Subject: [Bioperl-l] Parsing BLAST 2.2.14 output
> 
> We are unable to parse BLAST 2.2.14 results from the NCBI website using
> SearchIO.  I have updated Bio::SearchIO::blast.pm to what's in
> bioperl-live, but when users download either plain text or HTML blast
> outputs from the NCBI page, SearchIO cannot parse them.  This used to
> work prior to BLAST 2.2.14.  Should I try installing the entire
> bioperl-live distribution?  (We are running Solaris 8 and perl 5.8 if
> that makes any difference.)
> 
> Thanks,
> -susan
> 
> Susan J. Miller
> Biotechnology Computing Facility
> Arizona Research Laboratories
> Bio West 228
> University of Arizona
> Tucson, AZ  85721
> (520) 626-2597
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
