[Bioperl-l] parsing an html blast result file

Wes Barris wes.barris at csiro.au
Wed Jul 23 17:14:11 EDT 2003


Hi,

I have installed Bioperl et al. on a Sun (Solaris 8).  I would now like
to try parsing an HTML BLAST results file.  I saved example 4 from this
page into a file:

     http://www.bioperl.org/HOWTOs/html/Graphics-HOWTO.html

The only thing I changed in the script is the -format argument passed to
Bio::SearchIO->new, from this:

-format => 'blast') or die "parse failed";

to this:

-format => 'blastxml') or die "parse failed";

I am assuming that the format of an html blast result file is "blastxml",
but I could be wrong.  I could not find a list of valid formats that can
be used with the Bio::SearchIO->new constructor.
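
One rough way to get a list of candidates, as a sketch only: I am assuming
that each format name corresponds to a Bio::SearchIO::<name> module somewhere
on @INC.  The helper searchio_module_names() is my own hypothetical name, not
part of Bioperl, and the output will also include helper modules in that
directory that are not formats at all.

```perl
#!/usr/bin/perl
# Sketch: list the module names found in Bio/SearchIO/ on the library path.
# ASSUMPTION: a -format value maps to a Bio::SearchIO::<name> module.
# searchio_module_names() is a hypothetical helper, not a Bioperl function.
use strict;
use warnings;
use File::Spec;

# Return the sorted, de-duplicated base names of Bio/SearchIO/*.pm files
# found under the given library directories (defaults to @INC).
sub searchio_module_names {
    my @dirs = @_ ? @_ : @INC;
    my %seen;
    for my $dir (grep { !ref } @dirs) {    # skip code refs that may sit in @INC
        my $subdir = File::Spec->catdir($dir, 'Bio', 'SearchIO');
        next unless -d $subdir;
        opendir(my $dh, $subdir) or next;
        for my $entry (readdir $dh) {
            $seen{$1} = 1 if $entry =~ /^(\w+)\.pm$/;
        }
        closedir $dh;
    }
    return sort keys %seen;
}

# Print one candidate name per line (prints nothing if Bioperl is not found).
print "$_\n" for searchio_module_names();
```

Not every name printed this way is necessarily a usable format, so this
only narrows down what to try.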

When I run the example 4 script, I get this error:

wes at sequence> blasttoimg.pl junk.html >junk.png

-------------------- WARNING ---------------------
MSG: error in parsing a report:

not well-formed (invalid token) at line 9, column 34, byte 238 at 
/usr/local/lib/perl5/site_perl/5.6.1/sun4-solaris/XML/Parser.pm line 185

---------------------------------------------------
no result at /home/wes/proj/blast/blasttoimg.pl line 15, <GEN1> line 669.

Could anyone suggest what I might try to make this work?  The full script
follows.

#!/usr/local/bin/perl

# This is code example 4 in the Graphics-HOWTO
use strict;
#use lib "$ENV{HOME}/projects/bioperl-live";
use Bio::Graphics;
use Bio::SearchIO;
use Bio::SeqFeature::Generic;   # used directly below, so load it explicitly

my $file = shift or die "Usage: $0 <blast file>\n";

my $searchio = Bio::SearchIO->new(-file   => $file,
                                   -format => 'blastxml') or die "parse failed";


my $result = $searchio->next_result() or die "no result";

my $panel = Bio::Graphics::Panel->new(-length    => $result->query_length,
                                       -width     => 800,
                                       -pad_left  => 10,
                                       -pad_right => 10,
                                      );

my $full_length = Bio::SeqFeature::Generic->new(-start  => 1,
                                                -end    => $result->query_length,
                                                -seq_id => $result->query_name,
                                               );

$panel->add_track($full_length,
                   -glyph   => 'arrow',
                   -tick    => 2,
                   -fgcolor => 'black',
                   -double  => 1,
                   -label   => 1,
                  );

my $track = $panel->add_track(-glyph       => 'graded_segments',
                               -label       => 1,
                               -connector   => 'dashed',
                               -bgcolor     => 'blue',
                               -font2color  => 'red',
                               -sort_order  => 'high_score',
                               -description => sub {
                                 my $feature = shift;
                                 return unless $feature->has_tag('description');
                                 my ($description) = $feature->each_tag_value('description');
                                 my $score = $feature->score;
                                 "$description, score=$score";
                                });

while( my $hit = $result->next_hit ) {
   next unless $hit->significance < 1E-20;
   my $feature = Bio::SeqFeature::Generic->new(-score   => $hit->raw_score,
                                               -seq_id  => $hit->name,
                                               -tag     => {
                                                            description => $hit->description
                                                           },
                                              );
   while( my $hsp = $hit->next_hsp ) {
     $feature->add_sub_SeqFeature($hsp,'EXPAND');
   }

   $track->add_feature($feature);
}

print $panel->png;
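
Looking at the XML::Parser error again, my guess is that "blastxml" expects
real XML and is choking on the HTML markup.  One workaround I could try, as
a sketch only: strip the markup and parse what is left with -format =>
'blast'.  This assumes the HTML file is just the plain-text report wrapped
in tags.  strip_html_markup() is my own hypothetical helper, not a Bioperl
function, and the sample report text is made up.

```perl
#!/usr/bin/perl
# Sketch: turn an HTML-formatted BLAST report back into plain text.
# strip_html_markup() is a hypothetical helper, not part of Bioperl.
use strict;
use warnings;

# Remove comments and tags, then decode the few entities likely to appear.
sub strip_html_markup {
    my ($html) = @_;
    $html =~ s/<!--.*?-->//gs;    # drop HTML comments
    $html =~ s/<[^>]*>//g;        # drop tags such as <PRE>, <a ...>, </a>
    $html =~ s/&lt;/</g;
    $html =~ s/&gt;/>/g;
    $html =~ s/&quot;/"/g;
    $html =~ s/&nbsp;/ /g;
    $html =~ s/&amp;/&/g;         # decode last so "&amp;lt;" becomes "&lt;"
    return $html;
}

if (@ARGV) {
    # Filter mode: read the named file(s), write the stripped text to STDOUT.
    local $/;                     # slurp whole input at once
    my $html = <>;
    print strip_html_markup($html) if defined $html;
}
else {
    # Demo mode with a made-up fragment, only to show the effect.
    my $sample = <<'HTML';
<HTML><BODY><PRE>
<b>Query=</b> example_query
><a name = 1></a>example_hit some &amp; description
</PRE></BODY></HTML>
HTML
    print strip_html_markup($sample);
}
```

Saved as, say, striphtml.pl, I would run "striphtml.pl junk.html >junk.txt"
and then point the script above at junk.txt with the format set back to
'blast'.  I have not verified that the 'blast' parser accepts the stripped
text.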

-- 
Wes Barris
E-Mail: Wes.Barris at csiro.au


