[Bioperl-l] seq_inds method question for blast parsing...

Sajeev Batra US-OPERON-Alameda batra@OPERON.com
Sat, 19 Oct 2002 15:25:40 -0700


Hi Bioperl users,

I've been using SearchIO for both NCBI and WashU BLASTN parsing as follows:

my $in     = Bio::SearchIO->new( -format => 'psiblast' );
my $writer = MyBlastWriter->new();
my $out    = Bio::SearchIO->new( -format => 'psiblast',
				 -writer => $writer );
while ( my $result = $in->next_result() ) {
    #printf STDERR "Report %d: $result\n", $in->report_count;
    $out->write_result($result);
}


Here is how I call the seq_inds method in my to_string subroutine:
my @seq_inds_array = $hit->seq_inds('query','identical',1);
my @test_inds_array = $hit->seq_inds('subject','identical',1);

The number of identities reported back is correct when the alignment has
no gaps in it. :-)
However, when the alignment has at least one gap in it (my example has two),
the number of identities reported back is incorrect, and some of the
identity positions are off as well. Usually the number of identities
reported back is greater than the actual number.
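
For comparison, the expected identity positions can be worked out directly
from the two aligned strings. The following is only a minimal sketch and
does not use BioPerl: identical_positions is a hypothetical helper, the two
sequences are made up for illustration, and it assumes both strands run in
the forward direction with '-' as the gap character.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): walk two aligned strings column
# by column and return the residue positions, in each sequence's own ungapped
# coordinates, at which the two sequences are identical.
# Assumes forward-strand coordinates on both sequences and '-' for gaps.
sub identical_positions {
    my ($query_aln, $sbjct_aln, $query_start, $sbjct_start) = @_;
    die "aligned strings differ in length\n"
        unless length($query_aln) == length($sbjct_aln);

    my ($qpos, $spos) = ($query_start - 1, $sbjct_start - 1);
    my (@query_inds, @sbjct_inds);

    for my $col (0 .. length($query_aln) - 1) {
        my $q = substr($query_aln, $col, 1);
        my $s = substr($sbjct_aln, $col, 1);

        # A gap uses up an alignment column but no residue, so only the
        # sequence that actually has a residue here advances its position.
        $qpos++ if $q ne '-';
        $spos++ if $s ne '-';

        next if $q eq '-' or $s eq '-';
        if (uc($q) eq uc($s)) {
            push @query_inds, $qpos;
            push @sbjct_inds, $spos;
        }
    }
    return (\@query_inds, \@sbjct_inds);
}

# Made-up alignment with two gaps, one in each sequence.
my ($q_inds, $s_inds) =
    identical_positions('ACGT-ACGTAC', 'ACGTTAC-TAC', 1, 1);
printf "identities: %d\n", scalar @$q_inds;
print  "query:   @$q_inds\n";
print  "subject: @$s_inds\n";
```

For this made-up pair the sketch reports 9 identities, at query positions
1 2 3 4 5 6 8 9 10 and subject positions 1 2 3 4 6 7 8 9 10. After each gap
the two coordinate systems shift relative to each other, so a count taken
per alignment column without this adjustment would give different positions.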

I'm wondering whether anyone else has noticed this, or whether there is
another technique for accounting for gaps in a BLAST alignment?

Your feedback would be appreciated!

Thanks,
Sajeev.