[Bioperl-l] A problem parsing BLASTX 2.2.8 reports

Jason Stajich jason at cgt.duhs.duke.edu
Tue Mar 30 12:25:37 EST 2004


Matt - I'm having a little trouble understanding the problem - why aren't
the values you are reporting what you expect?

With BLASTX the query will have a frame of (0,1,2) in GFF/bioperl, not
(1,2,3). So in your example, where the report says 'Frame = +2', the query
will have frame [1] and strand [1], since the sign of the frame is '+'.  The
hit will have no strand [0] since it is protein.  Isn't that what you got?
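
As a rough sketch of that conversion (the helper name blast_frame_to_gff is
made up for illustration and is not part of bioperl; it only assumes the
simple mapping described above, where the sign becomes the strand and the
frame number is shifted to start at 0):

```perl
use strict;
use warnings;

# Hypothetical helper, not a bioperl function: turn the signed frame from a
# BLAST report line (e.g. "Frame = +2") into a 0-based frame and a strand.
sub blast_frame_to_gff {
    my ($blast_frame) = @_;

    # Accept an optional sign followed by 1, 2 or 3.
    my ($sign, $n) = $blast_frame =~ /^\s*([+-]?)([123])\s*$/
        or die "unexpected frame '$blast_frame'\n";

    my $strand = $sign eq '-' ? -1 : 1;   # sign of the frame gives the strand
    return ($n - 1, $strand);             # 1,2,3 in the report -> 0,1,2
}

my ($frame, $strand) = blast_frame_to_gff('+2');
print "frame=$frame strand=$strand\n";    # prints "frame=1 strand=1"
```

So 'Frame = +2' in the report lines up with frame 1 and strand 1 on the query.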

$hit->frame is also going to return something different depending on what
type of search you did.  For TBLASTN or BLASTX it will return the frame of
whichever sequence is translated (hit or query, respectively) and it will
return an array for TBLASTX.

Also, the Hit object will try and make a summary value for all the HSPs -
this will be the frame for all the HSPs (if they share the same frame
throughout) or just the frame of the first HSP if they differ.

I don't really like to rely on this personally.  Rather I would call it
explicitly for the HSP:

$hsp->query->frame or $hsp->hit->frame
 or
$hsp->frame('query'), $hsp->frame('hit')
 or
my ($qframe,$hframe) = $hsp->frame;
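
Put together, a minimal sketch of walking a report and asking each HSP
directly might look like the following. The sub name print_hsp_frames and
the way the file name is passed on the command line are just assumptions for
the example; the Bio::SearchIO calls are the usual next_result / next_hit /
next_hsp loop.

```perl
use strict;
use warnings;
use Bio::SearchIO;

# Hypothetical wrapper: print frame and strand for every HSP in a report.
sub print_hsp_frames {
    my ($file) = @_;
    my $in = Bio::SearchIO->new(-format => 'blast', -file => $file);

    while (my $result = $in->next_result) {
        while (my $hit = $result->next_hit) {
            while (my $hsp = $hit->next_hsp) {
                # Ask the HSP itself rather than relying on the Hit summary.
                printf "%s\tquery frame=%s strand=%s\thit strand=%s\n",
                    $hit->name,
                    $hsp->query->frame,
                    $hsp->query->strand,
                    $hsp->hit->strand;
            }
        }
    }
}

# Only runs when a report file is given on the command line.
print_hsp_frames($ARGV[0]) if @ARGV;
```

For a BLASTX report I would expect the query frame to be 0, 1 or 2 and the
hit strand to be 0, as described above.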

All in all it is hard to say without a copy of the report(s) and your
code.

-jason

On Tue, 30 Mar 2004, Matthew Links wrote:

> I have run into a problem parsing BLASTX reports (version 2.2.8). When I
> ask for strand and frame on the Bio::Search::Hit::HitI object I am
> getting back the wrong answer.
>
> I think this has to do with a slight formatting change in the BLAST
> output.
>
> --- BLASTX 2.2.5 ---
>
> >gi|19879878|gb|AAM00191.1| guanine nucleotide-exchange protein GEP2
>             [Oryza sativa]
>           Length = 1789
>
>  Score =  320 bits (819), Expect(2) = 7e-98
>  Identities = 160/193 (82%), Positives = 173/193 (89%)
>  Frame = +2
>
> --- BLASTX 2.2.8 ---
>
> >gi|38346787|emb|CAE02205.2| OSJNBa0095H06.12 [Oryza sativa (japonica
>             cultivar-group)]
>           Length = 1724
>
>  Score =  102 bits (254), Expect = 5e-21
>  Identities = 62/204 (30%), Positives = 103/204 (50%), Gaps = 30/204
> (14%)
>  Frame = +2
>
> In my debugging it looks like everything is ok except for the
> strand/frame data. Which when parsing 2.2.8 gets
>
> hit->frame = 1
> hit->strand('query') = 1
> hit->strand('hit') = 0
>
> Has anyone seen this problem before?
>
> Thanks in advance,
>
> Matt
>
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

