[Bioperl-l] Can't parse blast report written by Bio::SearchIO::Writer::TextResultWriter

Prachi Shah prachi at stanford.edu
Mon May 12 23:26:41 UTC 2008


Hi Jason,

The negative coordinates in the HSP show up whenever I generate a Text
report, whether or not I sort the HSPs and regardless of sort order. I
think it has something to do with the frame. In the example I gave, the
query sequence matches the subject sequence on the negative strand. My
guess is that TextResultWriter somehow takes the strand into account
and tries to recalculate the start and stop locations.
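
As a rough arithmetic check of that guess (plain Python, not BioPerl
code): the HSP is 2345 columns long and the BLAST report starts the
query at 2364. If the query has no gaps in this HSP (an assumption on my
part), its low coordinate would be 2364 - 2345 + 1 = 20, which is exactly
where the TextResultWriter output starts counting down. The helper name
line_labels is made up for illustration.

```python
def line_labels(first, n_lines, width=60, step=-1):
    """Return the (start, end) labels of each wrapped alignment line.

    first: coordinate printed at the start of the first line
    width: residues per line (60 in both reports)
    step:  -1 because the query is on the minus strand
    """
    labels = []
    pos = first
    for _ in range(n_lines):
        end = pos + step * (width - 1)
        labels.append((pos, end))
        pos = end + step
    return labels


HSP_LENGTH = 2345   # from "Identities = 2120/2345"
QUERY_HIGH = 2364   # first query label in the BLAST report
# ASSUMPTION: no gaps in the query, so the query span equals HSP_LENGTH.
QUERY_LOW = QUERY_HIGH - HSP_LENGTH + 1

blast_labels = line_labels(QUERY_HIGH, 3)   # counting down from the high end
writer_labels = line_labels(QUERY_LOW, 3)   # counting down from the low end

print(QUERY_LOW)
print(blast_labels)
print(writer_labels)
```

Counting down from 2364 reproduces the labels in the BLAST report, and
counting down from 20 reproduces the labels TextResultWriter prints. If
that holds, the writer may be starting a minus-strand query at the HSP's
low coordinate instead of its high one; I have not confirmed this in the
TextResultWriter source.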

Thanks,
Prachi

On Mon, May 12, 2008 at 4:21 PM, Jason Stajich <jason at bioperl.org> wrote:
> that's a very strange bug - I don't quite understand where it is coming
> from.  If you don't mess with the HSP order, and just start with a report and
> generate the Text report output, does it also give the negative coordinates,
> or are you still reconstituting the Hit/HSP objects "manually" in your code?
>
>  -jason
>
>
>  On May 12, 2008, at 4:17 PM, Prachi Shah wrote:
>
>
> > Thanks Jason for adding the sort_hsps method in
> > Bio::Search::Hit::GenericHit. I tested it out and it works great.
> >
> > The other issue I have is the format of HSP start and stop coordinates
> > when I write a new blast report (with HSPs sorted) using
> > Bio::SearchIO::Writer::TextResultWriter. Below is an example of the
> > same HSP alignment as output by BLAST and then as regenerated by
> > TextResultWriter. Notice the change in the query start and stop
> > coordinates. I would like to keep the start and stop format as in the
> > first case. How do I specify that? Any pointers are greatly
> > appreciated.
> >
> > Thanks,
> > Prachi
> >
> >
> > ----------------------------------------------------------------------------------------------------
> > **HSP alignment in blast report generated by BLAST itself:
> >
> >  Score = 10150 (1529.0 bits), Expect = 0., Sum P(3) = 0.
> >  Identities = 2120/2345 (90%), Positives = 2120/2345 (90%), Strand = Minus / Plus
> >
> > Query:    2364 CATATCCAGATCTATCTTGATGATTCTTATTAGAATATGTATCTGAAGATGTGCCACTTG 2305
> >                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> > Sbjct: 2251160 CATATCCAGATCTATCTTGATGATTCTTATTAGAATATGTATCTGAAGATGTGCCACTTG 2251219
> >
> > Query:    2304 TTGGAGGTGGTGGAGCTCTTCTAGCAGGAATAAGTTCAGATTTATTCATCAAATTATTCA 2245
> >                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> > Sbjct: 2251220 TTGGAGGTGGTGGAGCTCTTCTAGCAGGAATAAGTTCAGATTTATTCATCAAATTATTCA 2251279
> >
> > Query:    2244 ATGGTGAAACGTTTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNC 2185
> >                ||||||||||||||                                             |
> > Sbjct: 2251280 ATGGTGAAACGTTTTTAGTATTATTATTGTTAGTGCTGTTGTTATTATTATTATTATTAC 2251339
> >
> > Query:    2184 CAGAACTAGGTAATGAGCCTGATGATGATGTATGTTGGTGGGAAGAGCCATTTAGTTGTG 2125
> >                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> > Sbjct: 2251340 CAGAACTAGGTAATGAGCCTGATGATGATGTATGTTGGTGGGAAGAGCCATTTAGTTGTG 2251399
> >
> > Query:    2124 TCAAATGATATGGAGTTGGTGGTTTTGGTGCAGCTCGACTAGGTTTGAATTGTGAGACAG 2065
> >                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> > Sbjct: 2251400 TCAAATGATATGGAGTTGGTGGTTTTGGTGCAGCTCGACTAGGTTTGAATTGTGAGACAG 2251459
> >
> > Query:    2064 TAGATTTTGCTGGAGGTTTTACCCATTCTTGTAAATTTGCCTCTTGGACATTGTTTTTGG 2005
> >                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> > Sbjct: 2251460 TAGATTTTGCTGGAGGTTTTACCCATTCTTGTAAATTTGCCTCTTGGACATTGTTTTTGG 2251519
> >
> > Query:    2004 CTGATGAGTAATTGTTAGGGTCATTATTATTATTGTTGGTTTTGGAATTGATCATGGGTG 1945
> >                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> > Sbjct: 2251520 CTGATGAGTAATTGTTAGGGTCATTATTATTATTGTTGGTTTTGGAATTGATCATGGGTG 2251579
> >
> > Query:    1944 ATCCAATTGGAGTTCCAGCAGCAGAATTACCTCCATTTATATCGGAATAAAATTCTAAAA 1885
> >                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> > Sbjct: 2251580 ATCCAATTGGAGTTCCAGCAGCAGAATTACCTCCATTTATATCGGAATAAAATTCTAAAA 2251639
> >
> > Query:    1884 CTTTAATAACAGCAACAGGATCTTTTTTCCAATCCTCATTAGTGATTTTCGAATGTTGTA 1825
> >                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> > Sbjct: 2251640 CTTTAATAACAGCAACAGGATCTTTTTTCCAATCCTCATTAGTGATTTTCGAATGTTGTA 2251699
> >
> >
> >
> > ----------------------------------------------------------------------------------------------------
> > ** HSP alignment written by TextResultWriter:
> >
> >  Score = 1529.0 bits (10150), Expect = 0., P = 0.
> >  Identities = 2120/2345 (90%)
> >  Frame =  -1 / +1
> >
> > Query: 20      CATATCCAGATCTATCTTGATGATTCTTATTAGAATATGTATCTGAAGATGTGCCACTTG -39
> >                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> > Sbjct: 2251160 CATATCCAGATCTATCTTGATGATTCTTATTAGAATATGTATCTGAAGATGTGCCACTTG 2251219
> >
> > Query: -40     TTGGAGGTGGTGGAGCTCTTCTAGCAGGAATAAGTTCAGATTTATTCATCAAATTATTCA -99
> >                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> > Sbjct: 2251220 TTGGAGGTGGTGGAGCTCTTCTAGCAGGAATAAGTTCAGATTTATTCATCAAATTATTCA 2251279
> >
> > Query: -100    ATGGTGAAACGTTTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNC -159
> >                ||||||||||||||                                             |
> > Sbjct: 2251280 ATGGTGAAACGTTTTTAGTATTATTATTGTTAGTGCTGTTGTTATTATTATTATTATTAC 2251339
> >
> > Query: -160    CAGAACTAGGTAATGAGCCTGATGATGATGTATGTTGGTGGGAAGAGCCATTTAGTTGTG -219
> >                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> > Sbjct: 2251340 CAGAACTAGGTAATGAGCCTGATGATGATGTATGTTGGTGGGAAGAGCCATTTAGTTGTG 2251399
> >
> > Query: -220    TCAAATGATATGGAGTTGGTGGTTTTGGTGCAGCTCGACTAGGTTTGAATTGTGAGACAG -279
> >                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> > Sbjct: 2251400 TCAAATGATATGGAGTTGGTGGTTTTGGTGCAGCTCGACTAGGTTTGAATTGTGAGACAG 2251459
> >
> > Query: -280    TAGATTTTGCTGGAGGTTTTACCCATTCTTGTAAATTTGCCTCTTGGACATTGTTTTTGG -339
> >                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> > Sbjct: 2251460 TAGATTTTGCTGGAGGTTTTACCCATTCTTGTAAATTTGCCTCTTGGACATTGTTTTTGG 2251519
> >
> > Query: -340    CTGATGAGTAATTGTTAGGGTCATTATTATTATTGTTGGTTTTGGAATTGATCATGGGTG -399
> >                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> > Sbjct: 2251520 CTGATGAGTAATTGTTAGGGTCATTATTATTATTGTTGGTTTTGGAATTGATCATGGGTG 2251579
> >
> > Query: -400    ATCCAATTGGAGTTCCAGCAGCAGAATTACCTCCATTTATATCGGAATAAAATTCTAAAA -459
> >                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> > Sbjct: 2251580 ATCCAATTGGAGTTCCAGCAGCAGAATTACCTCCATTTATATCGGAATAAAATTCTAAAA 2251639
> >
> > Query: -460    CTTTAATAACAGCAACAGGATCTTTTTTCCAATCCTCATTAGTGATTTTCGAATGTTGTA -519
> >                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> > Sbjct: 2251640 CTTTAATAACAGCAACAGGATCTTTTTTCCAATCCTCATTAGTGATTTTCGAATGTTGTA 2251699
> >
>
>


