[Bioperl-l] Blast stuff

Jason Stajich jason at cgt.mc.duke.edu
Sun Jan 26 16:24:01 EST 2003


On Sun, 26 Jan 2003, Benjamin Berman wrote:

> Dear Jason, Steve, others,
>
> I am trying to parse a wublast-2 output file using either the 'blast' or
> 'psiblast' SearchIO modules, and getting results I don't understand.  I am
> attaching both the blast file (test.bla) and the short test script
> (for_jason.pl) to this mail.  Here are the results when I run the test script:
>
> benb at sos: ./for_jason.pl ./test.bla blast
> Hit to CG13158 ==>  E=0.00e+00  query=(96.00% or 379 AAs of "CG13158-PA" at
> 100.00% iden)  subject=(28.00% or 379 AAs
> of "CG13158" at 100.00% iden)
>          Query inds ==> 1-396
>          Hit inds ==> 1-259,278-345,364-415,435.333333333333-452.333333333333
> benb at sos:
> benb at sos: ./for_jason.pl ./test.bla psiblast
> Hit to CG13158 ==>  E=0.00e+00  query=(96.00% or 379 AAs of "CG13158-PA" at
> 100.00% iden)  subject=(84.00% or 379 AAs
> of "CG13158" at 100.00% iden)
>          Query inds ==> 1-396
>          Hit inds ==> 1-259,278-345,364-415,435.333333333333-452.333333333333
>
> When I look at the blast output, it is clear to me that 396 AAs are aligned
> at 100% identity.  This is consistent with the output of seq_inds.  But the
> output of length_aln and frac_aligned seem to be wrong (both parsers put
> this at 379, missing the last HSP of 18 AAs).

> Additionally, the psiblast
> one says that this is 84% of the subject entry, while it should actually be
> 28% as 'blast' reports.

I think this depends on whether or not you mean frac_identity of the
actual query sequence (genomic DNA for the gene in this case?) in which
case the % identity should be 28% - if you mean frac_identical for the
translated sequence (length/3) then it is 84%.  You are expecting 28%
though?

> I am using an up-to-date version of the bioperl-live repository.
>
> Am I doing something wrong?

No - I think you are hitting less well used code - I'm going through it
now.  It looks like there is a bug in Bio::Search::SearchUtils::tile_hsps
I'll see if I can locate the problem - basically the calculated
effective query_length that is participating in the alignment for the 3
HSPs was getting calculated at 379 when it should be 396.


>
> thanks,
> ben.
>
>
>
>
> At 12:20 PM 12/2/2002 -0800, Steve Chervitz wrote:
> >--- Ewan Birney <birney at ebi.ac.uk> wrote:
> > >
> > > [snip]
> > >
> > > Bio::Tools::Blast.pm
> > >
> > >   I want to kill this - it has lots of confusing documentation; it is big;
> > > I think Bio::SearchIO is the way forward and is stable enough to take it
> > > on; I will bug fix Bio::SearchIO on spec for the 1.2 series (phew!). So...
> > > can we remove it and put in a warning in a BEGIN block saying - it has
> > > died, please try out Bio::SearchIO.
> > >
> > >
> > >   SteveC and Jason - what are your votes here.
> >
> >R.I.P. ;-)
> >
> >All of it's functionality has been taken over by other modules for some time
> >now, so I feel it's ready to go away for 1.2. The Bio/Tools/Blast directory
> >goes away as well.
> >
> >The warning message in the BEGIN block can be a bit more detailed, something
> >like a table such as:
> >
> >   Functionality         Module to Use
> >   --------------        --------------
> >   BLAST parsing         Bio::SearchIO
> >   BLAST running         Bio::Tools::Run::StandAloneBlast or
> >                         Bio::Tools::Run::RemoteBlast
> >   BLAST HTML formating  Bio::SearchIO::Writer::HTMLResultWriter
> >
> >TODO: Transfer the non-BLAST-specific docs from the module-level POD in
> >Bio::Tools::Blast over to Bio::SearchIO, and update it for the SearchIO
> >system.
> >This would include things like how to use the SearchIO::Writer modules. This
> >would be task for Jason or I.
> >
> >Lately I've been working on merging my psiblast.pm SearchIO module into
> >Jason's
> >blast.pm so that we can get down to just a single Blast parsing system.
> >
> >Steve
> >
> >
> >=====
> >Steve Chervitz
> >sac at bioperl.org
> >
> >__________________________________________________
> >Do you Yahoo!?
> >Yahoo! Mail Plus - Powerful. Affordable. Sign up now.
> >http://mailplus.yahoo.com
> >_______________________________________________
> >Bioperl-l mailing list
> >Bioperl-l at bioperl.org
> >http://bioperl.org/mailman/listinfo/bioperl-l
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu


More information about the Bioperl-l mailing list