[Bioperl-l] New Blast parser

Chris Fields cjfields at uiuc.edu
Fri May 18 13:39:05 UTC 2007


I'll be looking at cleaning up SearchIO::blastxml soon myself. It
needs to be more memory-friendly with large XML files, and PSI-BLAST
iterations need to be addressed (nope, I haven't forgotten about that!).

There is an XML::LibXML pull parser interface (XML::LibXML::Reader) we
could look into...
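
Something along these lines, as a rough sketch only and not tested against
real reports. It assumes the usual NCBI BLAST XML element names (Hit, Hit_id,
Hit_def, Hit_len); each_blast_hit is a made-up helper name, not anything in
BioPerl. The point is that only one <Hit> subtree is in memory at a time:

```perl
use strict;
use warnings;
use XML::LibXML::Reader;

# Hypothetical helper, not part of BioPerl: call $callback once per <Hit>,
# keeping only that hit's subtree in memory at any time.
sub each_blast_hit {
    my ($file, $callback) = @_;

    my $reader = XML::LibXML::Reader->new(
        location     => $file,
        load_ext_dtd => 0,    # do not try to load NCBI_BlastOutput.dtd
    ) or die "Cannot open $file\n";

    # nextElement returns 1 when a matching element is found,
    # 0 when there are no more, and -1 on error.
    while ($reader->nextElement('Hit') == 1) {
        my $hit = $reader->copyCurrentNode(1);    # deep copy of this <Hit> only
        $callback->({
            id  => $hit->findvalue('./Hit_id'),
            def => $hit->findvalue('./Hit_def'),
            len => $hit->findvalue('./Hit_len'),
        });
    }
    return;
}
```

Usage would be something like
each_blast_hit("blast.xml", sub { print $_[0]{id}, "\n" });
where blast.xml is a placeholder file name. PSI-BLAST iterations could
presumably be handled the same way by stopping at Iteration elements instead.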

chris

On May 18, 2007, at 3:13 AM, Torsten Seemann wrote:

> Sendu,
>
>> Back in August of last year I introduced Bio::PullParserI, a module
>> that aids in the creation of fast SearchIO and Search modules. I've
>> finally gotten around to implementing a Blast parser using the
>> interface, which I've called Bio::SearchIO::blast_pull.
>> my $sio = Bio::SearchIO->new(-format => "blast_pull", -file => "file");
>> Please try it out and report any bugs you discover.
>
> This is very cool!
> Here's hoping NCBI don't change the default output format too much.
>
> You should be able to add "rpsblast -p T" support, as its output is
> identical to that of "blastall -p blastp" except for the first line:
> BLASTP 2.2.16 [Mar-25-2007]
> RPS-BLAST 2.2.16 [Mar-25-2007]
>
> The only problem is the (rarely used) "rpsblast -p F" mode, which
> looks and behaves like "blastall -p tblastn", i.e. its hit summaries
> include a "Frame" line:
>
>  Score = 29.6 bits (65), Expect = 0.26
>  Identities = 10/26 (38%), Positives = 12/26 (46%)
>  Frame = -1
>
> BUT has the same header line, so you can't know -p F was used until
> you see a "Frame = ??" in a hit (what were they thinking???).
>
> TBLASTN 2.2.16 [Mar-25-2007]
> RPS-BLAST 2.2.16 [Mar-25-2007]    # should be RPS-TBLASTN perhaps...
>
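
For what it's worth, the check described above could be sketched roughly as
follows. This is only an illustration working on a whole report held in a
string; guess_blast_program is a made-up name, and the labels it returns are
invented for the example. A streaming parser would have to defer the decision
until it reaches the first hit block:

```perl
use strict;
use warnings;

# Hypothetical helper, not part of BioPerl: guess which program produced a
# plain-text BLAST report. The header line alone cannot tell the two
# RPS-BLAST modes apart, so for RPS-BLAST we also look for a "Frame = N"
# line in the hit summaries.
sub guess_blast_program {
    my ($report) = @_;

    # Header looks like "BLASTP 2.2.16 [Mar-25-2007]".
    my ($header) = $report =~ /^\s*([A-Z-]+)\s+\d+\.\d+\.\d+/m;
    return unless defined $header;
    return $header unless $header eq 'RPS-BLAST';

    # Only the translated (-p F) mode reports a frame for each hit.
    return $report =~ /^\s*Frame\s*=\s*[+-]?\d/m
        ? 'RPS-BLAST (-p F)'
        : 'RPS-BLAST (-p T)';
}
```

Note that a -p F report with no hits at all would still be labelled -p T by
this check, since there is no "Frame" line to find.
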
> Thanks for the good work. Shame I converted most of our systems to
> blastxml :-(
>
> --Torsten

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign
