[Bioperl-l] New hmmpfam parser

Chris Fields cjfields at uiuc.edu
Mon Aug 21 00:25:32 UTC 2006


Sendu,

Could you post the example file you used for testing somewhere?

Chris

On Aug 20, 2006, at 4:56 PM, Sendu Bala wrote:

> I've added a new hmmpfam parser to bioperl-live.
>
> You access it with Bio::SearchIO->new(-format => 'hmmer_pull'). It uses
> the new Bio::PullParserI discussed in the thread 'SearchIO speedup'.
>
> The major differences between it and the existing SearchIO plugin for
> hmmpfam reports (hmmer.pm) are speed, memory usage and how it deals
> with hits and hsps. hmmer.pm breaks the Bio::Search::HitI API by having
> hit (model) name()s that are not unique within a ResultI. It also only
> ever has one domain per model. hmmer_pull.pm has unique model names and
> as many domains per model as there are in the file being parsed.
> hmmer_pull.pm also gives back more correct answers when you try to use
> the full variety of HitI, GenericHit, HSPI and GenericHSP methods.
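
A minimal sketch of the access pattern described above, assuming a BioPerl
checkout that includes the hmmer_pull module. The subroutine name
summarize_report and the choice of columns are illustrative only and are not
part of the announcement; evalue() is assumed to behave on these HSP objects
as it does for other SearchIO HSPs.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::SearchIO;

# Walk every result, model (hit) and domain (HSP) in an hmmpfam report
# using the pull parser. summarize_report is a hypothetical helper name.
sub summarize_report {
    my ($file) = @_;

    my $searchio = Bio::SearchIO->new(
        -format => 'hmmer_pull',
        -file   => $file,
    );

    my @rows;
    while ( my $result = $searchio->next_result ) {
        while ( my $hit = $result->next_hit ) {
            # One hit per model; each HSP is one domain of that model.
            while ( my $hsp = $hit->next_hsp ) {
                push @rows,
                  [ $result->query_name, $hit->name, $hsp->evalue ];
            }
        }
    }
    return \@rows;
}

# Only run when invoked directly with a report path as the first argument.
if ( !caller && @ARGV ) {
    printf "%s\t%s\t%s\n", @$_ for @{ summarize_report( $ARGV[0] ) };
}
```

Run it as `perl script.pl report.hmmpfam` to print one tab-separated line per
domain: query name, model name and domain E-value.
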
>
>
> Speed was tested on one example hmmpfam report of 441 KB, comparing
> hmmer.pm and hmmer_pull.pm (memory usage was always ~1.8x lower with
> hmmer_pull.pm):
>
> # for the result for query sequence 'test5' (5th result of 10 in my
> # test dataset), just get the most significant domain of the most
> # significant model:
> # while ($result = $searchio->next_result) {
> #   if ($result->query_name eq 'test5') {
> #     $result->sort_hits(sub{#sort by significance});
> #     $hit = $result->next_hit;
> #     $hsp = $hit->hsp('best');
> #     last;
> #   }
> # }
> 23.5x faster
>
> # while ($result = $searchio->next_result) { } # do nothing
> 38x faster
>
> # while ($result = $searchio->next_result) {
> #   while ($hit = $result->next_hit) {
> #     while ($hsp = $hit->next_hsp) { } # do nothing
> #   }
> # }
> 5.3x faster
>
> # while ($result = $searchio->next_result) {
> #   while ($hit = $result->next_hit) {
> #     while ($hsp = $hit->next_hsp) {
> #       $fi = $hsp->frac_identical('query');
> #     }
> #   }
> # }
> (note that hmmer.pm returns the wrong answer for $fi: 0)
> 2.2x faster

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign