[Bioperl-l] Bio::SearchIO - Accessing Model parameters (score, evalue, description)

Chris Fields cjfields at uiuc.edu
Thu Jun 29 03:40:28 UTC 2006


According to CVS, using -A0 (no alignments) is supposed to work since v.
1.5.1 and (I'm guessing here) should return HMMERHit/HMMERHSP objects with
no sequences, just the values from the table.  By this reasoning using -A5
should work but the first five Hit/HSP pairs will give you sequences and any
remaining should give nothing, just the Sequence Model combined evalue
(which you can get by $model->significance) and individual Domain (HSP-like)
evalues ($domain->evalue).  I don't get these either (I only get a max of 5
model/domain pairs). 

So, I tried a little experiment using the first single result output for
this query from your combined file (nbd27e02.y1  716 69 831 ; translated),
which was the first one I came across with more than five model/domain
pairs, and this scripted loop:

while ( my $result = $searchio->next_result() ) {
  print "Query: ",$result->query_name,"\n";
  while (my $model = $result->next_model) {
	print "\tModel : ",$model->name,"\n";
	print "\tSignif: ",$model->significance,"\n";
	while (my $domain = $model->next_domain) {
	  print "\t\tEvalue : ",$domain->evalue,"\n";
	}
  }
}

I get this with the file containing the alignments.  For anyone following,
I'm using bioperl-live, perl 5.8, WinXP:

Query: nbd27e02.y1  716 69 831 ; translated
        Model : IBB
        Signif: 2.6e-43
                Evalue : 2.6e-43
        Model : HEAT
        Signif: 1.2e-11
                Evalue : 40
        Model : IBN_N
        Signif: 2.1
                Evalue : 2.1
        Model : Arm
        Signif: 6e-38
                Evalue : 3.5e-12
        Model : HEAT
        Signif: 1.2e-11
                Evalue : 0.0096

If I manually delete the alignments (make it like -A0 output) I get this:

Query: nbd27e02.y1  716 69 831 ; translated
        Model : IBB
        Signif: 157.3
                Evalue : 2.6e-43
        Model : HEAT
        Signif: 52.1
                Evalue : 40
        Model : IBN_N
        Signif: -3.6
                Evalue : 2.1
        Model : Arm
        Signif: 139.5
                Evalue : 3.5e-12
        Model : HEAT
        Signif: 52.1
                Evalue : 0.0096
        Model : Arm
        Signif: 139.5
                Evalue : 2.2e-13
        Model : HEAT
        Signif: 52.1
                Evalue : 0.0032
        Model : Arm
        Signif: 139.5
                Evalue : 0.00019

i.e. all the model/domain pairs!  So I think it's safe to say that this is a
bug; the last few don't get processed but should.  I'll drop a bug report
into Bugzilla along with the test files and script so it can be confirmed.
This shouldn't be too hard to fix but it make take a few days; I'm pretty
busy here until Saturday.
 
Chris

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Selvi Kadirvel
> Sent: Wednesday, June 28, 2006 3:12 PM
> To: bioperl-l at lists.open-bio.org
> Cc: Selvi Kadirvel
> Subject: Re: [Bioperl-l] Bio::SearchIO - Accessing Model parameters
> (score,evalue, description)
> 
> Sendu,
> 
> >
> > What do you mean by this? What is ??A? ?
> > Is this an option you're supplying to hmmpfam or a bioperl module?
> 
> I was referring to the '-A' option when running hmmpfam. So if I were
> to use  '-A 5', Section 3 will have only the top scoring (first) five
> HSPs.
> 
> >
> > So you can say:
> > $hit->name, " ", $hit->description, " ", $hit->significance, " ",
> > $hit->score;
> >
> > To get the information you want.
> > General information about the result can be had like so:
> > print $result->query_name, " ", $result->algorithm, " ",
> > $result->hmm_name, "\n";
> 
> I do use the same methods that you have suggested. Let me try to
> explain my problem in detail. Lets say I have a report that was
> generated using this "-A 5" option. I want to get the description,
> score, evalue of a model that *does not* have a domain in the top 5
> high scoring HSPs. This information *exists* in the report in Section
> 1 but neither $result->next_hit or $hit->next_hsp can see it.
> 
> Details of ALL domains  are available through:
> 
>      foreach $domain ($result->each_Domain)
>      {
>             $domain-> [ hmmname, hmmacc, start, end, hstart, hend,
> evalue ]
>      }
> 
> where $result is a Bio::Tools::HMMER::Results object. But this again
> represents information in Section 2. It gives us domain scores and
> evalues (and not model scores and evalues.)
> 
> I am working around this by finding the sum of scores (evalues) of
> all domains in a model. But there seems to be no work-around to
> retrieve the description. $domain->hmmacc contains only the first
> string of the description.
> 
> -Selvi
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l




More information about the Bioperl-l mailing list