[Bioperl-l] hmmer3/hmmscan parser

Kai Blin kai.blin at biotech.uni-tuebingen.de
Thu May 27 14:50:40 UTC 2010


On Wed, 2010-05-26 at 08:25 -0700, Thomas Sharpton wrote:

> > Having not considered it too much, I'm not sure how to accomplish  
> > this without breaking the SearchIO idiom. But presumably a way could  
> > be found.
> >
> 
> I'll see if I can't hit the drawing board and come up with a naming
> scheme for additional H3 methods that retrieve some of the extra data
> encoded in the new reports. It *probably* makes most sense, at least
> from the standpoint of the user's perspective, to adopt the
> full-length report values as the standard hit->significance and
> hit->raw_score while having something like hit->best_significance and
> hit->best_score as H3 methods that return the best-domain report values.
> Again, this could use some thought/discussion.

My reasoning for the change was that you can get at the best sequence
score by (at worst) iterating over the top sequences. Without the change
there was no way to get at the overall profile score, so that data was
lost. Arguably this is just one way to try and make the data from the
HMMer results accessible via the SearchIO interface.
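
To make the trade-off concrete, here is a minimal sketch in Python (not
BioPerl code). It assumes the "best" score means the best single-domain
score, as in the quoted proposal. The class and method names (Hit, Domain,
best_domain) are hypothetical and the numbers are made up for illustration.
The point is only that the best-domain value can be recovered by iterating,
while the full-sequence value has to be stored on the hit itself.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Domain:
    score: float   # per-domain bit score
    evalue: float  # per-domain E-value


@dataclass
class Hit:
    name: str
    raw_score: float     # full-sequence bit score (cannot be derived from domains)
    significance: float  # full-sequence E-value
    domains: List[Domain] = field(default_factory=list)

    def best_domain(self) -> Optional[Domain]:
        """Return the highest-scoring domain, or None if the hit has no domains."""
        return max(self.domains, key=lambda d: d.score, default=None)


# Illustrative, made-up values; not taken from any real HMMER report.
hit = Hit("example_model", raw_score=120.5, significance=1e-30,
          domains=[Domain(40.2, 1e-8), Domain(75.1, 1e-18)])
best = hit.best_domain()
```

With this layout the full-sequence values stay on the hit, and the
best-domain values remain reachable without a dedicated field.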

> I was not a part of that conversation either and I'm also operating  
> under a similar assumption about what "integrating the hmmer.pm  
> parser" means.  I too am confused about the statement regarding  
> modularization; I assume Kai meant that next_result would leverage the  
> HMMER version number (which it already grabs) to guide the appropriate  
> parsing of the datafile.  Not thinking about this too carefully, it  
> might be as simple as:
> 
> next_result{
> 	version = get_hmmer_version
> 	if version == 2
> 		parse V2 report file
> 	if version == 3
> 		parse V3 report file
> }
> 
> To make the code a bit more manageable, the various version parsers
> could be split out into independent subroutines.
> 
> Kai, is this along the lines of what you were thinking?

Yes, this is more or less what I meant. But I agree that we first want
to get the hmmer3 parser sorted out and working nicely. More test cases
for the parser would be nice; I just got sidetracked by another bug
affecting my code.
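
For reference, the dispatch idea from the quoted pseudocode could be sketched
roughly like this in Python (again, not BioPerl code). The function names
(get_hmmer_version, parse_v2, parse_v3) are hypothetical, the parsers are
stubs, and the sketch assumes the report carries a banner of the form
"HMMER <major>.<minor>" near the top.

```python
import re


def get_hmmer_version(report: str) -> int:
    """Extract the major version from a 'HMMER 3.0'-style banner (assumed format)."""
    match = re.search(r"HMMER\s+(\d+)\.", report)
    if match is None:
        raise ValueError("no HMMER version banner found")
    return int(match.group(1))


def parse_v2(report: str) -> dict:
    # Stub standing in for a real HMMER2 report parser.
    return {"format": "hmmer2"}


def parse_v3(report: str) -> dict:
    # Stub standing in for a real HMMER3 report parser.
    return {"format": "hmmer3"}


# One independent parser per major version, looked up by version number.
PARSERS = {2: parse_v2, 3: parse_v3}


def next_result(report: str) -> dict:
    """Pick the parser matching the report's HMMER version and run it."""
    version = get_hmmer_version(report)
    try:
        parser = PARSERS[version]
    except KeyError:
        raise ValueError(f"unsupported HMMER major version: {version}")
    return parser(report)
```

Keeping the parsers in a lookup table means supporting another report
version only requires adding one entry, and an unknown version fails loudly
instead of being parsed with the wrong rules.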

Cheers,
Kai

-- 
Dipl.-Inform. Kai Blin         kai.blin at biotech.uni-tuebingen.de
Interfakultäres Institut für Mikrobiologie und Infektionsmedizin
Abteilung Mikrobiologie/Biotechnologie
Eberhard-Karls-Universität Tübingen
Auf der Morgenstelle 28                 Phone : ++49 7071 29-78841
D-72076 Tübingen                        Fax : ++49 7071 29-5979
Deutschland
Homepage: http://www.mikrobio.uni-tuebingen.de/ag_wohlleben



