[Bioperl-l] Bio::SearchIO::hmmer hsp behaviour

Chris Fields cjfields at uiuc.edu
Fri Jun 30 17:43:56 UTC 2006


I'll try looking at it this weekend.  A suggested workaround is to
either set -A so that no alignments are output or set it to a high
number to retrieve all of them.  It's pretty serious, as the error
silently drops those domains, so anyone using an automated annotation
pipeline would miss it unless they are also checking the raw output.
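
For anyone who wants to check the raw output in the meantime, here is a
minimal sketch of one way to do it.  It is written in Python only to
keep it self-contained, and it assumes the HMMER 2 style hmmpfam report
layout, where each query has a "Parsed for domains:" table whose rows
carry an "i/n" domain index in the second column.  The function name
count_reported_domains is hypothetical and not part of BioPerl or HMMER.

```python
import re

# Hypothetical helper, not part of BioPerl or HMMER.
# Assumption: HMMER 2 style hmmpfam text output, where every query has a
# "Parsed for domains:" table and each domain row has "i/n" in column 2.
DOMAIN_INDEX = re.compile(r"\d+/\d+")


def count_reported_domains(report_text):
    """Return one domain-row count per 'Parsed for domains:' table."""
    counts = []
    in_table = False
    rows = 0
    for line in report_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Parsed for domains:"):
            # A new table starts; close any table that is still open.
            if in_table:
                counts.append(rows)
            in_table = True
            rows = 0
            continue
        if not in_table:
            continue
        # The table ends at the alignment section or the record separator.
        if stripped.startswith("Alignments of top-scoring domains:") or stripped == "//":
            counts.append(rows)
            in_table = False
            continue
        fields = stripped.split()
        # Header, dashed rule, and blank lines do not match the i/n pattern.
        if len(fields) >= 2 and DOMAIN_INDEX.fullmatch(fields[1]):
            rows += 1
    if in_table:
        counts.append(rows)
    return counts
```

Comparing each count with the number of HSPs the parser returned for
the same query should flag any report where domains went missing.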

You could design a SearchIO::hmmpfam parser and then expand it to take
in hmmsearch output at a later point, or keep them separate.  I like
the idea of having modules that are more specific about what they
parse; it seems that at some point you reach serious code bloat and
maintenance becomes an issue.  Look at SearchIO::blast; it parses
various text BLAST output very well but with some serious obfuscation.
I just don't know how productive it would be to separate out the
PSI-BLAST and bl2seq stuff since they are pretty close to a standard
BLAST report... oh well.

To Jason: good luck on your move.  Drop us a line here to let us
know everything went well.

Chris

On Jun 30, 2006, at 11:14 AM, Sendu Bala wrote:

> Chris Fields wrote:
>> It may have been just simpler to have it be one HSP (domain) per Hit
>> (model) as that's how the reports are generated.  My reasoning was
>> that using the one domain per model made sense based on what you are
>> actually trying to do, which is annotate the sequence based on the
>> order the domains appear.  Most others may not view it that way,
>> which is fine.  One can always gather the relevant HSPs, convert
>> them to seqfeatures, then sort them if order is important, I suppose.
>>
>> I would say, if the overall consensus is to modify it to have
>> multiple domain hits per model (similar to BLAST) then Sendu should
>> go ahead and make those changes then announce it on the list so no
>> one can gripe about it later.  My main concern was not changing
>> things so dramatically that it'll break for someone
>
> Going on your earlier suggestion, I was thinking about making
> SearchIO::hmmpfam instead, which would get used if you set the
> format to 'hmmpfam' instead of the generic 'hmmer' when making a
> SearchIO. I suppose I would make a SearchIO::hmmsearch as well, if
> necessary.
>
>
> [...]
>> that the reported bug about missing hits (Bug 2036) is fixed as well.
>
> However, having never made a SearchIO plugin before, it will be some
> time before I get my head around it. I'll want to make one the current
> HOWTO:SearchIO way before I can think about doing it a better way
> (hashes) as well. So I can say I'll make a move on this at some
> point in the future, but if someone wants to fix Bug 2036 in the
> meantime, they are welcome to. Again as suggested, my priority is
> Bio::Map right now.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign