[Bioperl-l] Bio::SearchIO::hmmer hsp behaviour
cjfields at uiuc.edu
Wed Jul 5 15:36:33 UTC 2006
Okay, I managed to figure out what the problem was. I committed a fix to
CVS for the initial bug (Selvi's missing hits). The parser still returns
one HSP per hit for now; I think it will take a bit more effort to get
BLAST-like multiple HSPs per hit up and running.
Selvi, update from CVS to see if that works.
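In the meantime, since each domain still comes back as its own hit with
one HSP, something like the following might do as a stopgap for regrouping
domains per model and ordering them along the query. This is only a rough
sketch and not what is in CVS: group_domains_by_model is a made-up helper,
the records are plain hashes, and the model names and coordinates are
invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): group per-domain records by
# model name and order each group by start position on the query.
# Each record is a plain hash: { model, start, end, score }.
sub group_domains_by_model {
    my @domains = @_;
    my %by_model;
    push @{ $by_model{ $_->{model} } }, $_ for @domains;
    for my $model (keys %by_model) {
        $by_model{$model} =
            [ sort { $a->{start} <=> $b->{start} } @{ $by_model{$model} } ];
    }
    return \%by_model;
}

# Invented example records. With SearchIO these would be filled in from
# $hit->name, $hsp->start('query'), $hsp->end('query') and $hsp->score
# while looping over next_result / next_hit / next_hsp.
my @domains = (
    { model => 'model_A', start => 210, end => 232, score => 25.1 },
    { model => 'model_B', start => 15,  end => 90,  score => 61.4 },
    { model => 'model_A', start => 120, end => 142, score => 30.7 },
);

my $grouped = group_domains_by_model(@domains);
for my $model (sort keys %$grouped) {
    my @ranges = map { "$_->{start}-$_->{end}" } @{ $grouped->{$model} };
    printf "%s: %s\n", $model, join(', ', @ranges);
}
```

Keeping the records as plain hashes means this does not depend on how the
parser ends up representing multiple HSPs per hit later on.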
> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Chris Fields
> Sent: Friday, June 30, 2006 12:44 PM
> To: Sendu Bala; Jason Stajich
> Cc: bioperl-l at lists.open-bio.org list
> Subject: Re: [Bioperl-l] Bio::SearchIO::hmmer hsp behaviour
> I'll try looking at it this weekend. A suggested workaround is to
> either set -A to 0 for no alignments or set it to a high number to
> retrieve all of them. It's pretty serious, as the error silently
> drops those domains, so anyone using an automated annotation
> pipeline would miss it unless they also check the raw output.
> You could design a SearchIO::hmmpfam parser then expand it to take in
> hmmsearch output at a later point, or keep them separate. I like the
> idea of having modules that are more specific about what they parse;
> seems at some point you reach serious code bloat and maintenance
> becomes an issue. Look at SearchIO::blast; it parses various text
> BLAST output very well but with some serious obfuscation. Just don't
> know how productive it would be to separate out the PSI-BLAST and
> bl2seq stuff since they are pretty close to a standard BLAST
> report... oh well.
> To Jason : good luck on your move. Drop us a line here to let us
> know everything went well.
> On Jun 30, 2006, at 11:14 AM, Sendu Bala wrote:
> > Chris Fields wrote:
> >> It may have been just simpler to have it be one HSP (domain) per Hit
> >> (model) as that's how the reports are generated. My reasoning was
> >> that using the one domain per model made sense based on what you are
> >> actually trying to do, which is annotate the sequence based on the
> >> order the domain appears. Most others may not view it that way, which
> >> is fine. One can always gather the relevant HSPs, convert to
> >> seqfeatures, then sort them if order is important, I suppose.
> >> I would say, if the overall consensus is to modify it to have
> >> multiple domain hits per model (similar to BLAST) then Sendu should
> >> go ahead and make those changes then announce it on the list so no
> >> one can gripe about it later. My main concern was not changing things
> >> so dramatically that it'll break for someone
> > Going on your earlier suggestion, I was thinking about making
> > SearchIO::hmmpfam instead, which would get used if you set the
> > format to 'hmmpfam' instead of the generic 'hmmer' when making a
> > SearchIO. I suppose I would make a SearchIO::hmmsearch as well, if
> > necessary.
> > [...]
> >> that the reported bug about missing hits (Bug 2036) is fixed as well.
> > However, having never made a SearchIO plugin before, it will be some
> > time before I get my head around it. I'll want to make one the current
> > HOWTO:SearchIO way before I can think about doing it a better way
> > (hashes) as well. So I can say I'll make a move on this at some point
> > in the future, but if someone wants to fix Bug 2036 in the meantime,
> > they are welcome to. Again as suggested, my priority is Bio::Map right
> > now.
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign