[Bioperl-l] Re: Exonerate vulgar lines; SearchIO model

Jason Stajich jason at cgt.duhs.duke.edu
Sun Sep 21 09:07:42 EDT 2003


On Sun, 21 Sep 2003, Ewan Birney wrote:

>
> I have added vulgar line parsing to the exonerate output. I made
> a simpler model than the cigar line parsing of having just the M
> state durations as being HSPs.

cool!  Is there a switch in SearchIO::exonerate to look for '^cigar'
versus '^vulgar' lines?
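
For anyone following along, here is a rough sketch of the "M states as
HSPs" idea. It is written in Python purely for illustration and is not the
SearchIO::exonerate code. It assumes the vulgar layout described in the
exonerate documentation (nine header fields after the "vulgar:" tag, then
repeating label / query-length / target-length triples), and the function
name m_segments is hypothetical.

```python
def m_segments(vulgar_line):
    """Return one (q_start, q_end, t_start, t_end) tuple per M block.

    Assumed layout (from the exonerate documentation):
      vulgar: q_id q_start q_end q_strand t_id t_start t_end t_strand score
              followed by (label, q_len, t_len) triples.
    """
    fields = vulgar_line.split()
    if not fields or fields[0] != "vulgar:":
        raise ValueError("not a vulgar line")

    q_pos, q_strand = int(fields[2]), fields[4]
    t_pos, t_strand = int(fields[6]), fields[8]
    # On the minus strand the coordinates count downwards.
    q_step = -1 if q_strand == "-" else 1
    t_step = -1 if t_strand == "-" else 1

    triples = fields[10:]
    if len(triples) % 3 != 0:
        raise ValueError("incomplete label/length triple")

    segments = []
    for i in range(0, len(triples), 3):
        label = triples[i]
        q_next = q_pos + q_step * int(triples[i + 1])
        t_next = t_pos + t_step * int(triples[i + 2])
        if label == "M":
            # Only match blocks become (ungapped) segments.
            segments.append((q_pos, q_next, t_pos, t_next))
        # Every label, matched or not, advances the running positions.
        q_pos, t_pos = q_next, t_next
    return segments
```

Gaps, introns and splice sites only move the running coordinates, so each
returned segment is gap-free by construction.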

>
> (this will get checked into the main trunk)
>
>
> This brings up an issue - should HSPs in the SearchIO objects
> be ungapped or gapped? It looks as if gapped cases are allowed
> - is this the case? Should we flag ungapped vs gapped (or is
> this done already somehow?)
>

Basically both are allowed.  That flag would be the gap count in the HSP,
I guess.  In general they should be ungapped, but we handle 'small gaps' in
the sense that FASTA or BLAST HSPs can contain gaps (whose location we
do have access to by virtue of the gap character in the homology line).

In fact the HitI->gaps call only collects the count of all the gaps for
the contained HSPs - a la exons on genomic DNA, where we might count gaps
in the (cDNA) exon alignment but not the overall (intron) gaps introduced
by the alignment.  What do you think should be done?
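
To make the distinction concrete, a toy sketch (Python, for illustration
only; hsp_gaps and hit_gaps are hypothetical names, not the BioPerl
methods): the hit-level count is just the sum of the gap characters inside
each HSP's aligned strings, so whatever lies between HSPs (e.g. introns)
never enters the total.

```python
def hsp_gaps(query_aln, hit_aln, gap_char="-"):
    """Count gap characters inside one HSP's pair of aligned strings."""
    return query_aln.count(gap_char) + hit_aln.count(gap_char)


def hit_gaps(hsps, gap_char="-"):
    """Sum the per-HSP gap counts for a hit.

    `hsps` is a list of (query_aln, hit_aln) string pairs. The sequence
    between HSPs is not represented at all, so it cannot be counted.
    """
    return sum(hsp_gaps(q, h, gap_char) for q, h in hsps)
```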

>
> I am starting to grok more why this event passing system is useful
> (abstracts out parsing from object creation, more graceful about
> partial information etc) but it does seem... quite a lot of
> scaffolding...
>

I agree - now that it has sort of gotten out there and we at least have
something to evaluate, I am game for looking at some refactoring.  Had to
make it work first...


>
> I guess we should make SeqIO work like this at some point, but
> that's definitely not in my critical path at the moment.
>
Ditto.

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

