[Bioperl-l] Why does Bio::DB::GFF::Feature::gff3_string swap start and stop coordinates??

Mark Johnson johnsonm at gmail.com
Mon May 21 20:48:52 UTC 2007


Check the test data for Glimmer2 and Glimmer3.  They both predict one
large gene, I'd guess covering most of the sequence, in frame +1.
That's probably a bogus prediction, but that's not up to the parser to
decide.  I hadn't noticed it until recently.

I sent a patch via Bugzilla that swaps the coordinates when start > end
and strand > 0.
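
For anyone skimming the thread, the rule amounts to roughly the following.
This is only a sketch of the logic, not the patch itself: the Feature class
and the normalize_plus_strand name are made up for illustration, and it is
written in Python just to keep it short. Minus-strand features are left
alone here, since the thread has not settled what to do with them.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Feature:
    """Hypothetical stand-in for a predicted gene feature."""
    start: int
    end: int
    strand: int  # 1 = plus, -1 = minus, 0 = unknown


def normalize_plus_strand(feature):
    """Swap start and end only when start > end on the plus strand.

    The strand itself is left unchanged; features on the minus strand
    or with unknown strand are returned untouched.
    """
    if feature.strand > 0 and feature.start > feature.end:
        return replace(feature, start=feature.end, end=feature.start)
    return feature
```

Note this differs from the behaviour Chris describes below, where a
start > stop location is reversed and the strand is flipped as well.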

On 5/21/07, Chris Fields <cjfields at uiuc.edu> wrote:
> On May 16, 2007, at 2:11 PM, Mark Johnson wrote:
>
> > On 5/8/07, Chris Fields <cjfields at uiuc.edu> wrote:
> >> I believe all seqfeature location coordinates are designed to have
> >> start < stop for consistency; in cases where the strand matters (CDS,
> >> gene, etc.) then the strand is set to 1 or -1.  When start > stop,
> >> the two are reversed and the strand is flipped; at least that's the
> >> way locations are set up in BioPerl.
> >>
> >> chris
> >
> >     Oh yeah?  I always tend to ensure that (start < stop), regardless
> > of strand, when working with sequence features...the other day, I
> > caught Glimmer2 emitting a prediction on the plus strand with start >
> > stop.  I was going to work up a patch for the parser, but I wonder,
> > should I just force everything to start < stop?  Or only predictions
> > on the plus strand?  Should all the parsers for all the ab initio
> > predictors ensure they emit features with coordinates like this?
>
> Odd that it would predict a start > stop on the plus strand, though
> it may be corrected in Glimmer3.  Does the same prediction show up in
> Glimmer3?
>
> chris
>


