[BioPerl] Re: [Bioperl-l] gff_string on an HSPI object is not Bio::DB::GFF friendly

Scott Cain cain at cshl.org
Mon Jan 12 11:18:53 EST 2004


Aaron,

I really doubt that the current release of GBrowse supports relative
coordinates as described by both you and Allen.  I can't say for sure,
though, because I am still developing a set of test data.

As for chado, it should actually be fairly easy to adapt it to work with
relative coordinates.  The main change (for me) would be in the gbrowse
chado adaptor, which assumes that every feature has the 'top' feature as
its 'srcfeature' (i.e., all features are laid directly on the
chromosome/arm/contig/whatever).  It does that because that is how the
fruitfly people use it, and so that was the data I had when developing
the adaptor.
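
To make the idea concrete, here is a minimal sketch of resolving relative
coordinates up a chain of parent features until the top feature is reached.
It is not code from chado or gbrowse: the Feature class, the to_absolute
function and the field names are hypothetical, and the handling of a
minus-strand parent (child position 1 counts back from the parent's end) is
only one possible convention, which is exactly the ambiguity Allen raises.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Feature:
    """A located feature; coordinates are 1-based and inclusive, as in GFF."""
    name: str
    start: int
    end: int
    strand: int = 1               # +1 or -1, relative to the parent
    parent: Optional[str] = None  # None marks the top feature (hypothetical field)


def to_absolute(name: str, features: Dict[str, Feature]) -> Tuple[str, int, int, int]:
    """Walk up the parent chain; return (top_name, start, end, strand)."""
    f = features[name]
    start, end, strand = f.start, f.end, f.strand
    while f.parent is not None:
        p = features[f.parent]
        if p.strand >= 0:
            # Parent reads left to right: shift by the parent's start.
            start, end = p.start + start - 1, p.start + end - 1
        else:
            # ASSUMPTION: on a minus-strand parent, child position 1 is the
            # parent's end, so count backwards and swap start and end.
            start, end = p.end - end + 1, p.end - start + 1
        strand *= p.strand
        f = p
    return f.name, start, end, strand
```

With a gene at 1000..2000 on the minus strand of a contig, an exon given as
1..100 relative to that gene would come out as 1901..2000 on the minus
strand under this convention.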

If relative coordinates would make chado useful for you, let me know
(and send me sample GFF3 data) and I will work on it.  Otherwise, it
will go in the TODO file.

Thanks,
Scott 

On Fri, 2004-01-09 at 17:22, Allen Day wrote:
> We don't support this in the chado load_gff3.pl script, but it wouldn't be
> very difficult to add handling of simple cases.  I am concerned though
> about difficulties handling potential ambiguity wrt the strandedness of
> relative coordinates.
> 
> I assume by relative coordinates here, you mean you're describing a
> feature's position in terms of the position of another feature which is
> itself described in absolute coordinates (or is relative to a feature
> which is).
> 
> -Allen
> 
> 
> 
> On Fri, 9 Jan 2004, Aaron J.Mackey wrote:
> 
> > Hi Scott,
> > 
> > Thanks for the quick reply, but that wasn't exactly the nature of the
> > question.  The question was whether (apart from Gap attributes)
> > gbrowse, BDGFF, and/or, specifically, the load_gff.pl variants know the
> > rest of GFF3, namely whether they accept input GFF3 with features that
> > are not in absolute reference coordinates but in relative
> > coordinates.  And is that ability in release 1.58, or in some CVS branch
> > I can access (code that lives quietly in the depths of Lincoln's hard
> > drive doesn't count)?
> > 
> > Thanks,
> > 
> > -Aaron
> > 
> > On Jan 9, 2004, at 4:47 PM, Scott Cain wrote:
> > 
> > > > OK, I am going to answer this, but if I am wrong, I'm sure Lincoln will
> > > > correct me.  I don't think gbrowse or BDGFF knows how to deal with
> > > > CIGAR lines in Gap attributes yet.  For the time being it is safer to
> > > > continue to put separate HSPs on separate GFF lines.
> > >
> > > Scott
> > >
> > >
> > > On Fri, 2004-01-09 at 16:42, Aaron J.Mackey wrote:
> > >> Forgive me for a stupid question, but does GBrowse (v1.58) now support
> > >> GFF3?  Namely, can I have start/stops in sub-feature coordinates in my
> > >> input GFF3 and expect bp_load_gff.pl to behave properly (i.e. generate
> > >> "canonical" top-level coordinates for storage)?  I didn't see anything
> > >> in the documentation, so I was surprised to see some of the words in
> > >> these posts ...
> > >>
> > >> Thanks
> > >>
> > >> On Jan 9, 2004, at 4:09 PM, Mark Wilkinson wrote:
> > >>
> > >>> Cool.  I'm heavily into making the HSPs output proper GFF3 today for
> > >>> some of the Gbrowse tools that I have been working on, so I will jump
> > >>> in and do this over the next day or two.
> > >>>
> > >>> Cheers!
> > >>>
> > >>> Mark
> > >>>
> > >>> On Fri, 2004-01-09 at 14:49, Scott Cain wrote:
> > >>>> I think everything you wrote below is correct.  As far as I know, only
> > >>>> Allen and I have been working on BTGFF's GFF3 code, and we haven't
> > >>>> touched the alignment portion, so I am not surprised that it is wrong.
> > >>>> I suppose fixing BTGFF may break some tools, but I know that the chado
> > >>>> loader I wrote will handle it correctly :-)
> > >>>>
> > >>>> Thanks,
> > >>>> Scott
> > >>>>
> > >>>>
> > >>>> On Fri, 2004-01-09 at 15:45, Mark Wilkinson wrote:
> > >>>>> On Fri, 2004-01-09 at 11:22, Scott Cain wrote:
> > >>>>>
> > >>>>>>   - be sure to use a SO term for the type (ie, match or one of its
> > >>>>>> children)
> > >>>>>
> > >>>>> So... actually the existing implementation of GFF3 in bioperl
> > >>>>> from Bio::Tools::GFF->new(-gff_version => 3)
> > >>>>> does not generate correctly formatted GFF3 for alignment features,
> > >>>>> yeah?
> > >>>>>
> > >>>>> e.g. for column 9 of an alignment feature I get:
> > >>>>>
> > >>>>> 	Target=gi|2828774:54232..54206
> > >>>>>
> > >>>>> whereas I think I should be getting
> > >>>>>
> > >>>>> 	Target=gi|2828774+54232+54206
> > >>>>>
> > >>>>> In addition, it passes through all sorts of other tags that begin
> > >>>>> with capital letters:
> > >>>>>
> > >>>>> 	Bits=46.1;FracId=0.962962962962963
> > >>>>>
> > >>>>> these should be
> > >>>>>
> > >>>>> 	bits=46.1;fracId=0.962962962962963
> > >>>>>
> > >>>>> if I am reading the spec correctly.
> > >>>>>
> > >>>>> Finally, the column-3 term that comes out is "similarity", but it
> > >>>>> should be one of the *match terms.  Is that also correct?
> > >>>>>
> > >>>>> Please confirm that I am interpreting the GFF3 spec correctly for
> > >>>>> these Alignment features and I would be happy to go in and fix
> > >>>>> things (a.k.a. break everyone else's tools ;-) )
> > >>>>>
> > >>>>> Cheerio!
> > >>>>>
> > >>>>> Mark
> > >>>>>
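
The column-9 changes Mark describes above can be sketched as a small
rewriting function.  This is only an illustration of his reading of the
spec, not code from Bio::Tools::GFF: the fix_attributes name is
hypothetical, the reserved-tag list is an assumption about which
capitalised tags the GFF3 spec keeps for itself, and the start/end order is
left exactly as it appears in his example.

```python
import re

# ASSUMPTION: tags reserved by the GFF3 spec (these keep their capital letter).
RESERVED = {"ID", "Name", "Alias", "Parent", "Target", "Gap", "Derives_from",
            "Note", "Dbxref", "Ontology_term", "Is_circular"}

# Old-style target value as shown in the thread: <id>:<start>..<end>
OLD_TARGET = re.compile(r"^(?P<id>.+):(?P<start>\d+)\.\.(?P<end>\d+)$")


def fix_attributes(column9: str) -> str:
    """Rewrite one GFF column-9 string in the way Mark proposes."""
    fixed = []
    for pair in column9.split(";"):
        if not pair:
            continue  # tolerate a trailing or doubled semicolon
        tag, _, value = pair.partition("=")
        if tag == "Target":
            m = OLD_TARGET.match(value)
            if m:
                # id:start..end becomes id+start+end; order is unchanged
                value = "+".join((m["id"], m["start"], m["end"]))
        elif tag not in RESERVED and tag[:1].isupper():
            # Non-reserved tags should start with a lowercase letter.
            tag = tag[0].lower() + tag[1:]
        fixed.append(f"{tag}={value}")
    return ";".join(fixed)
```

Applied to the example in this message, Target=gi|2828774:54232..54206 with
Bits and FracId tags would come out as Target=gi|2828774+54232+54206 with
bits and fracId.  The column-3 term ("similarity" versus one of the *match
terms) is not touched by this sketch.
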
> > >>> --  
> > >>> Mark Wilkinson <markw at illuminae.com>
> > >>> Illuminae
> > >>>
> > >>> _______________________________________________
> > >>> Bioperl-l mailing list
> > >>> Bioperl-l at portal.open-bio.org
> > >>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
> > >>>
> > > --  
> > > ------------------------------------------------------------------------
> > > Scott Cain, Ph. D.                                         cain at cshl.org
> > > GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
> > > Cold Spring Harbor Laboratory
> > >
> > 
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                         cain at cshl.org
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory


