[Bioperl-l] Re: How to express 'histogram' data in GFF3

Matthew Vaughn vaughn at cshl.org
Fri Mar 25 07:39:36 EST 2005


I posted a question about this a few days ago and have since worked out 
what appears to be a working answer, thanks to some advice from Scott 
Cain. I thought I'd share the approach, which works for me with BioPerl 
1.5 and Gbrowse 1.62.

For a given bit of histogram-type data, proper GFF2 formatting was as 
follows:

ChrII	fwd	chip1	0	100	45.4	+	.	chip1 ChrII:fwd

Contrast this with the GFF3 format for the same data point:

ChrII	fwd	chip1	0	100	45.4	+	.	ID=chip1:ChrII:fwd

Basically, I merged what used to be the group field into an ID tag. 
Technically, the ':' character should be URL-escaped (percent-encoded 
as %3A), leaving the ID tag like so:

ChrII	fwd	chip1	0	100	45.4	+	.	ID=chip1%3AChrII%3Afwd
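
For anyone converting a whole file, the transformation above can be 
sketched in a few lines of Python. This is only an illustration, not 
part of BioPerl or Gbrowse: the function name gff2_to_gff3 is made up, 
and it assumes the GFF2 group field has the simple "class name" shape 
shown above.

```python
from urllib.parse import quote


def gff2_to_gff3(line: str) -> str:
    """Convert one GFF2 line to GFF3 by folding the group field into an ID tag.

    Hypothetical helper; assumes a group field of the form "class name".
    """
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(cols)}")
    # GFF2 group field, e.g. "chip1 ChrII:fwd" -> class "chip1", name "ChrII:fwd"
    group_class, group_name = cols[8].split(None, 1)
    feature_id = f"{group_class}:{group_name}"
    # safe="" tells quote() to percent-encode ':' as well (':' -> %3A)
    cols[8] = "ID=" + quote(feature_id, safe="")
    return "\t".join(cols)


gff2 = "ChrII\tfwd\tchip1\t0\t100\t45.4\t+\t.\tchip1 ChrII:fwd"
print(gff2_to_gff3(gff2))
```

Only the ninth column changes; the first eight are passed through 
untouched.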

Does the fact that the ID is not unique violate the GFF3 spec? That's a 
tough question that I leave to the experts.

The aggregators line in the gbrowse configuration file is the same for 
GFF2 and GFF3; in this case:

aggregators = agg1{chip1:fwd}

Scott suggested that I might need to create a region feature, then 
assign my histogram data points to it as children using the new Parent 
attribute of GFF3. However, it appears that the custom aggregator takes 
care of this. Clicking on the histogram in my current genome browser 
yields a gbrowse_detail page with all the histogram data points within 
the currently displayed span of coordinates.
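
For comparison, here is a rough Python sketch of what the region/Parent 
layout Scott described might look like. I have not tested this layout 
against Gbrowse. The function name, the "hist1" ID, the "region" type 
for the parent line, and the second data point are all made up for 
illustration.

```python
def histogram_with_parent(seqid, source, ftype, parent_id, bins):
    """Return GFF3 lines: one parent feature plus one child line per bin.

    bins is a list of (start, end, score) tuples. Hypothetical helper;
    the "region" type used for the parent line is an assumption.
    """
    # The parent spans from the first bin's start to the last bin's end
    start = min(b[0] for b in bins)
    end = max(b[1] for b in bins)
    lines = [
        f"{seqid}\t{source}\tregion\t{start}\t{end}\t.\t+\t.\tID={parent_id}"
    ]
    # Each data point refers back to the parent through the Parent attribute
    for b_start, b_end, score in bins:
        lines.append(
            f"{seqid}\t{source}\t{ftype}\t{b_start}\t{b_end}\t{score}\t+\t.\t"
            f"Parent={parent_id}"
        )
    return lines


# The first bin is the data point from this post; the second is invented.
bins = [(0, 100, 45.4), (101, 200, 12.0)]
for gff3_line in histogram_with_parent("ChrII", "fwd", "chip1", "hist1", bins):
    print(gff3_line)
```

Under this layout only the parent line carries an ID, so the question 
of repeated IDs would not come up.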

--
Matthew W. Vaughn, Ph.D.
Cold Spring Harbor Laboratory
Delbruck Laboratory / Martienssen Group
1 Bungtown Road
Cold Spring Harbor, NY 11724

phone: (516) 367-8469