[Bioperl-l] SGD GFF3 file available soon

Scott Cain cain at cshl.org
Wed Feb 18 22:14:58 EST 2004


Stan,

In your sample GFF, the seqid in the first column has to correspond to
an ID, usually one that is also defined in the same GFF file.  For
instance, if the features in the GFF file are all on chromosome I, the
first column of all of those lines would hold the same value as the ID
declared for chromosome I.  For example:

I	SGD	chromosome	1	230211	.	.	.	ID=I;description=Sequence "I"
I	SGD	telomere	1	801	.	-	0	ID=TEL01L;description=I left telomeric region;db_xref=SGD:S0028862
I	SGD	repeat_family	1	62	.	-	0	ID=TEL01L-TR;name=Telomeric Repeat;description=I left telomere TG(1-3);db_xref=SGD:S0028864
...etc...
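
The rule above can be checked mechanically. What follows is only a
minimal sketch in Python, not part of any BioPerl or GMOD tool: the
function names parse_attributes and undeclared_seqids are made up for
illustration, and the "chrI" line is an invented mismatch added to show
what a failure looks like. It assumes tab-separated, nine-column lines
and treats any ID= attribute in the file as a valid landing point for a
seqid.

```python
from urllib.parse import unquote


def parse_attributes(column9):
    """Split a GFF3 attributes column (tag=value;tag=value) into a dict."""
    attributes = {}
    for pair in column9.strip().split(";"):
        if "=" in pair:
            tag, value = pair.split("=", 1)
            # GFF3 uses URL-style escapes in attribute values
            attributes[tag.strip()] = unquote(value.strip())
    return attributes


def undeclared_seqids(gff_lines):
    """Return, sorted, the column-1 seqids that no feature declares as its ID."""
    seqids = set()
    declared_ids = set()
    for line in gff_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        columns = line.split("\t")
        if len(columns) != 9:
            continue  # this sketch silently ignores malformed lines
        seqids.add(columns[0])
        feature_id = parse_attributes(columns[8]).get("ID")
        if feature_id is not None:
            declared_ids.add(feature_id)
    return sorted(seqids - declared_ids)


sample = [
    'I\tSGD\tchromosome\t1\t230211\t.\t.\t.\tID=I;description=Sequence "I"',
    "I\tSGD\ttelomere\t1\t801\t.\t-\t0\tID=TEL01L;description=I left telomeric region;db_xref=SGD:S0028862",
    # Hypothetical bad line: "chrI" is never declared as an ID in this file
    "chrI\tSGD\trepeat_family\t1\t62\t.\t-\t0\tID=TEL01L-TR;name=Telomeric Repeat",
]

print(undeclared_seqids(sample))  # ['chrI']
```

An empty list means every seqid in the first column matches an ID
declared somewhere in the same file.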

Sorry I didn't point that out before--when I looked at the Excel sheet
you sent me before, I didn't see all of it (I am too used to working
with plain text files).

Scott

-------------Original Message---------------
> Date: Wed, 18 Feb 2004 14:09:27 -0800
> From: Stan Dong <qdong at genome.stanford.edu>
> Subject: [Bioperl-l] SGD GFF3 file available soon
> To: bioperl-l at bioperl.org
> Message-ID: <1DE37948-625F-11D8-89C8-000A956A0A36 at genome.stanford.edu>
> Content-Type: text/plain; charset=US-ASCII; format=flowed
> 
> Hi,
> 
> I am a programmer at the Saccharomyces Genome Database (SGD,
> http://www.yeastgenome.org/). I am developing a flat file
> in GFF3 format ( http://song.sourceforge.net/gff3-jan04.shtml ) to
> represent the sequence features of the yeast genome, and it will soon
> be released on our ftp site. This is very useful because quite a few
> open source tools, such as Gbrowse, Chado etc., can take this file
> format as input.
> 
> I would like comments from people who are interested in doing similar
> things, and from those who have good or not-so-good experiences with
> GFF3 to share. For me, it took a while to get the specification done,
> especially making the third column (type) fully compatible with the
> Sequence Ontology (SO). One thing I like about GFF3 is the last column
> (attributes), where you can put all kinds of useful information, such
> as, in our case, GO annotation and a nice description of a feature. An
> example SGD GFF3 file can be viewed here.
> 
> ftp://genome-ftp.stanford.edu/pub/people/curator/GFF3Example.txt
> 
> Thanks,
> 
> Stan Dong
> Programmer, SGD

-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                         cain at cshl.org
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory


