[Bioperl-l] Re: [Bioperl-announce-l] an extension to Bio::SeqIO

Chris Mungall cjm at fruitfly.org
Wed Jun 18 10:15:44 EDT 2003



On Wed, 18 Jun 2003, Peili Zhang wrote:

> >>
> >> sequence data in any (rich) formats
> >> 	 |
> >> 	 | via Bio::SeqIO
> >> 	 v
> >     Bio::Seq->get_SeqFeatures()
> >         OR
> >      Collection of Bio::SeqFeatureI
> >
> >[Actually what's cooler I think is that you don't need Bio::Seq objects or
> >anything, just a set of Bio::SeqFeatureI objects. This would mean that
> >people could take their GFF files and turn them into chado IFF they are
> >rich enough.]
> >
>
> We do want the Bio::Seq objects. For instance, if the Bio::Seq object is a gene,
> we'll want to create a top-level feature of type 'gene' for it in chado, as well
> as loading its references as feature_pubs, accessions as feature_dbxrefs, and
> comments/descriptions/others as featureprops. Its gene model features
> (transcripts, exons, CDSs) will hang off the top-level feature.

Hi Peili

Can you give an example of where a Bio::Seq object is created for a gene?
If these are coming from GenBank, the Bio::Seq corresponds either to the
srcfeature (if it is a genomic DNA record) or to a transcript (if it is an
mRNA record).

I agree with Jason that we will mostly be populating from Bio::SeqFeatureI
objects; GFF3 is actually quite a nice match for chado.
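
To make the GFF3/chado match a little more concrete, here is a rough sketch of how one GFF3 line could be split into chado-style rows. It is plain Python rather than BioPerl, and it is not an actual loader. The function names (parse_gff3_line, to_chado_rows) are hypothetical, and the row layouts are a simplified assumption about the chado tables (feature.uniquename, featureloc.fmin/fmax/strand, feature_relationship), including the assumption that featureloc uses interbase coordinates and that Parent maps to a part_of relationship.

```python
from urllib.parse import unquote


def parse_gff3_line(line):
    """Split one tab-delimited GFF3 feature line into a dict."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError("a GFF3 feature line has exactly 9 columns")
    seqid, source, ftype, start, end, score, strand, phase, attrs = cols
    attributes = {}
    if attrs != ".":
        for pair in attrs.split(";"):
            if not pair:
                continue
            key, _, value = pair.partition("=")
            # Multiple values are comma-separated; special characters are URL-escaped.
            attributes[key] = [unquote(v) for v in value.split(",")]
    return {
        "seqid": seqid,
        "source": source,
        "type": ftype,
        "start": int(start),
        "end": int(end),
        "strand": strand,
        "attributes": attributes,
    }


def to_chado_rows(rec):
    """Map a parsed GFF3 record to simplified, assumed chado-style rows."""
    feature_id = rec["attributes"].get("ID", [None])[0]
    feature = {"uniquename": feature_id, "type": rec["type"]}
    featureloc = {
        "srcfeature": rec["seqid"],
        # Assumption: 1-based inclusive GFF3 start becomes a 0-based interbase fmin.
        "fmin": rec["start"] - 1,
        "fmax": rec["end"],
        "strand": {"+": 1, "-": -1}.get(rec["strand"], 0),
    }
    # Assumption: each GFF3 Parent becomes one part_of feature_relationship.
    relationships = [
        {"subject": feature_id, "type": "part_of", "object": parent}
        for parent in rec["attributes"].get("Parent", [])
    ]
    return feature, featureloc, relationships


if __name__ == "__main__":
    # Made-up example line, for illustration only.
    example = "X\tFlyBase\texon\t1001\t1500\t.\t-\t.\tID=exon1;Parent=mRNA1"
    for row in to_chado_rows(parse_gff3_line(example)):
        print(row)
```

The point of the sketch is that the type column, the location columns, and the ID/Parent attributes each land in one table without needing an enclosing Bio::Seq.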

By the way, where does GFF fit into the *IO framework? Right now it's a
Bio::Tools thing. Will there be a Bio::SeqFeatureIO?

> -peili
>
>


