[Bioperl-l] Unigene proposal and basic implementation

Tue, 16 Apr 2002 20:21:49 +1200

Hi Ewan,

>does the unigene file come with any actual sequence, or is it just
>clusters of IDs. And are those IDs dbEST ids?

It doesn't have the actual sequence; it is just a collection of IDs, in this
case GenBank/EMBL accession numbers.

>BTW - thinking about it, I suspect that we should have the following
>abstraction/namespace
>
>   Bio::Seq::ClusterI
>      : methods attached to ClusterI which mainly have
>      ->seq_ids(); # primary accession of sequences clustered
>      ->annotation(); # Bio::Annotation::Collection associated with the cluster
>and I think the name space for unigene is probably best not top level but
>instead
>   Bio::Seq::Unigene; # inherits from Bio::Seq::ClusterI
>   Bio::Seq::UnigeneIO;

At the moment, at the top level, I have:
Bio::UniGene.pm
Bio::UniGeneIO.pm
Bio::UniGene::unigene.pm

Would it work to make a Cluster namespace like so:
Bio::Cluster::ClusterI
Bio::Cluster::UniGene.pm
Bio::Cluster::UniGeneIO.pm
Bio::Cluster::UniGene::unigene.pm

Could this then be used for clusters of other types that are less directly
related to Seq? HomoloGene, for example, is (if I'm right) a kind of cluster of
UniGenes. Looking from the sequence up, it is logical to put UniGene in Seq,
but looking from UniGene down, it might fit better in something like Cluster.
As I've said, I'm fairly new to the layout of bioperl, so I don't know what
discussions of this nature have gone before...
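
To make the Cluster idea a bit more concrete, here is a rough sketch of how the
interface and one implementation might fit together. It is only an illustration
under some assumptions, not the current code: the constructor arguments, the
unigene_id() accessor and the example IDs are all made up, and a plain hash
stands in for the Bio::Annotation::Collection so that the snippet runs without
bioperl installed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical interface: declares what every cluster must provide.
package Bio::Cluster::ClusterI;

sub seq_ids    { die "seq_ids() must be implemented by a subclass\n" }
sub annotation { die "annotation() must be implemented by a subclass\n" }

# Hypothetical implementation holding a single UniGene cluster.
package Bio::Cluster::UniGene;
our @ISA = ('Bio::Cluster::ClusterI');

sub new {
    my ($class, %args) = @_;
    my $self = {
        unigene_id => $args{-unigene_id},
        seq_ids    => $args{-seq_ids}    || [],
        # Assumption: a plain hash stands in for Bio::Annotation::Collection.
        annotation => $args{-annotation} || {},
    };
    return bless $self, $class;
}

sub unigene_id { return $_[0]->{unigene_id} }

# Primary accessions of the clustered sequences (IDs only, no sequence data).
sub seq_ids    { return @{ $_[0]->{seq_ids} } }

sub annotation { return $_[0]->{annotation} }

package main;

# Placeholder identifiers, not real UniGene or GenBank/EMBL entries.
my $cluster = Bio::Cluster::UniGene->new(
    -unigene_id => 'Hs.0000',
    -seq_ids    => [ 'ACC00001', 'ACC00002' ],
    -annotation => { title => 'example cluster' },
);

print $cluster->unigene_id, ": ", join(", ", $cluster->seq_ids), "\n";
```

A HomoloGene-style class could then implement the same ClusterI methods and
return UniGene IDs from seq_ids() rather than sequence accessions, which is the
main reason a Cluster namespace appeals to me.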

Cheers, Andrew.