[Bioperl-l] Question about embl format
    Lincoln Stein 
    lstein at cshl.org
       
    Fri Apr 18 17:07:25 EDT 2003
    
    
  
The SO (Sequence Ontology) terms tend to be very long, although the most 
common ones have short synonyms that often (but not always) match the 
GenBank/EMBL feature table tags.  What I *could* do is replace the SO type 
tags with their accession numbers (SO:XXXXXX) and place the full name in a 
/note qualifier, as you suggest.
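
A rough sketch of how that round trip might work, written in Python purely as a language-neutral illustration (this is not BioPerl code). The function names, the "SO_term=" note prefix, and the accession "SO:9999999" are all hypothetical placeholders, and the name-to-accession table is supplied by the caller:

```python
# Hypothetical sketch: use the accession as the feature key and keep the
# full term name in a /note qualifier so that it can be recovered on read.

NOTE_PREFIX = "SO_term="  # assumed convention, not an existing standard


def encode_feature(term_name, qualifiers, accessions):
    """Return (key, qualifiers) with the accession as the feature key."""
    out = dict(qualifiers)
    notes = list(out.get("note", []))
    notes.append(NOTE_PREFIX + term_name)
    out["note"] = notes
    return accessions[term_name], out


def decode_feature(key, qualifiers):
    """Recover the full term name from the /note qualifier, if present."""
    out = dict(qualifiers)
    name = key  # fall back to the key when no marker note is found
    remaining = []
    for note in out.get("note", []):
        if note.startswith(NOTE_PREFIX):
            name = note[len(NOTE_PREFIX):]
        else:
            remaining.append(note)
    if remaining:
        out["note"] = remaining
    else:
        out.pop("note", None)
    return name, out


if __name__ == "__main__":
    table = {"very_long_key_string": "SO:9999999"}  # placeholder accession
    key, quals = encode_feature("very_long_key_string", {}, table)
    print(key, quals)
    print(decode_feature(key, quals))
```

Because the full name travels in the qualifier, nothing is lost at write time, which is the property that truncation alone would break.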
This would be a deep change in the API, since the primary_tag could then be an 
ontology term object rather than a string.  The best way to ensure backward 
compatibility with other people's code would be to overload string 
conversion in the ontology term object so that it produces the term label.
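
To illustrate the idea, here is a minimal sketch in Python (again only an analogy, not BioPerl code; in Perl the same effect would come from the overload pragma's '""' key). The OntologyTerm class, its fields, and the accession value are hypothetical:

```python
class OntologyTerm:
    """Hypothetical stand-in for an ontology term object."""

    def __init__(self, accession, label):
        self.accession = accession
        self.label = label

    def __str__(self):
        # String conversion yields the term label, so code that treats
        # the primary tag as a plain string keeps working.
        return self.label

    def __eq__(self, other):
        # Compare by label so that term == "exon" behaves like a string test.
        return str(self) == str(other)

    def __hash__(self):
        # Hash like the label string to stay consistent with __eq__.
        return hash(self.label)


if __name__ == "__main__":
    tag = OntologyTerm("SO:9999999", "exon")  # placeholder accession
    if tag == "exon":  # legacy-style string comparison
        print("primary tag is %s (%s)" % (tag, tag.accession))
```

Old code that compares or prints the tag sees only the label, while newer code can still reach the accession through the object.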
Or we could reserve this type of change for BioPerl 2.
Lincoln
On Friday 18 April 2003 03:54 am, Ewan Birney wrote:
> On Thu, 17 Apr 2003, Lincoln Stein wrote:
> > OK, so what to do about primary_tags that are >= 15 letters, since
> > BioPerl doesn't enforce a size limit on primary_tags?  If I implement
> > truncation at the write_seq level, then we'll lose round-tripping.
>
> What about coming up with a shorter tag in the database? Or is that a bad
> idea?
>
> Don't really know what to do. We could have a convention about truncating
> the key and then have a
>
>   /note="key=very_long_key_string"
>
> in the qualifiers
-- 
========================================================================
Lincoln D. Stein                           Cold Spring Harbor Laboratory
lstein at cshl.org			                  Cold Spring Harbor, NY
========================================================================