[Bioperl-l] Re: *major* error in genbank parser or am i just insane?

Brian King brian.king@animorphics.net
Fri, 9 Aug 2002 00:06:54 -0700 (PDT)


> But is this just random cruft from Genbank/EMBL that they didn't
> realise when they designed it or something deeper?

After long struggles with the join operator I finally
concluded that it's just a way to represent
hierarchical features in the flat feature table
structure.  The regions within the join usually
correspond to some other contiguous feature in the
same feature table.  I'm interested to know if someone
with more experience than me sees it the same way.  

Because of the ambiguities in the join operator my
ideal solution would be to not support the join syntax
at all, but to match up the joined feature with its
intended sub-features in the same table when parsing,
or at least create generic sub-features at the
contiguous regions on the join.  I'd make a real
hierarchical representation in the object model and
abandon the join syntax.  Unfortunately you'd have to
hard-code some biological knowledge to judge if a
corresponding sub-feature was really supposed to be
part of a joined feature.  I doubt that round-trip
preservation of the GenBank/EMBL record is necessary. 
You could write out the record in a format that has
hierarchical features and refer to the original record
as needed.  All of that would be pretty hard to
do, but I like to have an ideal in mind anyway.
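
To make the idea a little more concrete, here is a rough sketch in
Python (not BioPerl code) of the two fallbacks described above: build
generic sub-features for the contiguous regions of a join, then swap in
features from the same table when their coordinates match exactly.
The names Feature, parse_join, attach_matching and the "segment" key
are all made up for illustration.  The sketch only handles plain
a..b spans; complement(), fuzzy bounds such as <1..>200, single-base
positions and remote accessions are ignored.  Exact coordinate matching
is the naive stand-in for the biological knowledge mentioned above, and
it will miss cases such as a CDS region that covers only part of an exon.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class Feature:
    """A feature with optional children (hypothetical model, not BioPerl's)."""
    key: str
    start: int
    end: int
    sub_features: List["Feature"] = field(default_factory=list)


# Matches simple "start..end" spans only (assumption: no fuzzy bounds).
_SPAN = re.compile(r"(\d+)\.\.(\d+)")


def parse_join(key, location, sub_key="segment"):
    """Turn 'join(a..b,c..d)' into a parent feature with one child per span."""
    spans = [(int(a), int(b)) for a, b in _SPAN.findall(location)]
    if not spans:
        raise ValueError("no simple a..b spans found in %r" % location)
    parent = Feature(key, min(s for s, _ in spans), max(e for _, e in spans))
    # Generic sub-features at the contiguous regions of the join.
    parent.sub_features = [Feature(sub_key, s, e) for s, e in spans]
    return parent


def attach_matching(parent, table):
    """Replace generic children with table features at identical coordinates."""
    by_span = {(f.start, f.end): f for f in table}
    parent.sub_features = [
        by_span.get((child.start, child.end), child)
        for child in parent.sub_features
    ]
    return parent
```

For example, parse_join("CDS", "join(12..78,134..202)") gives a CDS
spanning 12 to 202 with two "segment" children, and attach_matching
then replaces any child whose coordinates equal those of another
feature in the table, leaving the rest as generic segments.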

Sorry I only have an analysis and no solution.

Regards,
Brian





