[Bioperl-l] The Feature changes which have broken compatibility

Allen Day allenday at ucla.edu
Wed Feb 2 18:02:49 EST 2005


you're right -- it's not in cvs.  i'll track down the changes and commit 
soon.  could be a few days, i'm in the middle of moving right now.

-allen


On Wed, 2 Feb 2005, Chris Mungall wrote:

> 
> 
> On Wed, 2 Feb 2005, Allen Day wrote:
> 
> > > I'm not sure how many people are aware how this works - here is what
> > > happens under the hood:
> > >
> > > 1) Bio::FeatureIO::gff is initialized. This entails establishing a
> > > connection to sourceforge to download the sequence ontology. This of
> > > course will not work if you are offline. Even if you are online, it
> > > doesn't seem to work for me and is of course dependent on the vagaries of
> > > whether sourceforge and its mirror are working. Even if this
> > > all works, there is an initial start-up lag which may be unacceptable to
> > > some applications. Also, not everyone using bioperl is in a country with
> > > fast internet access and a local-ish sourceforge mirror.
> > >
> > > In addition, it hardcodes metadata about the ontology in bioperl (see
> > > Bio::Ontology::DocumentRegistry) which is asking for trouble.
> > >
> > > In addition, it downloads the ontology in a legacy deprecated format,
> > > because that's all bioperl currently supports. Also asking for trouble
> > > further down the line.
> > >
> > > Why is it doing all this? Purely in order to check that the feature types
> > > provided in the GFF file are valid SOFA terms. Look, I already know all
> > > the GFF3 files I want to parse have valid SOFA types. If I want to check,
> > > I'll do this myself thanks, I don't want bioperl to secretly do it for me
> > > in a hokey way that requires me being online and in the USA, every single
> > > time I parse a file.
> > >
> > > In fact, there is already a script for validating a GFF3 file, in the SO
> > > software repository (which uses Bio::Tools::GFF) which does a much more
> > > thorough job, checking feature parentage too.
> > >
> > > What happened to modularity?  You know, parsing in a parser, verification
> > > in a verifier.
> > >
> > > 2) it starts parsing features, assigning Bio::Ontology::Term objects to
> > > each feature (the type). This entails having Graph::Directed, which is
> > > what Jason is alluding to. Not that bad in itself, but unnecessary for
> > > the majority of apps that just want to parse GFF.
> > >
> > > Is it just me that thinks this is madness? Can someone please make it
> > > stop?
> >
> > Correct, but this behavior is disabled by default.  From the
> > FeatureIO/gff.pm POD:
> >
> >   my $featureOut = Bio::FeatureIO->new(-format => 'gff',
> >     -version => 3,
> >     -fh => \*STDOUT,
> >     -validate_terms => 1, #boolean. validate ontology
> >                           #terms online?  default 0 (false).
> >   );
> >
> > If you don't turn this on, it merely creates a
> > Bio::Annotation::OntologyTerm object with the identifier or term name from
> > the GFF file -- no validation attempted.
> 
> This doesn't seem to be the case - are you sure you have this code checked
> in?
> 
> > Furthermore, if you do want to validate against the SO/SOFA ontologies,
> > but you don't want to rely on the live ontologies on the web, you can
> > parse SO/SOFA from local files (in deprecated format, admittedly, but this
> > isn't my doing) first.  That fills the Bio::Ontology cache so network
> > queries don't happen.
> 
> Even once you check in your changes so this is no longer the default
> behaviour, I still strongly believe that this should be moved out of the
> parser altogether.
> 
> > -Allen
> >
> 

