[Bioperl-l] Bio::*Taxonomy* changes
Sendu Bala
bix at sendu.me.uk
Mon Jul 24 18:15:31 EDT 2006
Chris Fields wrote:
>
> Also, I'm trying to follow the original idea as proposed by Jason (this is
> from perldoc Bio::Taxonomy::Node):
>
> Which, to me, indicated that this would eventually replace Bio::Species
Well, we don't really know that Jason didn't later change his mind, but
in any case it no longer makes sense now that we have Bio::Taxonomy.
In a direct reply to me you point out specific passages in the current
docs that explain why you have thought we should delegate or replace
Bio::Species with Bio::Taxonomy::Node. With respect, the old plans are
not something we are forced to blindly follow. We decide for ourselves
if they make sense, we decide for ourselves if there is a better way of
doing it, and then we do it the best way.
So if you ignore what those old bits of documentation say, just pretend
you never ever read them, would my proposals make sense or not? Since
those old proposals were never implemented we have no reason to try and
stick with them if there is a better proposal.
And for the record, '...Bio::Species which is able to represent only
species-level' can (correctly) be interpreted as 'Bio::Species is only
supposed to be used for representing a taxonomy that includes the
species-level'. You can't interpret it literally, because Bio::Species is
used for levels below species and also represents all the levels above
species-level. Either Jason got it wrong when he wrote that, or
you have misinterpreted it.
Likewise, let's play the interpretation game again: 'Previously all
information was managed by a single object called Bio::Species. [the
Bio::Taxonomy::Node] implementation allows representation of the
intermediate nodes not just the species nodes'. Note the contrast between
'single object' and the implied multiple Node objects doing the same
job. I imagine that at the time Jason wrote that there was no
Bio::Taxonomy, no holder for multiple Nodes.
> I had originally wanted to start delegating everything over to
> Taxonomy::Node about a month ago, when I found that it was remarkably easy
> to do so. However, when Sendu proposed making changes to remove methods in
> Bio::Taxonomy::Node and make sweeping changes to Taxonomy which would
> prevent an easy transition over to Node,
But they would allow an equally easy transition to Bio::Taxonomy
instead. I don't know why you would care about the name of the class we
switch to. My concern is that when the switch is made it makes sense.
> If we think it would be better to completely toss all this out the window
> and use only a bare-bones Node, then I'm fine with that. But if we go that
> route we should just get rid of the Bio::Species 'disease' completely and
> have things be much simpler. Simple is good!
>
> I think Node can still act as a viable container class for the tax data from
> a GenBank file (its original purpose) as long as it has the very basic
> methods for doing so. That would require:
>
> scientific_name() - ORGANISM line data
> common_names() - which could hold common names (in parentheses on the SOURCE
> line) and the abbreviated name (from the SOURCE line)
> ncbi_taxid() - from the 'source' seqfeature (already there).
>
> The lineage information and organelle information could be stored in Node or
> in SimpleValue objects. My vote is for the latter as there's no need for a
> classification() container for Node, which you have repeatedly pointed out.
No, this is the whole point. The lineage information can NOT be stored
in a Node (unless you abuse Node by having all those crufty methods
like genus() and classification()), and why would we store it in
SimpleValue objects when we have Bio::Taxonomy?
Bio::Taxonomy is completely perfect for storing the taxonomic
information from a GenBank file. That's all you need to worry about. Can
we represent the data correctly? Yes. Do we gain all the good things
about a pure Bio::Taxonomy? Yes. Can we still do everything we used to
be able to do? Yes.
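
To make the Node-versus-container distinction concrete, here is a minimal,
language-neutral sketch (written in Python, not BioPerl code). It assumes a
GenBank-style lineage string and shows each Node describing exactly one
taxon while a separate holder keeps the whole chain. The names `Node`,
`Taxonomy`, `from_lineage`, `leaf` and `lineage_names`, and the sample
record, are hypothetical illustrations and not the Bio::Taxonomy API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One taxon only: no genus() or classification() here."""
    scientific_name: str
    common_names: List[str] = field(default_factory=list)
    ncbi_taxid: Optional[int] = None


@dataclass
class Taxonomy:
    """Ordered holder for Nodes, root first, most specific taxon last."""
    nodes: List[Node] = field(default_factory=list)

    @classmethod
    def from_lineage(cls, lineage: str, organism: str,
                     common_names: Optional[List[str]] = None,
                     taxid: Optional[int] = None) -> "Taxonomy":
        # Assumed input: semicolon-separated lineage ending in a period.
        names = [n.strip() for n in lineage.rstrip(".").split(";") if n.strip()]
        nodes = [Node(name) for name in names]
        # The organism name becomes the final, most specific node.
        nodes.append(Node(organism, list(common_names or []), taxid))
        return cls(nodes)

    def leaf(self) -> Node:
        return self.nodes[-1]

    def lineage_names(self) -> List[str]:
        return [n.scientific_name for n in self.nodes]


tax = Taxonomy.from_lineage(
    "Eukaryota; Metazoa; Chordata; Mammalia; Primates; Hominidae; Homo.",
    organism="Homo sapiens", common_names=["human"], taxid=9606)
print(tax.leaf().scientific_name)
print(tax.lineage_names())
```

In this sketch the lineage lives in the container, so no Node needs to know
anything about its ancestors.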
> I think we should just get rid of Bio::Species completely.
There's no need to get rid of Bio::Species. It can be a Bio::Taxonomy
with backward-compatible methods. No harm done, all good.
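
One way to picture "a Bio::Taxonomy with backward-compatible methods" is a
subclass whose old-style accessors are only derived views over the stored
lineage. The following is a rough sketch in Python rather than BioPerl code:
the `Taxonomy` stand-in, the `name_at_rank` helper, the rank labels and the
most-specific-first ordering returned by `classification()` are all
assumptions made for illustration.

```python
from typing import List, Optional, Tuple


class Taxonomy:
    """Minimal stand-in: an ordered list of (rank, name) pairs, root first."""

    def __init__(self, lineage: List[Tuple[str, str]]):
        self.lineage = list(lineage)

    def name_at_rank(self, rank: str) -> Optional[str]:
        # Return the first name stored at the requested rank, if any.
        for node_rank, name in self.lineage:
            if node_rank == rank:
                return name
        return None


class Species(Taxonomy):
    """Same data as Taxonomy; old-style accessors are derived views."""

    def genus(self) -> Optional[str]:
        return self.name_at_rank("genus")

    def species(self) -> Optional[str]:
        return self.name_at_rank("species")

    def classification(self) -> List[str]:
        # Assumed legacy order: most specific taxon first, root last.
        return [name for _rank, name in reversed(self.lineage)]


human = Species([
    ("superkingdom", "Eukaryota"),
    ("kingdom", "Metazoa"),
    ("genus", "Homo"),
    ("species", "Homo sapiens"),
])
print(human.genus())
print(human.classification())
```

Because `Species` stores nothing beyond what `Taxonomy` stores, existing
callers of the old methods keep working while new code can treat the object
as a plain taxonomy.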
I'll tell you what. This will be easier if I just write the code for my
proposals, including whatever changes would be needed in
Bio::SeqIO::genbank et al. You'll see how easy and appropriate it is,
and hopefully everyone will be happy.
Perhaps you could just hold off doing any similar-but-contradictory work
until then.