[Biopython-dev] [Bug 2475] BioSQL.Loader should reuse existing taxon entries in lineage

bugzilla-daemon at portal.open-bio.org bugzilla-daemon at portal.open-bio.org
Wed Apr 2 10:56:24 UTC 2008


http://bugzilla.open-bio.org/show_bug.cgi?id=2475





------- Comment #11 from ericgibert at yahoo.fr  2008-04-02 06:56 EST -------
OK to have the code in Loader.py.
When a SeqRecord is to be added, we create a Taxonomy instance. I will add a
function that returns a copy of the _NCBI_lineage list.
Then, from "top" to "bottom", we check whether each taxon exists and add it if
it does not, down to the species itself (where we make sure that
parent_taxon_id is correctly populated).
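
As a rough illustration of the top-to-bottom walk described above, here is a
minimal sketch using sqlite3 and a simplified taxon table. The function name
ensure_lineage, the (ncbi_taxon_id, node_rank) tuple layout of the lineage, and
the reduced schema are assumptions made for this example; this is not the
actual Loader.py code or the full BioSQL schema.

```python
import sqlite3

# Simplified stand-in for the BioSQL taxon table (assumption: only the
# columns needed for this sketch are included).
SCHEMA = """
CREATE TABLE taxon (
    taxon_id        INTEGER PRIMARY KEY,
    ncbi_taxon_id   INTEGER UNIQUE,
    parent_taxon_id INTEGER,
    node_rank       TEXT
)
"""


def ensure_lineage(conn, lineage):
    """Walk a lineage from the top-most taxon down to the species.

    lineage is a list of (ncbi_taxon_id, node_rank) tuples ordered from
    "top" to "bottom" (hypothetical layout). Existing taxon rows are
    reused; missing ones are inserted with parent_taxon_id set to the
    previous entry. Returns the taxon_id of the last entry, or None for
    an empty lineage.
    """
    cur = conn.cursor()
    parent_id = None
    for ncbi_id, rank in lineage:
        cur.execute(
            "SELECT taxon_id, parent_taxon_id FROM taxon "
            "WHERE ncbi_taxon_id = ?",
            (ncbi_id,),
        )
        row = cur.fetchone()
        if row is None:
            # Taxon not in the database yet: add it under its parent.
            cur.execute(
                "INSERT INTO taxon (ncbi_taxon_id, parent_taxon_id, node_rank) "
                "VALUES (?, ?, ?)",
                (ncbi_id, parent_id, rank),
            )
            taxon_id = cur.lastrowid
        else:
            taxon_id, existing_parent = row
            # Reuse the entry; fill in a missing parent link if we know it.
            if existing_parent is None and parent_id is not None:
                cur.execute(
                    "UPDATE taxon SET parent_taxon_id = ? WHERE taxon_id = ?",
                    (parent_id, taxon_id),
                )
        parent_id = taxon_id
    return parent_id
```

Loading two species that share ancestors (for example two primates) should
then create the shared ancestor rows only once.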

By default, we assume that taxon_id == NCBI_taxon_id. If this is not the case,
do I raise an error, or fall back to "plan B" and let the database auto-assign
the taxon_id?
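
To make the "plan B" option concrete, here is a small sketch of what such a
fallback could look like. It assumes sqlite3, a reduced taxon table, and a
hypothetical helper named insert_taxon; whether Loader.py should behave this
way or raise an error instead is exactly the open question.

```python
import sqlite3

# Reduced taxon table for illustration only (assumption, not the full
# BioSQL schema).
SCHEMA = """
CREATE TABLE taxon (
    taxon_id        INTEGER PRIMARY KEY,
    ncbi_taxon_id   INTEGER UNIQUE,
    parent_taxon_id INTEGER
)
"""


def insert_taxon(conn, ncbi_taxon_id, parent_taxon_id=None):
    """Insert a taxon, preferring taxon_id == ncbi_taxon_id.

    Assumes the caller has already checked that no row with this
    ncbi_taxon_id exists. If the preferred taxon_id is already taken by
    another row, fall back to letting the database assign one.
    Returns the taxon_id actually used.
    """
    cur = conn.cursor()
    try:
        # Default: reuse the NCBI identifier as the primary key.
        cur.execute(
            "INSERT INTO taxon (taxon_id, ncbi_taxon_id, parent_taxon_id) "
            "VALUES (?, ?, ?)",
            (ncbi_taxon_id, ncbi_taxon_id, parent_taxon_id),
        )
    except sqlite3.IntegrityError:
        # "Plan B": the key is in use, so let the database pick taxon_id.
        cur.execute(
            "INSERT INTO taxon (ncbi_taxon_id, parent_taxon_id) VALUES (?, ?)",
            (ncbi_taxon_id, parent_taxon_id),
        )
    return cur.lastrowid
```

The alternative, raising an error, would simply mean not catching the
exception and letting it propagate to the caller.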

One missing point: the left and right values. Do you know what to do with them?
I have run the Perl script on a test database and plan to look into the created
records to clarify it... but you can save me the effort if you already know the
logic.
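
For reference, left and right columns on a tree table usually follow the
standard nested-set numbering: a depth-first walk assigns the left value when a
node is entered and the right value when it is left. The sketch below shows
that numbering on a plain dictionary. It is an assumption that the Perl script
uses this scheme, and the function name nested_set_values and the ordering of
siblings by taxon_id are choices made only for this example.

```python
def nested_set_values(parent_of):
    """Compute nested-set (left, right) values for a taxonomy tree.

    parent_of maps taxon_id -> parent taxon_id, with None for a root.
    Returns a dict mapping taxon_id -> (left_value, right_value).
    """
    children = {}
    roots = []
    for node, parent in parent_of.items():
        if parent is None:
            roots.append(node)
        else:
            children.setdefault(parent, []).append(node)

    values = {}
    counter = 0

    def visit(node):
        nonlocal counter
        counter += 1
        left = counter  # assigned when the node is entered
        for child in sorted(children.get(node, [])):
            visit(child)
        counter += 1
        values[node] = (left, counter)  # right assigned when leaving

    for root in sorted(roots):
        visit(root)
    return values


def is_descendant(values, node, ancestor):
    """True if node lies strictly inside ancestor's (left, right) interval."""
    left, right = values[node]
    a_left, a_right = values[ancestor]
    return a_left < left and right < a_right
```

With this numbering, all descendants of a taxon can be found with a single
range comparison on the left and right values instead of a recursive walk up
parent_taxon_id.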

PS: because my original script only updated the partial records created by the
previous Loader algorithm, I need to rewrite it. Maybe 2 man-days.




