[Biopython-dev] [Bug 2475] BioSQL.Loader should reuse existing taxon entries in lineage

bugzilla-daemon at portal.open-bio.org bugzilla-daemon at portal.open-bio.org
Wed Apr 2 05:45:14 EDT 2008


http://bugzilla.open-bio.org/show_bug.cgi?id=2475





------- Comment #10 from biopython-bugzilla at maubp.freeserve.co.uk  2008-04-02 05:45 EST -------
For the organisation of the code, what I had in mind was a general purpose XML
parser in Bio.Entrez.Taxonomy (with nothing to do with BioSQL), which would be
called from an updated BioSQL.Loader to parse a handle to the XML data fetched
using Bio.Entrez.efetch().
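
As a rough illustration of such a parser (not the proposed Bio.Entrez.Taxonomy
code itself), the sketch below uses only the standard library.  The function
name parse_taxonomy, the returned dict keys, and the sample XML are all
assumptions made for illustration; the element names (TaxaSet, Taxon, TaxId,
ScientificName, ParentTaxId, Rank) follow what NCBI appears to return for
db="taxonomy", and should be checked against real efetch output.

```python
import io
import xml.etree.ElementTree as ET


def parse_taxonomy(handle):
    """Yield one dict per top-level <Taxon> element read from handle.

    Hypothetical helper: only direct children of the root are visited, so
    any <Taxon> elements nested deeper (e.g. inside a lineage listing) are
    deliberately ignored.
    """
    root = ET.parse(handle).getroot()
    for taxon in root.findall("Taxon"):
        parent = taxon.findtext("ParentTaxId")
        yield {
            "taxon_id": int(taxon.findtext("TaxId")),
            "name": taxon.findtext("ScientificName"),
            "rank": taxon.findtext("Rank"),
            "parent_id": int(parent) if parent else None,
        }


# Hand-written sample standing in for the data a handle from
# Bio.Entrez.efetch() would supply (assumed layout, not captured output).
SAMPLE = """<TaxaSet>
  <Taxon>
    <TaxId>9606</TaxId>
    <ScientificName>Homo sapiens</ScientificName>
    <ParentTaxId>9605</ParentTaxId>
    <Rank>species</Rank>
  </Taxon>
</TaxaSet>"""

records = list(parse_taxonomy(io.StringIO(SAMPLE)))
```

Because the function only needs a file-like handle, BioSQL.Loader could pass
it whatever Bio.Entrez.efetch() returns without the parser knowing anything
about BioSQL.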

When adding a new SeqRecord to the BioSQL database, we would start with its
NCBI taxon ID and, assuming it's not already in the database, go online to find
the parent taxon ID, repeating until we match the ID of an existing taxon
record in the database (or reach the root node).  We would then add all the new
taxon records to the database.  [I hope this is roughly the process you had in
mind, Eric.]
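
The walk up the lineage could be sketched roughly as follows.  This is only a
sketch under stated assumptions: collect_new_taxa, known_ids and fetch_parent
are hypothetical names, fetch_parent stands in for the online lookup and is
assumed to return None for the root node, and the taxon IDs in the example are
made up.

```python
def collect_new_taxa(taxon_id, known_ids, fetch_parent):
    """Return (taxon_id, parent_id) pairs missing from the database.

    known_ids    -- taxon IDs already stored (hypothetical input)
    fetch_parent -- callable mapping a taxon ID to its parent's ID, or
                    None at the root (stand-in for the online lookup)

    The result is ordered root-most first, so each parent can be inserted
    before its children.
    """
    new_taxa = []
    current = taxon_id
    # Climb until we hit a taxon we already have, or run off the root.
    while current is not None and current not in known_ids:
        parent = fetch_parent(current)
        new_taxa.append((current, parent))
        current = parent
    new_taxa.reverse()
    return new_taxa


# Toy lineage with made-up IDs: 40 -> 30 -> 20 -> 10 (root).
PARENTS = {40: 30, 30: 20, 20: 10, 10: None}

# Taxa 10 and 20 are already in the database, so only 30 and 40 are new.
partial = collect_new_taxa(40, {10, 20}, PARENTS.get)

# Empty database: the walk continues all the way to the root.
full = collect_new_taxa(40, set(), PARENTS.get)
```

Collecting the missing entries first and inserting them afterwards keeps the
online lookups separate from the database writes.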

