[Biopython-dev] [Bug 2475] BioSQL.Loader should reuse existing taxon entries in lineage

Mon Apr 7 12:59:31 UTC 2008

http://bugzilla.open-bio.org/show_bug.cgi?id=2475

------- Comment #18 from biopython-bugzilla at maubp.freeserve.co.uk  2008-04-07 08:59 EST -------
Michiel,

It seems that Bio.Entrez.efetch can return XML files containing one record or
many records, e.g.

import Bio.Entrez

taxon_id_list = ['488050', '447868', '333459', '126256']
taxon_handle = Bio.Entrez.efetch(db="taxonomy", id=taxon_id_list,
                                 retmode="XML")
# This handle contains four Taxon entries

taxon_handle = Bio.Entrez.efetch(db="taxonomy", id='488050', retmode="XML")
# This handle contains one Taxon entry

Bio.Entrez.read(taxon_handle) will return a list of dictionaries (one for each
taxon ID supplied).  We've established a convention of sorts about "read()"
versus "parse()": the first returns a single record and the second returns an
iterator over records.
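
To make the convention concrete, here is a rough sketch of how a read()/parse()
pair could behave on taxonomy-style XML.  This is not the Bio.Entrez code: it
uses only the standard library, both functions are hypothetical stand-ins, and
SAMPLE_XML is a hand-written, heavily simplified placeholder rather than real
efetch output.

```python
import io
import xml.etree.ElementTree as ET

# Hand-written, simplified placeholder (NOT real efetch output).
SAMPLE_XML = b"""<TaxaSet>
<Taxon><TaxId>488050</TaxId><ScientificName>placeholder A</ScientificName></Taxon>
<Taxon><TaxId>447868</TaxId><ScientificName>placeholder B</ScientificName></Taxon>
</TaxaSet>"""


def parse(handle):
    """Hypothetical parse(): yield one dictionary per <Taxon> element."""
    for _event, elem in ET.iterparse(handle, events=("end",)):
        if elem.tag == "Taxon":
            # Copy the child values out before releasing the element.
            record = {child.tag: child.text for child in elem}
            elem.clear()
            yield record


def read(handle):
    """Hypothetical read(): return exactly one record, else raise."""
    records = parse(handle)
    try:
        first = next(records)
    except StopIteration:
        raise ValueError("No records found in handle") from None
    if next(records, None) is not None:
        raise ValueError("More than one record found in handle")
    return first


# parse() copes with any number of records.
for record in parse(io.BytesIO(SAMPLE_XML)):
    print(record["TaxId"], record["ScientificName"])
```

Under this convention, calling read() on a handle holding several Taxon entries
would be an error, which is the mismatch with the current Bio.Entrez.read()
behaviour described above.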

If a single taxon entry (currently held as a dictionary) is regarded as a
record, then should Bio.Entrez.read() be called Bio.Entrez.parse() instead?  I
am also wondering if we should create simple record classes for the different
XML data types (instead of using dictionaries).
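
As an illustration of the record class idea only, a minimal sketch might look
like the following.  The class name TaxonRecord, its attributes, and the
dictionary keys "TaxId" and "ScientificName" are assumptions made for this
example; nothing here exists in Biopython.

```python
from dataclasses import dataclass


@dataclass
class TaxonRecord:
    """Hypothetical record class holding two fields of a taxon entry."""

    taxid: str
    scientific_name: str

    @classmethod
    def from_dict(cls, data):
        # Assumes the parsed dictionary uses these two keys.
        return cls(taxid=data["TaxId"],
                   scientific_name=data["ScientificName"])


# Placeholder dictionary standing in for one parsed taxon entry.
entry = {"TaxId": "488050", "ScientificName": "placeholder name"}
record = TaxonRecord.from_dict(entry)
print(record.taxid, record.scientific_name)
```

Compared with a plain dictionary, attribute access like record.taxid fails
loudly on a misspelt field name, at the cost of maintaining one class per XML
data type.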
