[Biopython-dev] [Bug 1885] KEGG Compound db format changes

Tue Nov 1 16:31:21 EST 2005

http://bugzilla.open-bio.org/show_bug.cgi?id=1885

------- Comment #4 from edmonds at fas.harvard.edu  2005-11-01 16:31 -------
(In reply to comment #3)
> How did you download the new test cases for KEGG compound? Are the existing
> test cases in Tests/KEGG no longer valid? The submitted patch causes
> test_KEGG.py to fail, but I'm not sure if that is due to a bug in the patch or
> whether the existing test cases don't satisfy the current KEGG standard.
> 

The entire KEGG database can be downloaded at
http://www.genome.ad.jp/kegg/kegg5.html , so I took some test cases from there. 

The existing test cases differ from the current entry format in two ways:

In the past, the entry line used to have only the compound ID.  Now the group
the ligand belongs to is also named.  So on the right side of that line, it now
says "Compound" or "Drug" or "Glycan", ...  All the entries in the database
have that now, so I don't think it makes sense to make it optional just to
accommodate the old test cases.  
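
As a rough illustration of the stricter entry line, here is a minimal Python
sketch. It is not the code from the patch: parse_entry_line is a hypothetical
helper, and it assumes the line is whitespace-separated as "ENTRY", then the
ID, then the group name.

```python
def parse_entry_line(line):
    """Split an ENTRY line into (entry_id, group).

    Hypothetical helper, not part of Bio.KEGG. Assumes the layout
    'ENTRY <id> <group>' with fields separated by whitespace.
    """
    fields = line.split()
    if not fields or fields[0] != "ENTRY":
        raise ValueError("not an ENTRY line: %r" % line)
    if len(fields) < 3:
        # Old-style line with only the ID: rejected, since the group
        # is treated as mandatory rather than optional.
        raise ValueError("ENTRY line has no group name: %r" % line)
    entry_id = fields[1]
    group = " ".join(fields[2:])
    return entry_id, group
```

With this sketch, an old-style line that carries only the ID raises
ValueError instead of being silently accepted.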

In the past, the formula line could come right after the name block or
somewhere at the end of the entry.  Now all formula lines come right after the
name block.  
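
The ordering rule can be checked mechanically. The sketch below is only an
illustration under stated assumptions: section keywords start in the first
column, continuation lines start with whitespace, and both function names
are hypothetical rather than taken from Bio.KEGG.

```python
def section_keywords(lines):
    """Return the section keywords of one entry, in order of appearance.

    Assumes a keyword line starts in the first column and that
    continuation lines begin with whitespace.
    """
    keywords = []
    for line in lines:
        if line and not line[0].isspace():
            keywords.append(line.split()[0])
    return keywords


def formula_follows_name(lines):
    """True if FORMULA is the first section after the NAME block."""
    keys = section_keywords(lines)
    if "NAME" not in keys:
        return False
    i = keys.index("NAME")
    return i + 1 < len(keys) and keys[i + 1] == "FORMULA"
```

Running formula_follows_name over the lines of each entry in a test file
would flag any entry that still uses the old layout.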

Once compound.sample and compound.irregular are updated to match these two
features, test_KEGG.py no longer fails.  

As an aside, neither my patch nor the original code works for the glycan or
reaction parts of the ligand database, and I suspect that neither works
properly for the enzyme part of the database either.  
