[BioRuby] Porting PhyloXML to Nokogiri, maybe repackaging it

Clayton Wheeler cswh at umich.edu
Tue May 15 13:08:20 EDT 2012


Hi all,

The PhyloXML unit tests are failing under JRuby because the libxml-jruby gem (an implementation of the libxml API using native Java XML libraries) does not support the full API of libxml-ruby. My first approach was simply to use the native libxml-ruby gem and its C extension, which works with JRuby in 1.8 mode. However, it doesn't work in 1.9 mode due to a Unicode issue, and the JRuby developers indicate that the C extension API (as opposed to FFI, I suppose) isn't likely to be supported further in 1.9 mode (see http://bit.ly/JGWC4K).

There was a discussion of the PhyloXML parser on the mailing list a couple of months ago (http://bit.ly/JFX8Qf), and Naohisa indicated that it might be rewritten to use Nokogiri at some point soon, since Nokogiri is now the de facto standard XML parser. Following that lead, I've gone ahead and ported the PhyloXML parser to use Nokogiri; it only took an hour or two, and the unit tests are passing. My branch for this is at https://github.com/csw/bioruby/tree/phyloxml-nokogiri. If this seems like a good approach, I can port the writer as well.

However, Pjotr suggested that it might make sense to split PhyloXML out into a separate gem. This should be straightforward enough, since no other BioRuby components appear to call PhyloXML. It would mean that any PhyloXML users would need to install a separate gem. On the other hand, it would remove a dependency on libxml2 for core BioRuby on MRI. Thoughts? Should I proceed with this approach?
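
To make the separate-gem option concrete, here is a rough sketch of what its gemspec might contain. The gem name, version, and file layout are assumptions for illustration, not anything decided on the list; the point is only that Nokogiri would become a dependency of the new gem rather than of core BioRuby.

```ruby
require 'rubygems'

# Hypothetical gemspec for a standalone PhyloXML gem. The name, version,
# and author placeholder are assumptions made for this sketch.
PHYLOXML_SPEC = Gem::Specification.new do |s|
  s.name    = 'bio-phyloxml'
  s.version = '0.0.1'
  s.summary = 'PhyloXML parser and writer for BioRuby (sketch)'
  s.authors = ['(placeholder)']
  s.files   = Dir['lib/**/*.rb']

  # Core BioRuby supplies the tree classes; Nokogiri replaces libxml-ruby.
  s.add_runtime_dependency 'bio'
  s.add_runtime_dependency 'nokogiri'
end

puts PHYLOXML_SPEC.runtime_dependencies.map(&:name).sort.join(', ')
```

With a layout like this, PhyloXML users would install the extra gem and require it explicitly, while core BioRuby would no longer need an XML library for this feature.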

Clayton Wheeler
cswh at umich.edu



