[BioRuby] GSOC: phyloXML for BioRuby: Code Review
Christian M Zmasek
czmasek at burnham.org
Tue Jun 23 19:29:42 UTC 2009
Hi, Diana:
Diana Jaunzeikare wrote:
> Hi all,
>
> In the Google Summer of Code project I have reached a stage where most
> of the code has been written for PhyloXML parser and I would like to
> ask for code review.
>
> I would like to know answers to these questions:
>
> * What parts should have more documentation?
Node might benefit from a more detailed description of all its (sub-)
elements.
Also, you don't always use the "rdoc" format.
Some documentation sare hard to read (such as the one for Sequence),
simply because of the way the text is formatted.
In general, I would point out the dependency on libxml2 more prominently.
>
> * Are there any places where code could be made more rubyish?
Maybe core-BioRuby developers can give an answer for this one. Looks
like Ruby to me, but I started off programing with C++ and Java -- so, I
might be biased ;)
>
> * Are the structure of unit tests fine, or there are some conventions
> which my code doesn't follow?
I think it would be best to add more tests for "marginal"/error cases
(for parsing).
Listed in increasing severity:
Are empty elements handled properly (e.g. <taxonomy></taxonomy>)?
What about new-lines, tabs, non-printable ascii characters in place
where text is expected? Trailing and leading whitespaces? Does this get
trimmed of?
Valid XML documents violating phyloXML specs?
Invalid XML?
All these should be handled gracefully.
>
> * Is code readable?
Yes.
> * Are there any conventions that I don't follow? (like lines should
> strictly fit into 80 columns)?
>
> Any comments would be appreciated.
>
> Code is available on github
> http://github.com/latvianlinuxgirl/bioruby/tree/dev in
> *lib/bio/db/phyloxml.rb* and *test/unit/bio/db/test_phyloxml.rb* files.
>
>
> Diana
>
Furthermore, it might be a good time to start testing your
parser/objects against really large files.
This might help to uncover potential hidden problems.
Obviously, you could not add such large files to BioRuby's test files.
But it would still be nice to know how your parser and objects scale....
Also, I am not sure if it's such a great idea to have all your classes
in the same file/directory (i.e. both parser _and_ data objects).
Right now, if the libxml2 gem is not install the test for the whole of
bioruby exits.
Christian
More information about the BioRuby
mailing list