[Biopython-dev] pypaml

Peter Cock p.j.a.cock at googlemail.com
Fri Jan 14 07:28:30 EST 2011


On Fri, Jan 14, 2011 at 10:56 AM, Brandon Invergo <b.invergo at gmail.com> wrote:
> Hi everyone,
> New subscriber here, and hopefully a new contributor as well!
>

Hi Brandon,

Welcome to the list. By the way, apologies for mixing you up with the
PAML author Ziheng Yang last year (I misread the pypaml webpage):
http://lists.open-bio.org/pipermail/biopython/2010-September/006747.html

> I have written a Python interface to the CODEML program of the PAML
> package (http://abacus.gene.ucl.ac.uk/software/paml.html), with the
> intention of eventually covering all of the programs in the package.
> You can find my package here:
> http://code.google.com/p/pypaml/
>
> I recently ran across a discussion that occurred on the main Biopython
> list regarding my interface
> (http://lists.open-bio.org/pipermail/biopython/2010-September/006743.html)
> and I realized that perhaps it would be better if I integrated it into
> Biopython. I know that it's something many people would be interested
> in. I am very enthusiastic to continue this project and to do whatever
> I need to do to facilitate the integration.

That is great news :)

> Some immediate tasks that need to be done are:
> - change the licensing: currently it's GPL, as described in the code
> and on the project page. Is it sufficient to simply remove its
> dedicated project page and change the verbiage in the code?

Assuming you wrote all the code (or have your co-authors' agreement),
then yes, you can just change the licence. If you want to you can
update the code in your repository and website, maybe make a new
release while you are at it. Alternatively, you could just leave the
standalone pypaml code as it is (under the GPL), but base your
Biopython contributions on it (under the Biopython MIT/BSD licence).

I would suggest that you don't make API changes to standalone
pypaml, so as not to disrupt your existing users. However some of
the work like Python 2.5 support might be worth doing there (before
looking at Biopython integration). As a bonus, that should also mean
you can use pypaml under Jython (Python on the JVM).

> - check coding standards as described in the Contributing to Biopython wiki
> - make some changes to be compatible with Python 2.5: I use @property
> and @x.setter decorator tags which are only 2.6+. I think that's the
> only incompatibility

If so that doesn't sound too hard to update.
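
As a rough illustration of the kind of change involved (the class and
attribute names below are made up for the example, not taken from pypaml),
the decorator form can be replaced with a plain call to the built-in
property(), which is available on Python 2.5:

```python
class Codeml(object):
    """Hypothetical stand-in class; the names are illustrative only."""

    def __init__(self):
        self._working_dir = None

    def _get_working_dir(self):
        return self._working_dir

    def _set_working_dir(self, path):
        # Every assignment goes through here, so tidy the value in one place.
        self._working_dir = path.rstrip("/")

    # property(fget, fset) works on Python 2.5;
    # the @working_dir.setter decorator form needs 2.6 or later.
    working_dir = property(_get_working_dir, _set_working_dir)


cml = Codeml()
cml.working_dir = "/tmp/paml_run/"
print(cml.working_dir)
```

Callers still read and assign cml.working_dir exactly as before, so the
public API would not need to change.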

> - double-check the CODEML output parsing for many PAML versions; the
> output is notoriously non-standard from release to release. I may have
> to build some version-checking into the parser. I wrote it based on
> the output of PAML 4.3

From Chris Fields' comments last year, that may be a lot of work for
relatively little gain. I don't use PAML and have no idea what versions
are typically used though.
http://lists.open-bio.org/pipermail/biopython/2010-September/006760.html
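
If you do go down the version-checking route, one possible shape is to
detect the version once and dispatch to a version-specific parser. This
is only a sketch: it assumes the output mentions a version number near
the top, and the regular expression, function names, and sample header
line are all hypothetical rather than taken from real CODEML output.

```python
import re

# ASSUMPTION: the output mentions something like "version 4.3" near the top.
# The exact wording of real CODEML output may differ.
_VERSION_RE = re.compile(r"version\s+(\d+)\.(\d+)", re.IGNORECASE)


def detect_version(lines, max_lines=20):
    """Return (major, minor) from the first matching line, or None."""
    for line in lines[:max_lines]:
        match = _VERSION_RE.search(line)
        if match:
            return (int(match.group(1)), int(match.group(2)))
    return None


def _parse_4_3(lines):
    # Placeholder for a parser written against PAML 4.3 output.
    return {"parser": "4.3"}


def _parse_generic(lines):
    # Placeholder fallback for versions without a dedicated parser.
    return {"parser": "generic"}


# Map known versions to their parsers; anything else uses the fallback.
PARSERS = {(4, 3): _parse_4_3}


def parse(lines):
    version = detect_version(lines)
    result = PARSERS.get(version, _parse_generic)(lines)
    result["version"] = version
    return result


# Made-up header line, used only to exercise the dispatch.
sample = ["some program header, version 4.3", "rest of the output"]
print(parse(sample))
```

Keeping the fallback parser means an unrecognised version degrades to a
best effort rather than failing outright.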

> - build some unit tests (I'm new to this in Python so I need to learn
> a bit about that)

We've tried to cover the basics in a chapter in our tutorial,
http://biopython.org/DIST/docs/tutorial/Tutorial.html
http://biopython.org/DIST/docs/tutorial/Tutorial.pdf
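
To give a flavour of what a test looks like with the standard library
unittest module, here is a minimal sketch. The parse_option helper is
invented for the example (it splits a "key = value" line); it is not
part of pypaml or Biopython.

```python
import unittest


def parse_option(line):
    """Hypothetical helper: split a 'key = value' line into a tuple."""
    key, sep, value = line.partition("=")
    if not sep:
        raise ValueError("no '=' in line: %r" % line)
    return key.strip(), value.strip()


class ParseOptionTests(unittest.TestCase):

    def test_simple_line(self):
        self.assertEqual(parse_option("seqfile = alignment.phy"),
                         ("seqfile", "alignment.phy"))

    def test_missing_equals_sign(self):
        self.assertRaises(ValueError, parse_option, "no equals sign here")


# Build and run the suite explicitly so the script does not exit afterwards.
suite = unittest.TestLoader().loadTestsFromTestCase(ParseOptionTests)
result = unittest.TextTestRunner(verbosity=2).run(suite)
```

Each test method checks one behaviour, including the failure case, which
is usually the part that breaks first when the input format shifts.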

> - perhaps making it fit with any other structural standards in the
> Biopython library?
>
> I've tried from the start to make it very generalized so I don't think
> any major changes need to be made. Plus, I think structurally it
> should be easy to implement the other PAML programs by copying a lot
> of the code. The output parsing for each program is a different story,
> though.

Does that mean you have wrappers for calling the PAML command
line tools? Can you point me at the code for that - I'd like a quick
look to see if it makes sense to switch over to the Bio.Application
based system we're trying to standardise on in Biopython. On the
other hand, if you have a much higher level wrapper maybe it is
fine as it is (e.g. the Bio.PopGen wrappers follow their own route,
although they use Bio.Application for the low level API inside).
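
For reference, by a low level wrapper I mean something along these
lines: write the control file, then build the command line for the
tool. This sketch uses only the standard library rather than
Bio.Application, the function names are hypothetical, and the file
names are placeholders. It assumes the control file is a list of
"key = value" lines and that codeml takes the control file path as its
argument.

```python
import os
import tempfile


def write_control_file(path, options):
    """Write options as 'key = value' lines, sorted for stable output."""
    handle = open(path, "w")
    try:
        for key in sorted(options):
            handle.write("%s = %s\n" % (key, options[key]))
    finally:
        handle.close()


def build_command(ctl_path, executable="codeml"):
    """Return the argument list; nothing is executed here."""
    return [executable, ctl_path]


work_dir = tempfile.mkdtemp()
ctl_path = os.path.join(work_dir, "codeml.ctl")
write_control_file(ctl_path, {
    "seqfile": "alignment.phy",   # placeholder file names
    "treefile": "species.tree",
    "outfile": "results.out",
})
command = build_command(ctl_path)
print(command)
# To actually run it (needs PAML installed):
#     import subprocess; subprocess.call(command, cwd=work_dir)
```

A higher level wrapper would sit on top of this and hand back parsed
results instead of a command line.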

> So, as I understand it, I should file an enhancement bug over at the
> Bugzilla site.

That would be useful to give us a reference number for tracking it.
A lot of your email would make a good introduction to the issue to
put in the comment.

> In the meantime I can start working on some of the
> points listed above. I also need to refresh my memory of using git
> since I've gotten into the dirty habit of using svn (assuming this is
> all approved)! Is there anything else I need to do for now?

Doing your work on a github fork of the Biopython repository
would be great (although you may want to start with adding unit
tests or doing Python 2.5 changes within standalone pypaml).

Peter.

