[BioPython] Questions & suggestions

Brad Chapman chapmanb at uga.edu
Sun Mar 21 12:46:05 EST 2004


Hey all;
Great discussion on things. I'll try to touch on all the points in
one e-mail; apologies for the length.

[Automated documentation generation]
Thomas:
> What are you thinking of using in the future? I must admit that the HappyDoc 
> requirements for generating good and readable descriptions are a bit of a 
> mystery to me.... Bio.PDB looks very ugly (my fault, probably). I'd sure like 
> to improve that (especially since Bio.PDB is actually pretty well commented 
> code).

Yes, I've not been a fan of HappyDoc for a while. I was pointed to,
and really like, epydoc. Please take a look at:

http://biopython.org/docs/api/private/trees.html

and let me know what you think. I am a big fan of the new output, and we
can just pull the documentation text straight out of modules, classes and
functions without having to format it in some special way.
I made a number of small modifications to the docs to get them to 
look nicer under this system.

I'd like to stick with epydoc unless people have objections. I added
some documentation to the end of the contributing guidelines to
describe the simple things you can do to make your modules, classes
and functions as useful as possible with epydoc:

http://biopython.org/docs/developer/contrib.html
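
As a rough illustration of the kind of docstring epydoc can work with, here
is a minimal sketch. The contributing guide itself is not reproduced here, so
this assumes epydoc's standard epytext fields (@param, @type, @return, @rtype)
rather than whatever the guide specifically recommends, and gc_fraction is a
hypothetical function, not part of Biopython.

```python
def gc_fraction(sequence):
    """Return the fraction of G and C bases in a sequence.

    Hypothetical example function, not part of Biopython.

    @param sequence: nucleotide sequence to analyse
    @type sequence: str
    @return: GC fraction between 0.0 and 1.0 (0.0 for an empty sequence)
    @rtype: float
    """
    if not sequence:
        return 0.0
    upper = sequence.upper()
    # Count both bases case-insensitively and normalise by sequence length.
    return (upper.count("G") + upper.count("C")) / len(sequence)
```

The first line of the docstring serves as the one-line summary, and the
tagged fields are what epydoc turns into the parameter and return tables.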

[Non-automated documentation (the someone-has-to-write-it style)]

Iddo:
> Regarding the documentation: how about adopting two models to help keep 
> it up-to-date:
> 
> 1) CVS the Biopython Book. In that manner, it will be easy for people to 
> insert fixes/updates, new entries, etc. See the Plone book
> 
> http://plone.org/documentation

The documentation is in CVS -- Docs/Tutorial.tex and Docs/cookbook
for all the new cookbook stuff (one directory per example there).

As far as getting a framework like Plone in place, I honestly am not
sure I am really for that. I do think it is a good idea, but our
attempts at the Wiki in the past have really soured me on "fancier"
ways to generate documentation.

Really, what I'd like to see is contributions from people in the new
cookbook style. This doesn't require learning any particular system --
I'm happy to accept docs in plain text, html, pdf -- anything that
will be viewable on the web. So people can write documentation
however they feel comfortable.

> 2) Online comments from users, like in the Zope or MySQL manuals. That 
> would be helpful in identifying glaring gaps in the docs.

This is a good idea, but also along the lines of my biases against
trying to be fancier than we need to be. Honestly, the user bases of
Zope or MySQL dwarf that of Biopython (although we are catching up
fast :-) and I don't want to put the cart before the horse (or
however that cliche goes).

But those are just my opinions -- I can always be convinced
otherwise :-).

[Removal/Deprecation of modules]
Thomas:
> We could make a list of modules that will be potentially removed, post it to 
> the biopython list, and then actually remove them when no-one objects. Is 
> anybody using the two HMMs (HMM and MarkovModel) for instance? Or the 
> support vector machine (SVM) and NeuralNetwork modules? 

Is potential non-use (or trying to assess non-use) really a good
basis for removing modules? If they work and are decently coded then I
think they have a potential use -- I definitely do know that a lot of the
different supervised learning methods are useful to people doing
clustering of literature (which is what I'm pretty positive Jeff
worked on for his thesis).

If things don't work, or are duplicated, then I'm in favor of trying
to get rid of them, but working code seems useful to me.

Thomas:
> The xKMeans, 
> KNN and KMeans clustering modules also seem to be obsolete in view of Michiel 
> de Hoon's clustering module. 

Michiel:
> The xKMeans and KMeans can be considered obsolete, as they are included in 
> Bio.Cluster. The KNN and other modules under Bio/Tools/Classification are 
> currently not obsolete, as they contain supervised learning methods, which 
> are not included in Bio.Cluster.

If things are duplicated then the right thing to do is to remove the
duplication. I'd like to consider two things, though:

1. I'd like Jeff to chime in since these are his modules (I think).
I don't have enough knowledge about clustering to know if
Bio.Cluster also does the things that he needed his code to do.

2. We want to make sure to be careful about back-compatibility. If
we decide to remove things, I'd like to first have them raise
DeprecationWarnings for a couple of releases so that people have
time to change their code -- and also have some quick docs about how
to change from old to new. Breaking code is bad, and I want to make
it as easy as possible for people to keep up with changes.
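
The deprecation step in point 2 could look roughly like the sketch below,
using the standard library warnings module. The names old_kmeans and
new_kmeans are hypothetical stand-ins, not actual Biopython functions, and
the grouping logic is placeholder code only.

```python
import warnings


def new_kmeans(data, k):
    """Hypothetical replacement; placeholder logic that splits data round-robin."""
    return [data[i::k] for i in range(k)]


def old_kmeans(data, k):
    """Hypothetical deprecated entry point kept for back-compatibility."""
    warnings.warn(
        "old_kmeans is deprecated and will be removed in a future release; "
        "use new_kmeans instead.",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller's line
    )
    # Delegate to the replacement so existing code keeps working.
    return new_kmeans(data, k)
```

Old code keeps running and gets the same result, while the warning message
doubles as the short "how to migrate" note. Recent Python versions hide
DeprecationWarning by default in most contexts; running with -W default
makes it visible.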

Thomas:
> GA is a genetic algorithm framework and NeuralNetwork 
> is a neural network (which seems to have some special features to deal with 
> genes as input). They are both potentially interesting (providing that they 
> actually work) but it's a complete mystery how they are to be used. Does 
> anybody know who implemented these modules?

Yup, I did. I don't think they are perfect by any means, but they
should still work (all of the tests still pass, at least) and be
useful. I honestly don't use them much myself anymore since I
finished the project I used them for. But, they do need
documentation.

Michiel:
> After cleaning up the modules, it may be a good idea to set up some kind of 
> unified way to deal with gene expression data in Biopython.

Definitely -- I would welcome this. I nominate you to be in charge
:-).

[Miscellaneous bits]

Me:
> > No, not that I know of. Honestly, I am not a big fan of
> > BioPerl/Biopython comparisons just as I'm not a huge fan of
> > Perl/Python comparisons 
Thomas:
> I agree, but still, it would be handy to have a page that lists BioPerl and 
> Biopython features so that people could decide what they want to use for a 
> certain purpose. 

I agree. I'd be happy to accept a document like this :-).

Thanks again for everyone's comments!
Brad

