[Biopython] MOODS: fast search for position weight matrix matches in DNA sequences.

Brad Chapman chapmanb at 50mail.com
Thu Sep 24 12:27:30 UTC 2009


Peter and Bartek;

[MOODS paper compared with Biopython]
> > I was even thinking about
> > incorporating their code into Biopython, but it's GPL. Instead, I can
> > make the function using Michiel's code aware of the MOODS package:
> > i.e. use it if it is installed.

It may be worth contacting the authors about your interest in
incorporating it. If it improves substantially upon the current C
code from Michiel and could fit with your interface, this makes
sense. Authors are often not tied to the GPL, and they may be
willing to re-license for inclusion in Biopython.
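
In the meantime, the "use it if it is installed" idea can be sketched
roughly as below. This is only an illustration of the optional-dependency
pattern, not Bartek's or Michiel's actual code: the names search_pwm,
builtin_search and the accelerated argument are made up for the example,
the module name "MOODS" is an assumption, and no real MOODS function is
called. The matrix is assumed to be a list of per-position dicts mapping
each base to a score.

```python
import importlib.util


def has_module(name):
    """Return True if a top-level module called `name` can be imported."""
    return importlib.util.find_spec(name) is not None


def builtin_search(sequence, pwm, threshold):
    """Plain-Python fallback: score every window against the matrix.

    `pwm` is a list with one dict per motif position, mapping a base
    to its score at that position (a hypothetical layout).
    """
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(column[base] for column, base in zip(pwm, window))
        if score >= threshold:
            hits.append((start, score))
    return hits


def search_pwm(sequence, pwm, threshold, accelerated=None):
    """Use the optional fast backend when available, else fall back.

    `accelerated` is a hypothetical callable wrapping the optional
    package; it is only used if the "MOODS" module is importable.
    """
    if accelerated is not None and has_module("MOODS"):
        return accelerated(sequence, pwm, threshold)
    return builtin_search(sequence, pwm, threshold)
```

Callers only ever see search_pwm, so the GPL code stays a separate,
optional install and nothing changes for users who do not have it.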

> > If we want to put it into the news, it would be worth mentioning that
> > (thanks to Michiel) we have made quite some progress on that front.
> 
> Good idea - why don't you check in an extra paragraph to the NEWS
> file section for Biopython 1.51 (or was it 1.52?). We can also update
> the news post too. In fact, if you wanted to you could write up a whole
> blog post to put up on our news server with timing etc.

A separate news post mentioning the speed of the C option and showing
usage examples from both is a great idea. Responsiveness to new methods
is the fun part of science.

> > As a side note, I feel a little bit guilty of making biopython look
> > slow compared to other tools. In the paper, they show a comparison
> > between different tools (MOODS, bioperl, biopython) in terms of speed,
> > which shows biopython as by far the slowest. This is just because I
> > was not writing this code with speed in mind (I work on short
> > regulatory sequences...). Nonetheless, it can make an impression that
> > biopython is slow in general, which is not true. 

This is more a consequence of how scientific publication works. You have
to get published, and to do that you have to show you are somehow that
much better than the other options, which means looking for flaws in
those options. This would all have worked more smoothly if the authors
had come to the Biopython list to mention the speed issues, you had all
had this discussion then, and their code had been incorporated as it was
being developed. Then we'd have an integrated implementation today. Doing
it after the fact is a bit more roundabout, but what can you do.

Brad


