[Biopython-dev] HMMER (+ BLAT) wrappers

Wibowo Arindrarto w.arindrarto at gmail.com
Wed May 2 08:17:19 UTC 2012


Hi everyone,

The past week I've been trying to generate some test cases for BLAST,
HMMER, et al. I was writing some short scripts to automate the test
case generation, when I realized that Biopython doesn't have wrappers
for HMMER and BLAT, so I decided to write them. The code is here:
https://github.com/bow/gsoc/blob/master/hmmer/_HMMER.py and here:
https://github.com/bow/gsoc/blob/master/blat/_BLAT.py.

If it is of general interest to Biopython, I'd love to submit a pull
request for these wrappers. They were primarily written for test case
generation, but I imagine they won't require that many tweaks to make
them suitable for inclusion in Biopython. However, before I can do that,
there are some issues that I think need to be discussed:

1. Where should the wrappers be put? I noticed that different wrappers
are located in different directories according to their 'theme' (e.g.
BLAST wrappers in Bio.Blast.Applications and ClustalW wrapper in
Bio.Align.Applications). For the HMMER wrapper, should it be put
inside Bio.Motif.Applications? For the BLAT wrapper, should I create a
new Bio.Blat folder just for it? Yesterday I thought maybe it would be
easier if all application wrappers were put inside the same directory
(e.g. all in Bio.Applications), so maybe that's a viable option for
future releases?

2. How should shared options among slightly different programs be
handled? We can rely on creating abstract subclasses for them, but I
find it easier to simply create lists and then combine them in the
different programs. The current HMMER wrapper employs both of these
approaches, but I think it needs to stick to just one approach to make
the code easier to understand.

3. Is there a convention for naming the command line arguments? For
example, if the command line option trigger is '--domE', should I name
the Python variable, for example, 'domE', 'dome', 'dom_e', or 'dom_E'?

4. For the HMMER wrapper, there are some flags that are mutually
exclusive (i.e. the user can only choose one of the flags). If the
user chooses more than one, HMMER doesn't show any error messages, but
nothing is run. Should the wrapper check for such mutually exclusive
flags when it's created as well?

5. For BLAT, the installed suite includes a program that runs a BLAT
server to handle search requests from different clients. It doesn't
seem to be a typical program that should be wrapped by Biopython, but
I might be wrong. Should a wrapper for the server be included as well?
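
To make points 2 and 4 a bit more concrete, here is a rough sketch of
the list-based approach combined with an up-front check for conflicting
flags. It is not the code in the linked wrappers and does not use
Bio.Application. Every name in it (build_command, hmmsearch_command,
the list names, and the Python keyword names such as 'dom_e') is
hypothetical. The flags in the exclusive group are only an illustrative
guess, so which HMMER flags really conflict would need to be checked
against the HMMER documentation.

```python
# Hypothetical sketch: none of these names come from Biopython or from
# the linked wrappers.

# Point 2: options shared by several programs, kept as plain lists.
# Each entry is (command line flag, Python keyword name).
REPORTING_OPTIONS = [("-E", "evalue"), ("--domE", "dom_e")]
CUTOFF_SWITCHES = [
    ("--cut_ga", "cut_ga"),
    ("--cut_nc", "cut_nc"),
    ("--cut_tc", "cut_tc"),
]

# Point 4: groups of keyword names of which at most one may be set.
# Assumption: this grouping is illustrative, not taken from the source.
EXCLUSIVE_GROUPS = [("cut_ga", "cut_nc", "cut_tc")]


def build_command(program, option_specs, switch_specs, exclusive_groups,
                  **kwargs):
    """Return the command as an argument list.

    Raises ValueError for unknown keywords or conflicting switches, so
    the problem surfaces before the external program is ever started.
    """
    known = {name for _, name in option_specs + switch_specs}
    unknown = sorted(set(kwargs) - known)
    if unknown:
        raise ValueError("Unknown parameter(s): " + ", ".join(unknown))

    for group in exclusive_groups:
        chosen = [name for name in group if kwargs.get(name)]
        if len(chosen) > 1:
            raise ValueError(
                "Mutually exclusive parameters set: " + ", ".join(chosen))

    args = [program]
    # Options take a value; switches are emitted only when truthy.
    for flag, name in option_specs:
        if name in kwargs:
            args.extend([flag, str(kwargs[name])])
    for flag, name in switch_specs:
        if kwargs.get(name):
            args.append(flag)
    return args


def hmmsearch_command(**kwargs):
    """Combine the shared lists for one particular program."""
    return build_command("hmmsearch", REPORTING_OPTIONS, CUTOFF_SWITCHES,
                         EXCLUSIVE_GROUPS, **kwargs)
```

With this layout, a second program would reuse the same lists and only
add its own on top. For example, hmmsearch_command(evalue=0.001,
cut_ga=True) builds the argument list, while passing both cut_ga=True
and cut_tc=True raises a ValueError instead of silently running nothing.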

cheers,
Bow


