[Biopython-dev] hmmpfam parser

Brad Chapman chapmanb at uga.edu
Mon Feb 2 22:54:52 EST 2004


Hi Wagied;

> I have some code which is able to parse hmmer output,
> as well as code donated by Joanne Adamkewicz from Exilexis.
> 
> If you guys/gals find it useful, updates and modification will be done!

Thanks for sending this -- hmmpfam parsing code in Biopython is
definitely something we need. A few notes on what you sent:

1. I'm guessing that PfamParser.py and ExPFam.py are completely
separate pieces of code (except for both dealing with parsing Pfam).
For Biopython, the PfamParser.py is the more generally useful piece
of code since it provides an interface to parse a hmmpfam result
into a record-like object. So I'll probably restrict my comments to
that code.

2. Is there a method that you use to iterate over a file full of
hmmpfam results? Most parsers in Biopython include a parser for
individual records and an iterator, so that you can apply the
parser to a file full of results.

3. Some of the code does not follow the naming conventions that we
normally use in Biopython. Specifically:

a. Functions should be lowercase_separated_by_underscores style.

b. Variables should be lowercase_underscores style or alltogether
style. One of the things which was confusing to me in your code is
that you alternate between the lowercase_underscores style and
ALL_UPPERCASE style. At least in my experience ALL_UPPERCASE is
normally reserved for "constants."

c. You provide a lot of accessor methods for class variables (e.g.
getAccession for self.accession). Normally in Python you just
access the variable directly (or preface it with an underscore
like self._internal if the variable is for internal class use) --
the getWhatever functions are more Java-like.

d. There are lots of unnecessary semicolons in the code. They don't
hurt anything, but again make the code look more Java-like than
Python-like.

e. In the class __init__() methods you have code that looks like:

def __init__(self, variable=None):
    if variable is not None:
        self.variable = variable  # do something with variable
    else:
        raise ValueError("variable is required")  # raise an error

You can eliminate all of this by just requiring the variable in the
initializer:

def __init__(self, variable):
    self.variable = variable  # do something with variable

And let Python take care of checking that something was passed:
calling the initializer without the argument raises a TypeError.
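
Going back to point 2, here is a rough sketch of the "record parser
plus iterator" arrangement I have in mind. It is not existing
Biopython code: the names HmmpfamRecord, parse_record and
iterate_records are made up for illustration, and it assumes that
each result in the file ends with a line containing only "//" and
carries a "Query sequence:" line, which may not match every HMMER
version.

```python
import io


class HmmpfamRecord:
    """Hypothetical record holding the query name and raw lines of one result."""

    def __init__(self, query, lines):
        self.query = query
        self.lines = lines


def parse_record(lines):
    """Parse the lines of a single result into a record (assumed layout)."""
    query = None
    for line in lines:
        # Assumption: the query name is reported on a "Query sequence:" line.
        if line.startswith("Query sequence:"):
            query = line.split(":", 1)[1].strip()
    return HmmpfamRecord(query, lines)


def iterate_records(handle, parser=parse_record):
    """Yield one parsed record per result found in an open file handle."""
    block = []
    for line in handle:
        line = line.rstrip("\n")
        # Assumption: a line holding only "//" terminates each result.
        if line.strip() == "//":
            if block:
                yield parser(block)
            block = []
        else:
            block.append(line)
    # Handle a final result that is missing its terminator.
    if any(item.strip() for item in block):
        yield parser(block)


# Made-up input standing in for a file full of results.
example = io.StringIO(
    "Query sequence: seq1\nsome hit lines\n//\nQuery sequence: seq2\n//\n"
)
records = list(iterate_records(example))
for record in records:
    print(record.query)
```

Keeping the per-record parser separate from the iterator means the
same parser can be used on a single result or on a whole file.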

Generally, the documentation on contributing to Biopython covers the
style conventions we try to stick to, so that a heterogeneous
project such as this can be as uniform as possible:

http://biopython.org/docs/developer/contrib.html
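
As a small illustration of points a through e, a class written in
that style might look roughly like the sketch below. The class
PfamHit, its attributes, the constant MAX_DOMAINS and the values
used are all invented for the example and are not taken from your
code or from Biopython.

```python
MAX_DOMAINS = 100  # ALL_UPPERCASE only for a constant (made-up value)


class PfamHit:
    """Hypothetical record for a single hmmpfam hit."""

    def __init__(self, accession, score):
        # Both arguments are required, so Python itself raises a
        # TypeError when one is left out; no "is None" check is needed.
        self.accession = accession  # public: read it directly
        self.score = score
        self._raw_line = None  # leading underscore: internal use only

    def is_significant(self, min_score=0.0):
        """Method names use the lowercase_separated_by_underscores style."""
        return self.score >= min_score


hit = PfamHit("PF00069", 42.5)  # illustrative values only
print(hit.accession)  # direct attribute access, no getAccession()

try:
    PfamHit()  # arguments omitted on purpose
except TypeError as error:
    print("Python reported the missing arguments:", error)
```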

Hopefully all that is helpful -- we'd be very happy to accept the
code with some modifications along the lines of what I've mentioned,
so I'm definitely not trying to be discouraging by enumerating
those points. We just want to make sure the code that gets in is as
easy to understand and maintain as possible.

Thanks again for the mail and please don't hesitate to ask any other
questions!
Brad


