[Biopython-dev] Pfam24/HMMER3 (and GO terms...)

Kyle Ellrott kellrott at gmail.com
Mon Oct 19 17:18:03 UTC 2009


Pfam24 was published last week ( http://pfam.sanger.ac.uk/ ). It
uses HMMER3 to do some rather fast HMM-based protein family
identification (about 11,912 families).  I've done an initial port
to Biopython of the PfamScan Perl script found at
ftp://ftp.sanger.ac.uk/pub/rdf/PfamScanBeta/ .
Currently the layout somewhat mirrors the Perl module layout, but that
can evolve to be more 'Pythonic'.  The interface is not yet done: it
mainly just prints out results, and the internal data structures
aren't very clear.  Thoughts and suggestions on how people would use
this in their Python scripts would be helpful.
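
To make the question concrete, one possible shape for the results is a
list of small hit records instead of printed text.  The sketch below is
only an illustration for discussion: the PfamHit fields, the parse_hits
helper, the column order, and the example line are all assumptions, not
the current Bio.Pfam interface or the actual PfamScan output format.

```python
from typing import Iterable, List, NamedTuple


class PfamHit(NamedTuple):
    """One domain match on a query sequence (hypothetical record layout)."""
    seq_id: str
    start: int
    end: int
    family_acc: str
    family_name: str
    bit_score: float
    evalue: float


def parse_hits(lines: Iterable[str]) -> List[PfamHit]:
    """Parse whitespace-separated hit lines in the assumed column order.

    Blank lines and lines starting with '#' are skipped.
    """
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        seq_id, start, end, acc, name, score, evalue = line.split()[:7]
        hits.append(PfamHit(seq_id, int(start), int(end), acc, name,
                            float(score), float(evalue)))
    return hits


# Made-up example input; the accession and family name are placeholders.
example = [
    "# seq_id start end family_acc family_name bit_score evalue",
    "query1 12 98 PF99999.1 made_up_family 85.3 1.2e-25",
]
hits = parse_hits(example)

# With records like this, filtering by E-value is a one-liner.
significant = [h for h in hits if h.evalue < 1e-5]
```

Returning plain records like these would let scripts filter, sort, or
attach hits to sequence features without re-parsing printed output.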

Regarding the ongoing GO conversation, there is a table that
connects Pfam families to GO terms (
ftp://ftp.sanger.ac.uk/pub/databases/Pfam/releases/Pfam24.0/database_files/gene_ontology.sql.gz
), so connecting this work to the suggested GO modules would probably
be beneficial.
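
As a rough idea of what that connection could look like, here is a
minimal sketch that groups (Pfam accession, GO ID) pairs into a lookup
dictionary.  The build_pfam_to_go helper and the two-column row layout
are assumptions; the real columns in gene_ontology.sql.gz would need
checking, and the IDs below are placeholders, not real annotations.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def build_pfam_to_go(rows: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group GO IDs by Pfam accession, dropping duplicate pairs.

    Each row is assumed to be a (pfam_accession, go_id) pair.
    """
    mapping = defaultdict(list)
    for pfam_acc, go_id in rows:
        if go_id not in mapping[pfam_acc]:
            mapping[pfam_acc].append(go_id)
    return dict(mapping)


# Placeholder rows, standing in for whatever is loaded from the table.
rows = [
    ("PF99999", "GO:0000001"),
    ("PF99999", "GO:0000002"),
    ("PF99998", "GO:0000001"),
    ("PF99999", "GO:0000001"),  # duplicate pair, ignored
]
pfam_to_go = build_pfam_to_go(rows)
```

A Pfam hit's family accession could then be used as the key to pull its
GO terms and hand them to whatever GO module is eventually agreed on.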

You can find the work at http://github.com/kellrott/biopython/, under
the Bio.Pfam module.

Kyle


