[Biopython-dev] Pfam24/HMMER3 (and GO terms...)

Peter biopython at maubp.freeserve.co.uk
Mon Oct 19 17:29:26 UTC 2009


On Mon, Oct 19, 2009 at 6:18 PM, Kyle Ellrott <kellrott at gmail.com> wrote:
> Pfam24 was published last week ( http://pfam.sanger.ac.uk/ ) , it
> utilizes HMMER3 to do some rather fast HMM based protein
> identification (of about 11,912 families).  I've gotten an initial
> port of the PfamScan perl script found at
> ftp://ftp.sanger.ac.uk/pub/rdf/PfamScanBeta/ ported to BioPython.

Perhaps I have misunderstood you (and I have not looked at
the code yet), but have you just re-written the PFAM Perl script
pfam_scan.pl in Python? If so, what is the aim? OK, it might be
a bit faster - but you would be duplicating the work of the PFAM
team and creating a long term maintenance burden.

I can see the value of having an HMMER3 output parser, and
a command line wrapper for calling it. This will be useful for
things outside of PFAM.
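
To illustrate the parser half of that idea, here is a minimal sketch,
assuming the whitespace-delimited per-target table that HMMER3 writes
with its --tblout option (18 fixed columns followed by a free-text
description, with comment lines starting with "#"). The Hit record and
the parse_tblout name are hypothetical, not an existing Biopython API:

```python
from collections import namedtuple

# Hypothetical record type for illustration; field names are assumptions.
Hit = namedtuple("Hit", "target accession query evalue score description")


def parse_tblout(lines):
    """Yield Hit records from lines in HMMER3 --tblout style.

    Assumes 18 whitespace-separated columns followed by an optional
    free-text description, and '#' for comment lines.
    """
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header/footer comments
        # Split at most 18 times so the description keeps its spaces.
        fields = line.split(None, 18)
        if len(fields) < 18:
            raise ValueError("Unexpected number of columns: %r" % line)
        yield Hit(
            target=fields[0],
            accession=fields[1],
            query=fields[2],
            evalue=float(fields[4]),  # full-sequence E-value
            score=float(fields[5]),   # full-sequence bit score
            description=fields[18] if len(fields) > 18 else "",
        )


sample = [
    "# target name  accession  query name  ...\n",
    "Pkinase PF00069.18 seq1 - 1.2e-50 170.3 0.1 "
    "1.5e-50 170.0 0.1 1.1 1 0 0 1 1 1 1 Protein kinase domain\n",
]
hits = list(parse_tblout(sample))
```

A generator keeps memory use flat on large result files, and the same
function would work on output from any HMM database, not just PFAM.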

I can see the value of having a pfam_scan.pl output parser (XML,
CSV, or the possible JSON), and a command line wrapper for
calling it.

Peter