[Biopython-dev] Need some help with SearchIO HSPs cascading attributes.
Kai Blin
kai.blin at biotech.uni-tuebingen.de
Wed Dec 5 07:24:14 UTC 2012
Hi folks,
I'm trying to finally get my hmmer2-text parser in, but I'm failing one
unit test. The code is a bit too smart for me, it seems.
So in the file I'm parsing, I only ever get the description of the hit
in the hit table, like this (apologies if my mail client breaks this):
Model        Description                            Score    E-value  N
--------     -----------                            -----    ------- ---
Glu_synthase Conserved region in glutamate synthas  858.6   3.6e-255   2
But of course I can't create a hit object when parsing the hit table, as
I first need to have HSPFragments to create the hit object with.
Anyway, I create a placeholder hit object that I'll later convert into a
real Hit object. In that placeholder object, I set a description.
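To make the placeholder idea concrete, here is a minimal sketch of reading one hit table row into a temporary record. It is not the actual parser code: `PlaceholderHit` and `parse_hit_row` are made-up names for illustration, and the sketch assumes the columns are whitespace-separated with the description as the only multi-word field.

```python
from dataclasses import dataclass


@dataclass
class PlaceholderHit:
    """Hypothetical stand-in for a hit; this is not a Biopython class."""
    id: str
    description: str
    bitscore: float
    evalue: float
    domain_count: int


def parse_hit_row(line):
    """Split one hit table row into a PlaceholderHit.

    Assumes the first token is the model name and the last three
    tokens are score, E-value and N; everything in between is
    treated as the description.
    """
    model, rest = line.split(None, 1)
    parts = rest.rsplit(None, 3)
    if len(parts) == 3:
        # Row without a description column value.
        parts.insert(0, "")
    description, score, evalue, count = parts
    return PlaceholderHit(model, description.strip(), float(score),
                          float(evalue), int(count))


row = "Glu_synthase Conserved region in glutamate synthas  858.6   3.6e-255   2"
hit = parse_hit_row(row)
print(hit.id, "|", hit.description)
```

The description stored here is what later has to end up on every fragment belonging to the real hit.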
Now I'm parsing the HSP table, looking like this:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
GATase_2   1/1      34   404 ..     1   385 []   731.8 3.9e-226
The HSP table is in a different order than the hit table, so never mind
the different model name.
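For reference, a row of this table can be split into fields roughly as follows. This is only a sketch under the assumption that each row has exactly ten whitespace-separated tokens, as in the example above; `DomainRow` and `parse_domain_row` are hypothetical names, not part of Biopython.

```python
from typing import NamedTuple


class DomainRow(NamedTuple):
    """Hypothetical record for one row of the domain (HSP) table."""
    model: str
    domain_index: int
    domain_total: int
    seq_from: int
    seq_to: int
    seq_bounds: str
    hmm_from: int
    hmm_to: int
    hmm_bounds: str
    bitscore: float
    evalue: float


def parse_domain_row(line):
    """Parse one domain table row, assuming ten whitespace-separated tokens."""
    fields = line.split()
    if len(fields) != 10:
        raise ValueError("expected 10 fields, got %d" % len(fields))
    (model, domain, seq_f, seq_t, seq_b,
     hmm_f, hmm_t, hmm_b, score, evalue) = fields
    # The domain column looks like "1/1": index of this domain / total.
    index, total = domain.split("/")
    return DomainRow(model, int(index), int(total),
                     int(seq_f), int(seq_t), seq_b,
                     int(hmm_f), int(hmm_t), hmm_b,
                     float(score), float(evalue))


row = "GATase_2   1/1      34   404 ..     1   385 []   731.8 3.9e-226"
domain = parse_domain_row(row)
print(domain.model, domain.seq_from, domain.seq_to)
```

Note that this row carries no description, so the description has to be looked up from the placeholder hit by model name.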
Now, I need to create an HSPFragment with the same description as the
Hit object, or querying for the Hit object's description will cascade
through the HSPs and HSPFragments, and return multiple values for the
description.
However, no matter what I do, I seem to get an <unknown description>
tossed in there somehow.
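The effect can be reproduced with a toy model of a cascading attribute. The classes below are simplified stand-ins written for this illustration (they are not Biopython's Hit or HSPFragment, and the real implementation may differ); the only thing taken from the actual behaviour is the `<unknown description>` default value.

```python
class ToyFragment:
    """Toy stand-in for an HSP fragment; not Biopython's HSPFragment."""

    def __init__(self, hit_description="<unknown description>"):
        self.hit_description = hit_description


class ToyHit:
    """Toy container whose description is derived from its fragments."""

    def __init__(self, fragments):
        self.fragments = list(fragments)

    @property
    def description(self):
        # Collect the distinct values found on the fragments, in order.
        values = []
        for frag in self.fragments:
            if frag.hit_description not in values:
                values.append(frag.hit_description)
        # One shared value reads as a plain string; otherwise the
        # caller sees every distinct value.
        return values[0] if len(values) == 1 else values

    @description.setter
    def description(self, value):
        # Setting on the container pushes the value down to every fragment.
        for frag in self.fragments:
            frag.hit_description = value


desc = "Conserved region in glutamate synthas"

# One fragment was created without a description, so the default leaks in.
mixed = ToyHit([ToyFragment(desc), ToyFragment()])
print(mixed.description)

# Setting the description on the container afterwards makes them agree.
fixed = ToyHit([ToyFragment(), ToyFragment()])
fixed.description = desc
print(fixed.description)
```

In this toy version, a single fragment that never receives the description is enough to turn the hit's description into a list of two values, which matches the symptom in the failing test.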
The parser is at
https://github.com/kblin/biopython/blob/antismash/Bio/SearchIO/HmmerIO/hmmer2_text.py
the test code is at
https://github.com/kblin/biopython/blob/antismash/Tests/test_SearchIO_hmmer2_text.py
and the test file that's failing is the hmmpfam2.3 file at
https://github.com/kblin/biopython/blob/antismash/Tests/Hmmer/text_23_hmmpfam_001.out
Any pointers would be appreciated. The code is working fine in my
current development work in general, and I'd love to get it upstream to
get rid of an extra patch step during installation.
Cheers,
Kai
--
Dipl.-Inform. Kai Blin kai.blin at biotech.uni-tuebingen.de
Institute for Microbiology and Infection Medicine
Division of Microbiology/Biotechnology
Eberhard-Karls-University of Tübingen
Auf der Morgenstelle 28 Phone : ++49 7071 29-78841
D-72076 Tübingen Fax : ++49 7071 29-5979
Deutschland
Homepage: http://www.mikrobio.uni-tuebingen.de/ag_wohlleben