[Biopython-dev] Need some help with SearchIO HSPs cascading attributes.

Kai Blin kai.blin at biotech.uni-tuebingen.de
Wed Dec 5 02:24:14 EST 2012


Hi folks,

I'm trying to finally get my hmmer2-text parser in, but I'm failing one
unit test. The code is a bit too smart for me, it seems.

So in the file I'm parsing, the description of a hit only ever appears
in the hit table, like this (apologies if my mail client breaks this):

Model           Description                             Score    E-value  N
--------        -----------                             -----    ------- ---
Glu_synthase    Conserved region in glutamate synthas   858.6   3.6e-255   2


But of course I can't create a hit object when parsing the hit table, as
I first need to have HSPFragments to create the hit object with.

Anyway, I create a placeholder hit object that I'll later convert into a
real Hit object. In that placeholder object, I set a description.
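
To make the placeholder step concrete, here is a minimal sketch in plain
Python of pulling the description out of one hit-table row. The
PlaceholderHit type and the parse_hit_row helper are hypothetical names
invented for this illustration; they are not part of Bio.SearchIO and are
not taken from the parser linked below.

```python
from collections import namedtuple

# Hypothetical placeholder record; not a Bio.SearchIO class.
PlaceholderHit = namedtuple(
    "PlaceholderHit", "id description bitscore evalue domain_count"
)


def parse_hit_row(line):
    """Split one hit-table row into a placeholder record."""
    # The description may contain spaces, so peel the three numeric
    # columns off the right first, then split the model name off the left.
    rest, score, evalue, count = line.rstrip().rsplit(None, 3)
    parts = rest.split(None, 1)
    hit_id = parts[0]
    # A row without a description leaves only the model name in `rest`.
    description = parts[1].strip() if len(parts) > 1 else ""
    return PlaceholderHit(
        hit_id, description, float(score), float(evalue), int(count)
    )


row = "Glu_synthase    Conserved region in glutamate synthas   858.6   3.6e-255   2"
placeholder = parse_hit_row(row)
```

Splitting from the right assumes the last three columns never contain
whitespace, which holds for the sample row above but is an assumption
about the format in general.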

Now I'm parsing the HSP table, looking like this:

Model           Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
--------        ------- ----- -----    ----- -----      -----  -------
GATase_2          1/1      34   404 ..     1   385 []   731.8 3.9e-226

The HSP table is in a different order than the hit table, so never mind
the different model name.

Now I need to create an HSPFragment with the same description as the
Hit object. Otherwise, querying the Hit object's description cascades
through the HSPs and HSPFragments and returns multiple values for the
description.

However, no matter what I do, I seem to get an <unknown description>
tossed in there somehow.
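
The symptom can be reproduced with a toy model of a cascading attribute.
This is only a sketch under stated assumptions: ToyHit, ToyHSP and
ToyFragment are hypothetical classes written for this illustration, not
the SearchIO implementation, and the toy assumes that a fragment created
without a description falls back to the placeholder string quoted above.

```python
UNKNOWN = "<unknown description>"  # placeholder string quoted in the message


class ToyFragment:
    """Hypothetical stand-in for a fragment; holds the hit description."""

    def __init__(self, hit_description=UNKNOWN):
        self.hit_description = hit_description


class ToyHSP:
    """Hypothetical stand-in for an HSP; just a container of fragments."""

    def __init__(self, fragments):
        self.fragments = list(fragments)


class ToyHit:
    """Hypothetical stand-in for a hit whose description lives in its fragments."""

    def __init__(self, hsps):
        self.hsps = list(hsps)

    def _fragments(self):
        return [frag for hsp in self.hsps for frag in hsp.fragments]

    @property
    def description(self):
        # Collect the distinct values stored on the fragments, in order.
        seen = []
        for frag in self._fragments():
            if frag.hit_description not in seen:
                seen.append(frag.hit_description)
        # One consistent value is returned as-is; otherwise all of them.
        return seen[0] if len(seen) == 1 else seen

    @description.setter
    def description(self, value):
        # Writing through the hit pushes the value down to every fragment.
        for frag in self._fragments():
            frag.hit_description = value


desc = "Conserved region in glutamate synthas"
# One fragment was given the description, the other was left at its default.
hit = ToyHit([ToyHSP([ToyFragment(desc)]), ToyHSP([ToyFragment()])])
mixed = hit.description   # both the real description and the placeholder
hit.description = desc    # overwrite every fragment through the hit
fixed = hit.description   # a single value again
```

In this toy, a single fragment left at its default is enough to make the
hit report more than one description, and assigning through the hit
afterwards removes the mismatch. Whether the real parser is hitting the
same situation is not established here.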

The parser is at
https://github.com/kblin/biopython/blob/antismash/Bio/SearchIO/HmmerIO/hmmer2_text.py
the test code is at
https://github.com/kblin/biopython/blob/antismash/Tests/test_SearchIO_hmmer2_text.py
and the test file that's failing is the hmmpfam2.3 file at
https://github.com/kblin/biopython/blob/antismash/Tests/Hmmer/text_23_hmmpfam_001.out

Any pointers would be appreciated. The code is working fine in my
current development work in general, and I'd love to get it upstream to
get rid of an extra patch step during installation.

Cheers,
Kai

-- 
Dipl.-Inform. Kai Blin         kai.blin at biotech.uni-tuebingen.de
Institute for Microbiology and Infection Medicine
Division of Microbiology/Biotechnology
Eberhard-Karls-University of Tübingen
Auf der Morgenstelle 28                 Phone : ++49 7071 29-78841
D-72076 Tübingen                        Fax :   ++49 7071 29-5979
Deutschland
Homepage: http://www.mikrobio.uni-tuebingen.de/ag_wohlleben

