[Biopython-dev] [Biopython - Bug #3399] (New) SearchIO hmmer3-text parser fails to parse hits that have large gaps

redmine at redmine.open-bio.org redmine at redmine.open-bio.org
Tue Dec 4 23:01:35 UTC 2012


Issue #3399 has been reported by Kai Blin.

----------------------------------------
Bug #3399: SearchIO hmmer3-text parser fails to parse hits that have large gaps
https://redmine.open-bio.org/issues/3399

Author: Kai Blin
Status: New
Priority: Normal
Assignee: 
Category: 
Target version: 
URL: 


When parsing a hit that matches the profile very poorly, some alignment lines may contain no query sequence characters at all. In that case the SearchIO hmmer3-text parser currently raises a ValueError:

<pre>
>>> it = SearchIO.parse('../broken.hsr', 'hmmer3-text')
>>> i = it.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "Bio/SearchIO/__init__.py", line 313, in parse
    for qresult in generator:
  File "Bio/SearchIO/HmmerIO/hmmer3_text.py", line 60, in __iter__
    for qresult in self._parse_qresult():
  File "Bio/SearchIO/HmmerIO/hmmer3_text.py", line 145, in _parse_qresult
    hit_list = self._parse_hit(qid)
  File "Bio/SearchIO/HmmerIO/hmmer3_text.py", line 188, in _parse_hit
    hit_list = self._create_hits(hit_list, qid)
  File "Bio/SearchIO/HmmerIO/hmmer3_text.py", line 309, in _create_hits
    self._parse_aln_block(hid, hit.hsps)
  File "Bio/SearchIO/HmmerIO/hmmer3_text.py", line 358, in _parse_aln_block
    frag.query = aliseq
  File "Bio/SearchIO/_model/hsp.py", line 816, in _query_set
    self._query = self._set_seq(value, 'query')
  File "Bio/SearchIO/_model/hsp.py", line 784, in _set_seq
    len(seq), seq_type))
ValueError: Sequence lengths do not match. Expected: 202 (hit); found: 131 (query).
</pre>

See the attached file broken.hsr for a dataset that triggers the error. Removing the esterase hit, including its domain annotation, makes the error go away (broken2.hsr). Inserting fake position information into the query sequence line (broken3.hsr) also lets the file parse without error.
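
The broken3.hsr observation suggests, though this is only a guess, that an all-gap query row is skipped because its coordinate fields hold "-" instead of numbers, so the query sequence ends up shorter than the hit sequence. The sketch below illustrates that mechanism with made-up rows. STRICT_ROW, LENIENT_ROW and collect_sequence are hypothetical names for this illustration and are not taken from hmmer3_text.py.

```python
import re

# Hypothetical patterns for one alignment row: "<id> <start> <sequence> <end>".
# Illustrative only; NOT copied from Bio/SearchIO/HmmerIO/hmmer3_text.py.
STRICT_ROW = re.compile(r"^\s*(\S+)\s+(\d+)\s+(\S+)\s+(\d+)\s*$")
LENIENT_ROW = re.compile(r"^\s*(\S+)\s+(\d+|-)\s+(\S+)\s+(\d+|-)\s*$")


def collect_sequence(rows, pattern):
    """Concatenate the sequence column of every row that matches pattern."""
    parts = []
    for row in rows:
        match = pattern.match(row)
        if match:
            parts.append(match.group(3))
    return "".join(parts)


# Made-up alignment rows (not from broken.hsr). The second query row is
# all gaps and carries "-" in place of start/end coordinates.
hit_rows = [
    "  profile  1 abcdefghij 10",
    "  profile 11 klmnopqrst 20",
]
query_rows = [
    "  seq1  5 ABCDEFGHIJ 14",
    "  seq1  - ---------- -",
]

hit_seq = collect_sequence(hit_rows, STRICT_ROW)
strict_query = collect_sequence(query_rows, STRICT_ROW)
lenient_query = collect_sequence(query_rows, LENIENT_ROW)

# The strict pattern drops the all-gap row, so the lengths disagree;
# the lenient pattern keeps it and the lengths agree again.
print(len(hit_seq), len(strict_query), len(lenient_query))
```

If the real parser behaves like the strict pattern here, accepting "-" as a coordinate would be one possible direction for a fix. That has not been checked against the actual code.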


----------------------------------------



