[BioPython] Prosite / Prorule

holger.dinkel at gmail.com
Tue Nov 20 12:35:15 UTC 2007


Hello Peter,

Thank you very much for your really quick help!

That bug is fixed! ;->

But alas, scanning the whole prosite_20.dat still raises some errors
(they only show up now that the earlier ones are fixed):

Firstly, the Prosite team has also introduced a new field called
"postprocessing", so the parser now chokes on that.

Secondly, the parser breaks on some special comment lines that contain author
names, of the form "CC /AUTHOR=K_Hofmann; N_Hulo" (Prosite accession PS50293).
The comment is split into columns, and each column is then split into a
qualifier and a value at the "=" character. As Mr. Hulo does not have an
"/AUTHOR=" prepended, an error is raised...

I was able to fix the first problem in a straightforward way, as Peter did, by inserting a postprocessing entry.

I could also solve the second problem, but only with a hack that might not suit everybody:

First, I split the line
    qual, data = [word.lstrip() for word in col.split("=")]
into two, to avoid the unpacking error (a ValueError) raised when a column contains no "=":
    qual = [word.lstrip() for word in col.split("=")][0]
    data = ''.join([word.lstrip() for word in col.split("=")][1:])
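
For what it's worth, the same split could probably be written more compactly with str.partition, which returns an empty value when there is no "=" and keeps any further "=" characters inside the value. This is only a minimal sketch: the helper name split_qualifier is made up for illustration and is not part of Bio.Prosite, and it strips whitespace on both sides rather than only on the left.

```python
def split_qualifier(col):
    """Split 'QUAL=DATA' at the first '='.

    Hypothetical helper, not part of Bio.Prosite.
    Returns (qual, data); data is '' when the column contains no '='.
    """
    qual, _sep, data = col.partition("=")
    return qual.strip(), data.strip()


# A normal column and the problematic one from PS50293.
print(split_qualifier("/AUTHOR=K_Hofmann"))
print(split_qualifier(" N_Hulo"))
```

With this, the second author still ends up in qual rather than in data, so the caller has to decide what to do with a column that yields an empty value.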

Then I introduced a hack to circumvent the aforementioned problem.

I changed
    if qual == '/TAXO-RANGE':

to
    if qual == 'N_Hulo':
        continue
    elif qual == '/TAXO-RANGE':


I know this is far from excellent, but it is crude enough to work ;->
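
A less name-specific variant might treat any column without "=" as a continuation of the previous qualifier's value, so that further author names need no special case. The following is only a sketch under that assumption. The function parse_cc_line is hypothetical, it returns a plain dict instead of filling the parser's record, and I have only reasoned it through on the line format quoted above, splitting columns at ";".

```python
def parse_cc_line(line):
    """Parse a Prosite 'CC' line into a {qualifier: value} dict.

    Hypothetical helper, not part of Bio.Prosite.
    Assumption: a column without '=' continues the previous qualifier.
    """
    # Drop the leading 'CC' line code if present.
    body = line[2:].strip() if line.startswith("CC") else line.strip()
    result = {}
    last_qual = None
    for col in body.split(";"):
        col = col.strip()
        if not col:
            continue  # skip the empty column after a trailing ';'
        qual, sep, data = col.partition("=")
        if sep:
            # Ordinary 'QUAL=DATA' column.
            last_qual = qual.strip()
            result[last_qual] = data.strip()
        elif last_qual is not None:
            # No '=': append to the previous qualifier's value.
            result[last_qual] += "; " + col
        else:
            raise ValueError("Cannot parse comment column %r" % col)
    return result


print(parse_cc_line("CC /AUTHOR=K_Hofmann; N_Hulo"))
```

For the PS50293 line this keeps both authors under '/AUTHOR' instead of discarding the second one, while a line that starts with a column lacking "=" still raises an error.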

If you'd like to incorporate at least the first changes, you can find the 'new'
__init__.py file attached to bug #2403, both as a whole file and as a patch. It
successfully scans Prosite versions 18 to 20 (others not checked). I could also
send it to the list, but I am not sure whether mails with attachments are
allowed here.

* Peter wrote:
>
> Holger reported bug 2403, which I believe I have fixed (having worked
> with our SwissProt parser before I found this quite straight forward):
> http://bugzilla.open-bio.org/show_bug.cgi?id=2403
> 
> Peter

best wishes,

Holger