[Biopython-dev] SwissProt fails to parse the current uniprot_sprot data file?
Jinghua (Frank) Feng
Jinghua.Feng at adelaide.edu.au
Tue Oct 21 03:45:33 UTC 2014
Hello,
It looks like SwissProt can parse older versions of the uniprot_sprot data
file, but fails on the current version. The steps below reproduce
the error (Biopython version is '1.64').
Regards,
Jinghua
----------------------
First, download the current uniprot_sprot_human data file (~72 MB) from
ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/taxonomic_divisions/uniprot_sprot_human.dat.gz
Then, in IPython, use SwissProt to parse the downloaded data file:
In [1]: from Bio import SwissProt
In [2]: import gzip
In [3]: inhandle = gzip.open('./uniprot_sprot_human.dat.gz')
In [4]: reader = SwissProt.parse(inhandle)
In [5]: for r in reader:
   ...:     pass
   ...:
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-5-c04351d992d2> in <module>()
----> 1 for r in reader:
      2     pass
      3

/usr/local/lib/python2.7/dist-packages/Bio/SwissProt/__init__.pyc in parse(handle)
    115 def parse(handle):
    116     while True:
--> 117         record = _read(handle)
    118         if not record:
    119             return

/usr/local/lib/python2.7/dist-packages/Bio/SwissProt/__init__.pyc in _read(handle)
    182         elif key == 'RN':
    183             reference = Reference()
--> 184             _read_rn(reference, value)
    185             record.references.append(reference)
    186         elif key == 'RP':

/usr/local/lib/python2.7/dist-packages/Bio/SwissProt/__init__.pyc in _read_rn(reference, rn)
    407
    408 def _read_rn(reference, rn):
--> 409     assert rn[0] == '[' and rn[-1] == ']', "Missing brackets %s" % rn
    410     reference.number = int(rn[1:-1])
    411

AssertionError: Missing brackets [1] {ECO:0000305, ECO:0000312|EMBL:AAK11482.1}
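
The assertion message shows that the RN value is no longer just a bracketed
number: "[1]" is followed by a brace-enclosed list of evidence tags, so
rn[-1] is '}' rather than ']'. As a rough illustration only, a more tolerant
parse of that value might look like the sketch below. It assumes the evidence
tags, when present, always follow the bracketed number in a single {...}
group separated by commas. The parse_rn name and its return shape are
hypothetical; this is not Biopython's API or its actual fix.

```python
import re

# Hypothetical helper, not part of Biopython: accepts "[N]" optionally
# followed by a single "{...}" group of comma-separated evidence tags.
_RN_PATTERN = re.compile(r"^\[(\d+)\]\s*(?:\{(.*)\})?$")


def parse_rn(rn):
    """Return (reference_number, evidence_tags) for an RN line value."""
    match = _RN_PATTERN.match(rn.strip())
    if match is None:
        raise ValueError("Unexpected RN value: %r" % rn)
    number = int(match.group(1))
    evidence = match.group(2)
    # Split the evidence group on commas; an absent group yields no tags.
    tags = [tag.strip() for tag in evidence.split(",")] if evidence else []
    return number, tags
```

Called on the value from the traceback, parse_rn would return the reference
number 1 together with the two ECO tags, and a plain "[1]" would still be
accepted with an empty tag list.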