[Biopython-dev] [Bug 2353] New: Problem parsing Swissprot (UniProt) files
bugzilla-daemon at portal.open-bio.org
Wed Aug 29 08:25:12 UTC 2007
http://bugzilla.open-bio.org/show_bug.cgi?id=2353
Summary: Problem parsing Swissprot (UniProt) files
Product: Biopython
Version: 1.43
Platform: Macintosh
OS/Version: Mac OS
Status: NEW
Severity: normal
Priority: P2
Component: Main Distribution
AssignedTo: biopython-dev at biopython.org
ReportedBy: ibdeno at gmail.com
I installed biopython-py24-1.43-1001 via fink on an iBook G4.
I have found that parsing a UniProt database from the archaeon
M. thermoautotrophicum (downloaded from Integr8) using Bio.SwissProt produces
errors. For example, the code (in a file called testing.py):
8<--------------------------------------------
# reading a SwissProt entry from a file
from Bio.SwissProt import SProt
from sys import *
handle = open(argv[1])
sp = SProt.Iterator(handle, SProt.RecordParser())
record = sp.next()
print record.entry_name
print record.sequence
--------------------------------------------------->8
run as:
python2.4 testing.py 27.M_thermoautotrophicum.dat
gives:
Traceback (most recent call last):
  File "testing.py", line 8, in ?
    record = sp.next()
  File "/sw/lib/python2.4/site-packages/Bio/SwissProt/SProt.py", line 172, in next
    return self._parser.parse(File.StringHandle(data))
  File "/sw/lib/python2.4/site-packages/Bio/SwissProt/SProt.py", line 296, in parse
    self._scanner.feed(handle, self._consumer)
  File "/sw/lib/python2.4/site-packages/Bio/SwissProt/SProt.py", line 338, in feed
    self._scan_record(uhandle, consumer)
  File "/sw/lib/python2.4/site-packages/Bio/SwissProt/SProt.py", line 343, in _scan_record
    fn(self, uhandle, consumer)
  File "/sw/lib/python2.4/site-packages/Bio/SwissProt/SProt.py", line 483, in _scan_sq
    self._scan_line('SQ', uhandle, consumer.sequence_header, exactly_one=1)
  File "/sw/lib/python2.4/site-packages/Bio/SwissProt/SProt.py", line 365, in _scan_line
    read_and_call(uhandle, event_fn, start=line_type)
  File "/sw/lib/python2.4/site-packages/Bio/ParserSupport.py", line 300, in read_and_call
    raise SyntaxError, errmsg
SyntaxError: Line does not start with 'SQ':
PE 3: Inferred from homology;
I have found that this is due to the presence in this file of lines starting
with "PE" (as in the example) or with "**". Once I eliminate these lines, there
is no problem. In my opinion the parser should deal more elegantly with cases
where a line in a record doesn't start with a recognized line code...
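
The workaround described above (removing the offending lines before parsing) could be sketched roughly as follows. This is only an illustration, not part of Biopython: the names SKIPPED_CODES, drop_line_types and filtered_handle are hypothetical, and the sketch assumes that every annotation line begins with a two-character line code and that it is safe to discard the "PE" and "**" lines. It is written for a current Python; under Python 2.4 the StringIO module would take the place of io.

```python
import io

# Line codes that the report found to trip the parser (assumption: dropping
# these lines loses nothing the caller needs).
SKIPPED_CODES = ("PE", "**")


def drop_line_types(lines, codes=SKIPPED_CODES):
    """Yield only the lines whose first two characters are not in `codes`.

    Sequence data lines start with spaces, so they are never dropped.
    """
    for line in lines:
        if line[:2] in codes:
            continue
        yield line


def filtered_handle(handle, codes=SKIPPED_CODES):
    """Return an in-memory, file-like object containing the filtered text."""
    return io.StringIO("".join(drop_line_types(handle, codes)))
```

In testing.py the result of filtered_handle(open(argv[1])) would then be passed to SProt.Iterator in place of the raw handle. Note that this reads the whole file into memory, which may not suit very large databases.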
Cheers,
Miguel
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.