[Biopython-dev] [Bug 1948] uniprot release 49/SProt.Record Parser Problem
bugzilla-daemon at portal.open-bio.org
Mon Feb 13 03:21:38 EST 2006
http://bugzilla.open-bio.org/show_bug.cgi?id=1948
------- Comment #2 from gould at embl.de 2006-02-13 03:21 -------
(In reply to comment #0)
> I've been having problems with some of our applications that use Biopython
> scripts to retrieve a record from UniProt/Swiss-Prot given an accession
> number or ID. As far as I'm aware, the problem only started after release 49.0
> of the UniProt/Swiss-Prot database on 6th Feb. I see from the release notes
> that some changes were made to the annotation format, and I suspect this is
> why the Biopython scripts no longer work. I've checked that I have the
> latest version of Biopython, but this has not remedied the problem, so the
> problem would seem to lie with Biopython.
> Are any fixes going to be made available?
> An example of the error being thrown is below:
>
> Python 2.4 (#1, Dec 10 2004, 11:49:12)
> [GCC 3.3.1 (SuSE Linux)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> from Bio.WWW import ExPASy
> >>> from Bio.SwissProt import SProt
> >>> from Bio import File
> >>> acc='Q14155'
> >>> results = ExPASy.get_sprot_raw(acc.strip()).read()
> >>> sp_parser = SProt.RecordParser()
> File "<stdin>", line 1
> sp_parser = SProt.RecordParser()
> ^
> SyntaxError: invalid syntax
> >>> sp_parser = SProt.RecordParse
> File "<stdin>", line 1
> sp_parser = SProt.RecordParse
> ^
> SyntaxError: invalid syntax
> >>> sp_parser = SProt.RecordParser()
> >>> sp_iterator = SProt.Iterator(File.StringHandle(results), sp_parser)
> >>> Record = sp_iterator.next()
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
>   File "/usr/local/lib/python2.4/site-packages/Bio/SwissProt/SProt.py", line 166, in next
>     return self._parser.parse(File.StringHandle(data))
>   File "/usr/local/lib/python2.4/site-packages/Bio/SwissProt/SProt.py", line 290, in parse
>     self._scanner.feed(handle, self._consumer)
>   File "/usr/local/lib/python2.4/site-packages/Bio/SwissProt/SProt.py", line 332, in feed
>     self._scan_record(uhandle, consumer)
>   File "/usr/local/lib/python2.4/site-packages/Bio/SwissProt/SProt.py", line 337, in _scan_record
>     fn(self, uhandle, consumer)
>   File "/usr/local/lib/python2.4/site-packages/Bio/SwissProt/SProt.py", line 369, in _scan_id
>     self._scan_line('ID', uhandle, consumer.identification, exactly_one=1)
>   File "/usr/local/lib/python2.4/site-packages/Bio/SwissProt/SProt.py", line 359, in _scan_line
>     read_and_call(uhandle, event_fn, start=line_type)
>   File "/usr/local/lib/python2.4/site-packages/Bio/ParserSupport.py", line 300, in read_and_call
>     raise SyntaxError, errmsg
> SyntaxError: Line does not start with 'ID':
> <HTML LANG="EN">
>
> >>>
>
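The last line of that traceback, <HTML LANG="EN">, suggests the server sent
back an HTML page rather than a flat-file record on that occasion. A minimal
sketch of a guard that could run before handing the text to the parser is
below. The function name looks_like_swissprot_record is made up for
illustration and is not part of Biopython; the sample strings are taken from
the output quoted in this thread.

```python
def looks_like_swissprot_record(text):
    """Return True if the first non-blank line begins with the 'ID' line code.

    Hypothetical helper, not a Biopython function.
    """
    for line in text.splitlines():
        if line.strip():
            return line.startswith("ID ")
    return False  # empty response


# First line of the failed download, as shown in the traceback.
html_page = '<HTML LANG="EN">\n'
# First lines of a successful download, as quoted in this thread.
flat_file = (
    "ID ARHG7_HUMAN STANDARD; PRT; 803 AA.\n"
    "AC Q14155; Q6P9G3; Q6PII2; Q86W63; Q8N3M1;\n"
)

for name, text in (("html_page", html_page), ("flat_file", flat_file)):
    print("%s: %s" % (name, looks_like_swissprot_record(text)))
```

A check like this only separates "got an error page" from "got a record"; it
does nothing about the DT parsing failure.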
(In reply to comment #1)
> I'm not familiar with this module, but I get a rather different result.
>
> Could you attach the file that ExPASy.get_sprot_raw() returns to this bug?
> It looks like you got an HTML file back - I would guess this was an error page
> due to a temporary problem. If you try again I think something else will
> happen...
>
> When I just did this on Windows, I did get a valid looking file back, but
> BioPython still failed to parse it:
>
> from Bio.WWW import ExPASy
> from Bio.SwissProt import SProt
> from Bio import File
> acc='Q14155'
> results = ExPASy.get_sprot_raw(acc.strip()).read()
> sp_parser = SProt.RecordParser()
> sp_iterator = SProt.Iterator(File.StringHandle(results), sp_parser)
> Record = sp_iterator.next()
>
> It also failed at the iterator next step, but in a different way:
> Traceback (most recent call last):
>   File "c:\temp\bug1948.py", line 8, in -toplevel-
>     Record = sp_iterator.next()
>   File "C:\Python23\lib\site-packages\Bio\SwissProt\SProt.py", line 166, in next
>     return self._parser.parse(File.StringHandle(data))
>   File "C:\Python23\lib\site-packages\Bio\SwissProt\SProt.py", line 290, in parse
>     self._scanner.feed(handle, self._consumer)
>   File "C:\Python23\lib\site-packages\Bio\SwissProt\SProt.py", line 332, in feed
>     self._scan_record(uhandle, consumer)
>   File "C:\Python23\lib\site-packages\Bio\SwissProt\SProt.py", line 337, in _scan_record
>     fn(self, uhandle, consumer)
>   File "C:\Python23\lib\site-packages\Bio\SwissProt\SProt.py", line 378, in _scan_dt
>     self._scan_line('DT', uhandle, consumer.date, exactly_one=1)
>   File "C:\Python23\lib\site-packages\Bio\SwissProt\SProt.py", line 359, in _scan_line
>     read_and_call(uhandle, event_fn, start=line_type)
>   File "C:\Python23\lib\site-packages\Bio\ParserSupport.py", line 301, in read_and_call
>     method(line)
>   File "C:\Python23\lib\site-packages\Bio\SwissProt\SProt.py", line 551, in date
>     assert rel_index >= 0, \
> AssertionError: Could not find Rel. in DT line: DT 01-NOV-1997, integrated into UniProtKB/Swiss-Prot.
>
>
>
> Looking at the file returned gave:
>
> >>> print results
> ID ARHG7_HUMAN STANDARD; PRT; 803 AA.
> AC Q14155; Q6P9G3; Q6PII2; Q86W63; Q8N3M1;
> DT 01-NOV-1997, integrated into UniProtKB/Swiss-Prot.
> DT 19-JUL-2004, sequence version 2.
> DT 07-FEB-2006, entry version 55.
> DE Rho guanine nucleotide exchange factor 7 (PAK-interacting exchange
> DE factor beta) (Beta-Pix) (COOL-1) (p85).
> ...
> //
>
> Reading the date() method of class _RecordConsumer in Bio/SwissProt/SProt.py,
> none of those three DT lines looks like what the code is expecting.
>
I'm not sure I follow what you are saying. I don't have a problem reading the
file, and I get the same result as you did. The problem is parsing the results
(that is where the error above occurs).
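
For what it's worth, the DT lines in the record quoted above no longer carry
the "Rel." token that the assertion looks for. Below is a rough sketch of a DT
line parser that tolerates both layouts. It is not the Biopython code and not
an official fix: parse_dt_line, OLD_DT and NEW_DT are hypothetical names, and
the old "(Rel. NN, Created)" layout is assumed from pre-49.0 entries rather
than taken from this thread.

```python
import re

# Old layout (assumption, not shown in this thread):
#   DT   01-NOV-1997 (Rel. 35, Created)
OLD_DT = re.compile(r"^DT\s+(\d{2}-[A-Z]{3}-\d{4})\s+\(Rel\.\s*(\d+),\s*(.+)\)\s*$")

# New layout, as in the record quoted above:
#   DT   01-NOV-1997, integrated into UniProtKB/Swiss-Prot.
NEW_DT = re.compile(r"^DT\s+(\d{2}-[A-Z]{3}-\d{4}),\s*(.+?)\.?\s*$")


def parse_dt_line(line):
    """Return (date, release, comment); release is None for the new layout."""
    match = OLD_DT.match(line)
    if match:
        date, release, comment = match.groups()
        return date, int(release), comment
    match = NEW_DT.match(line)
    if match:
        date, comment = match.groups()
        return date, None, comment
    raise ValueError("Unrecognised DT line: %r" % line)


# The three DT lines from the Q14155 record quoted above.
for dt_line in (
    "DT 01-NOV-1997, integrated into UniProtKB/Swiss-Prot.",
    "DT 19-JUL-2004, sequence version 2.",
    "DT 07-FEB-2006, entry version 55.",
):
    print(parse_dt_line(dt_line))
```

Trying the old pattern first keeps existing files working, and returning None
for the release leaves it to the caller to decide how to store the new
"sequence version" and "entry version" values.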
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.