[Biopython-dev] [Bug 1948] uniprot release 49/SProt.Record Parser Problem

bugzilla-daemon at portal.open-bio.org bugzilla-daemon at portal.open-bio.org
Mon Feb 13 03:21:38 EST 2006


http://bugzilla.open-bio.org/show_bug.cgi?id=1948





------- Comment #2 from gould at embl.de  2006-02-13 03:21 -------
(In reply to comment #0)
> I've been having problems with some of our applications that use Biopython
> scripts to retrieve a record from UniProt/Swiss-Prot given an accession
> number/ID. As far as I'm aware, the problem only occurred after release 49.0
> of the UniProt/Swiss-Prot database on 6th Feb. I see from the release notes
> that some changes were made to the annotation format, and I suspect this is
> why the Biopython scripts are no longer happy. I've checked to make sure I
> have the latest version of Biopython, but this has not remedied the problem.
> The problem would seem to lie with Biopython.
> Are any fixes going to be made available?
> An example of the error being thrown is below:
> 
> Python 2.4 (#1, Dec 10 2004, 11:49:12)
> [GCC 3.3.1 (SuSE Linux)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> from Bio.WWW import ExPASy
> >>> from Bio.SwissProt import SProt
> >>> from Bio import File
> >>> acc='Q14155'
> >>> results = ExPASy.get_sprot_raw(acc.strip()).read()
> >>>  sp_parser = SProt.RecordParser()
>   File "<stdin>", line 1
>     sp_parser = SProt.RecordParser()
>     ^
> SyntaxError: invalid syntax
> >>>  sp_parser = SProt.RecordParse
>   File "<stdin>", line 1
>     sp_parser = SProt.RecordParse
>     ^
> SyntaxError: invalid syntax
> >>> sp_parser = SProt.RecordParser()
> >>> sp_iterator = SProt.Iterator(File.StringHandle(results), sp_parser)
> >>> Record = sp_iterator.next()
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
>   File "/usr/local/lib/python2.4/site-packages/Bio/SwissProt/SProt.py", line 166, in next
>     return self._parser.parse(File.StringHandle(data))
>   File "/usr/local/lib/python2.4/site-packages/Bio/SwissProt/SProt.py", line 290, in parse
>     self._scanner.feed(handle, self._consumer)
>   File "/usr/local/lib/python2.4/site-packages/Bio/SwissProt/SProt.py", line 332, in feed
>     self._scan_record(uhandle, consumer)
>   File "/usr/local/lib/python2.4/site-packages/Bio/SwissProt/SProt.py", line 337, in _scan_record
>     fn(self, uhandle, consumer)
>   File "/usr/local/lib/python2.4/site-packages/Bio/SwissProt/SProt.py", line 369, in _scan_id
>     self._scan_line('ID', uhandle, consumer.identification, exactly_one=1)
>   File "/usr/local/lib/python2.4/site-packages/Bio/SwissProt/SProt.py", line 359, in _scan_line
>     read_and_call(uhandle, event_fn, start=line_type)
>   File "/usr/local/lib/python2.4/site-packages/Bio/ParserSupport.py", line 300, in read_and_call
>     raise SyntaxError, errmsg
> SyntaxError: Line does not start with 'ID':
> <HTML LANG="EN">
> 
> >>>
> 

(In reply to comment #1)
> I'm not familiar with this module, but I get a rather different result.
> 
> Could you attach the file that ExPASy.get_sprot_raw() returns to this bug?
> It looks like you got an HTML file back - I would guess this was an error page
> due to a temporary problem.  If you try again I think something else will
> happen...
> 
> When I just did this on Windows, I did get a valid looking file back, but
> BioPython still failed to parse it:
> 
> from Bio.WWW import ExPASy
> from Bio.SwissProt import SProt
> from Bio import File
> acc='Q14155'
> results = ExPASy.get_sprot_raw(acc.strip()).read()
> sp_parser = SProt.RecordParser()
> sp_iterator = SProt.Iterator(File.StringHandle(results), sp_parser)
> Record = sp_iterator.next()
> 
> It also failed at the iterator next step, but in a different way:
> Traceback (most recent call last):
>   File "c:\temp\bug1948.py", line 8, in -toplevel-
>     Record = sp_iterator.next()
>   File "C:\Python23\lib\site-packages\Bio\SwissProt\SProt.py", line 166, in next
>     return self._parser.parse(File.StringHandle(data))
>   File "C:\Python23\lib\site-packages\Bio\SwissProt\SProt.py", line 290, in parse
>     self._scanner.feed(handle, self._consumer)
>   File "C:\Python23\lib\site-packages\Bio\SwissProt\SProt.py", line 332, in feed
>     self._scan_record(uhandle, consumer)
>   File "C:\Python23\lib\site-packages\Bio\SwissProt\SProt.py", line 337, in _scan_record
>     fn(self, uhandle, consumer)
>   File "C:\Python23\lib\site-packages\Bio\SwissProt\SProt.py", line 378, in _scan_dt
>     self._scan_line('DT', uhandle, consumer.date, exactly_one=1)
>   File "C:\Python23\lib\site-packages\Bio\SwissProt\SProt.py", line 359, in _scan_line
>     read_and_call(uhandle, event_fn, start=line_type)
>   File "C:\Python23\lib\site-packages\Bio\ParserSupport.py", line 301, in read_and_call
>     method(line)
>   File "C:\Python23\lib\site-packages\Bio\SwissProt\SProt.py", line 551, in date
>     assert rel_index >= 0, \
> AssertionError: Could not find Rel. in DT line: DT   01-NOV-1997, integrated into UniProtKB/Swiss-Prot.
> 
> 
> 
> Looking at the file returned gave:
> 
> >>> print results
> ID   ARHG7_HUMAN    STANDARD;      PRT;   803 AA.
> AC   Q14155; Q6P9G3; Q6PII2; Q86W63; Q8N3M1;
> DT   01-NOV-1997, integrated into UniProtKB/Swiss-Prot.
> DT   19-JUL-2004, sequence version 2.
> DT   07-FEB-2006, entry version 55.
> DE   Rho guanine nucleotide exchange factor 7 (PAK-interacting exchange
> DE   factor beta) (Beta-Pix) (COOL-1) (p85).
> ...
> //
> 
> Reading Bio/SwissProt/SProt.py, class _RecordConsumer, method date(), none of
> those three DT lines looks like what the code is expecting.
> 


I'm not sure I follow what you are saying. I don't have a problem reading the
file, and I get the same result as you did. The problem is parsing the results
(that is where the error above occurs).
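
To separate the two failures quoted above (an HTML page coming back instead of
a record, versus a real record whose DT lines the parser rejects), a rough
sketch of a check is below. It does not use Biopython and is written for
current Python rather than 2.4. The helper names looks_like_swissprot and
parse_dt_line are hypothetical, and the DT pattern is inferred only from the
three DT lines quoted above, not from the release notes.

```python
import re

# Hypothetical helpers for illustration only; these are not part of Biopython.


def looks_like_swissprot(text):
    """Return True if the first non-blank line begins with the 'ID' line code."""
    for line in text.splitlines():
        if line.strip():
            return line.startswith("ID   ")
    return False


# Assumed shape of a new-style DT line, based only on the quoted examples:
#   DT   01-NOV-1997, integrated into UniProtKB/Swiss-Prot.
NEW_DT = re.compile(r"^DT   (\d{2}-[A-Z]{3}-\d{4}), (.+?)\.$")


def parse_dt_line(line):
    """Return (date, description) for a new-style DT line, otherwise None."""
    match = NEW_DT.match(line.rstrip())
    if match is None:
        return None
    return match.group(1), match.group(2)


if __name__ == "__main__":
    sample = (
        "ID   ARHG7_HUMAN    STANDARD;      PRT;   803 AA.\n"
        "DT   01-NOV-1997, integrated into UniProtKB/Swiss-Prot.\n"
        "DT   19-JUL-2004, sequence version 2.\n"
        "DT   07-FEB-2006, entry version 55.\n"
    )
    print(looks_like_swissprot(sample))
    for row in sample.splitlines():
        if row.startswith("DT"):
            print(parse_dt_line(row))
```

If looks_like_swissprot() is false for the downloaded text, the problem is the
download (for example an HTML error page). If it is true but the parser still
fails, the DT lines are the more likely cause.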





