[BioPython] Biopython BLAST parser error
m at pavis.biodec.com
Mon May 26 17:06:08 EDT 2003
Hello
I've got a problem parsing the following output:
http://pavis.biodec.com/~m/1be3_C.blast.bz2 (warning *big file*)
It was produced by running
blastpgp -d nr -e 1e-9 -b 10000 -v 10000 -j3
on this protein (chain C of 1be3.pdb):
MTNIRKSHPLMKIVNNAFIDLPAPSNISSWWNFGSLLGICLILQILTGLFLAMHYTSDTTTA
FSSVTHICRDVNYGWIIRYMHANGASMFFICLYMHVGRGLYYGSYTFLETWNIGVILLLTVM
ATAFMGYVLPWGQMSFWGATVITNLLSAIPYIGTNLVEWIWGGFSVDKATLTRFFAFHFILP
FIIMAIAMVHLLFLHETGSNNPTGISSDVDKIPFHPYYTIKDILGALLLILALMLLVLFAPD
LLGDPDNYTPANPLNTPPHIKPEWYFLFAYAILRSIPNKLGGVLALAFSILILALIPLLHTS
KQRSMMFRPLSQCLFWALVADLLTLTWIGGQPVEHPYITIGQLASVLYFLLILVLMPTAGTI
ENKLLKW
The Blast version that I am running is
BLASTP 2.2.4 [Aug-26-2002]
but I get the same behaviour with BLASTP 2.1.3 [Apr-1-2001].
I do not know the nr version number, but it is the database
with 705,002 sequences; 222,117,092 total letters (sorry for not
having better information).
As I see it, the problem lies in parsing lines 298999 and 588182
of the output, where this phrase appears:
``Sequences not found previously or not previously below threshold:''
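To check which lines of a BLAST report carry that phrase, a short scan like the one below could be used. This is only a sketch in current Python 3 (not the Python 2.2 mentioned in this post); the function names are hypothetical and are not part of BioPython.

```python
import bz2

# Phrase quoted in the post; PSI-BLAST prints it in iterated rounds.
MARKER = "Sequences not found previously or not previously below threshold:"


def find_marker_lines(lines, marker=MARKER):
    """Return the 1-based numbers of the lines that contain `marker`."""
    return [n for n, line in enumerate(lines, start=1) if marker in line]


def find_marker_lines_in_file(path, marker=MARKER):
    """Same scan for a file on disk; .bz2 files are decompressed on the fly."""
    opener = bz2.open if path.endswith(".bz2") else open
    with opener(path, "rt") as handle:
        return find_marker_lines(handle, marker)
```

Calling find_marker_lines_in_file() on a local copy of the report should list the positions of the phrase, which can then be compared with the line numbers given above.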
I am testing BioPython 1.10a under Python 2.2.2 on GNU/Linux
(Debian unstable).
P.S.: the traceback of the error, with irrelevant data left out, is:
    blast_parse=parser.parse(UndoHandle(blastfile))
  File "/usr/lib/python2.2/site-packages/Bio/Blast/NCBIStandalone.py", line 611, in parse
    self._scanner.feed(handle, self._consumer)
  File "/usr/lib/python2.2/site-packages/Bio/Blast/NCBIStandalone.py", line 84, in feed
    self._scan_rounds(uhandle, consumer)
  File "/usr/lib/python2.2/site-packages/Bio/Blast/NCBIStandalone.py", line 139, in _scan_rounds
    self._scan_descriptions(uhandle, consumer)
  File "/usr/lib/python2.2/site-packages/Bio/Blast/NCBIStandalone.py", line 245, in _scan_descriptions
    read_and_call_until(uhandle, consumer.description, blank=1)
  File "/usr/lib/python2.2/site-packages/Bio/ParserSupport.py", line 371, in read_and_call_until
    method(line)
  File "/usr/lib/python2.2/site-packages/Bio/Blast/NCBIStandalone.py", line 677, in description
    dh = self._parse(line)
  File "/usr/lib/python2.2/site-packages/Bio/Blast/NCBIStandalone.py", line 734, in _parse
    dh.score = _safe_int(dh.score)
  File "/usr/lib/python2.2/site-packages/Bio/Blast/NCBIStandalone.py", line 1633, in _safe_int
    return long(float(str))
ValueError: invalid literal for float(): b
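The traceback suggests that a line which is not a hit description reached the score conversion. One possible guard, sketched below in current Python 3, is to check that a line really ends in two numeric columns before treating it as a description. This is an illustration only, not the BioPython code: looks_like_description and _to_float are hypothetical names, and the assumption that a description line ends with a score followed by an E-value is mine.

```python
def _to_float(token):
    """Convert a BLAST numeric column to float; raises ValueError otherwise."""
    # BLAST may print E-values such as "e-100" without a leading digit.
    if token.lower().startswith("e"):
        token = "1" + token
    return float(token)


def looks_like_description(line):
    """Return True if the last two whitespace-separated columns are numeric.

    Assumption: a hit description ends with a score and an E-value.
    """
    cols = line.split()
    if len(cols) < 3:
        return False
    try:
        _to_float(cols[-2])
        _to_float(cols[-1])
    except ValueError:
        return False
    return True
```

With such a check, the "Sequences not found previously ..." line would be rejected (its last two words are not numbers) instead of being passed on to the integer conversion.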
--
finelli