[BioPython] Changes in NCBI BLAST output format !!??
aurelie.bornot at free.fr
Tue Jul 19 09:08:20 EDT 2005
Hi!
I have the same problem as Jessica Leigh (see her message on this list):
when I try to parse a BLAST file with a script that worked until the beginning
of July, I get this syntax error:
Line does not contain 'Database':
(Blank line)
It seems that NCBI has changed the output format:
- "Old" BLAST file:
<p>
<b>Query=</b> sequence
(569 letters)
<p>
<b>Database:</b> All GenBank+EMBL+DDBJ+PDB sequences (but no EST, STS,
GSS,environmental samples or phase 0, 1 or 2 HTGS sequences)
3,047,402 sequences; 13,743,552,639 total letters
<p> <p>If you have any problems or questions with the results...
- New BLAST file:
<b>Query=</b> sequence
(540 letters)
<b>Database:</b> All GenBank+EMBL+DDBJ+PDB sequences (but no EST, STS,
GSS,environmental samples or phase 0, 1 or 2 HTGS sequences)
3,312,348 sequences; 14,588,094,788 total letters
<p> <p>If you have any problems or questions with the...
The <p> tags before Query and Database are missing!
And in Python24\Lib\site-packages\Bio\Blast\NCBIWWW.py, it seems that the
code that looks for "Database" relies on that <p>:
def _scan_database_info(self, uhandle, consumer):
    attempt_read_and_call(uhandle, consumer.noevent, start='<p>')
    read_and_call(uhandle, consumer.database_info, contains='Database')
    ....
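To illustrate what I think is going on, here is a small stand-alone sketch. It is not the Biopython code: the function names and the line lists are made up for illustration, and I am assuming that a blank line now sits where the <p> used to be, which would match the "(Blank line)" in the error message. The strict version copies the two-step logic above (optionally skip one '<p>' line, then require 'Database'); the tolerant version skips any blank or '<p>' lines first.

```python
def _fail(lines, pos):
    # Raise the same kind of message the parser reports.
    found = lines[pos] if pos < len(lines) else ''
    raise ValueError("Line does not contain 'Database':\n%s" % found)


def scan_database_strict(lines):
    """Hypothetical mimic: skip at most one '<p>' line, then require 'Database'."""
    pos = 0
    if pos < len(lines) and lines[pos].startswith('<p>'):
        pos += 1
    if pos >= len(lines) or 'Database' not in lines[pos]:
        _fail(lines, pos)
    return lines[pos]


def scan_database_tolerant(lines):
    """Hypothetical variant: skip any leading blank or '<p>' lines first."""
    pos = 0
    while pos < len(lines) and (not lines[pos].strip()
                                or lines[pos].startswith('<p>')):
        pos += 1
    if pos >= len(lines) or 'Database' not in lines[pos]:
        _fail(lines, pos)
    return lines[pos]


# Simplified stand-ins for the two output styles (the blank line is my assumption).
old_style = ['<p>', '<b>Database:</b> All GenBank+EMBL+DDBJ+PDB sequences']
new_style = ['', '<b>Database:</b> All GenBank+EMBL+DDBJ+PDB sequences']
```

With these lists, the strict version accepts old_style but raises on new_style, while the tolerant version accepts both. If the real cause is the same, a similar change in _scan_database_info might be enough, but I have not tested that against the actual parser.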
I'm not sure I fully understand what is happening,
but could someone help?
I don't know what to do. Is there an easy way to fix this?
Thanks a lot!
Aurelie
--------------
Aurelie BORNOT
MNHN
Paris