[Biopython-dev] blastpgp parsing buglet
Coleman, Michael
MKC at Stowers-Institute.org
Thu May 8 14:45:27 EDT 2003
NCBIStandalone.py fails to parse BLASTP 2.2.5 output. This is the portion of the output that triggers the problem:
gi|23099742|ref|NP_693208.1| ornithine aminotransferase [Oceanob... 430 e-119
gi|16081241|ref|NP_393547.1| L-2, 4-diaminobutyrate:2-ketoglutar... 430 e-119
Sequences not found previously or not previously below threshold:
>gi|23466947|gb|ZP_00122533.1| hypothetical protein [Haemophilus somnus 129PT]
Length = 432
Score = 591 bits (1524), Expect = e-167
Identities = 191/420 (45%), Positives = 291/420 (69%), Gaps = 7/420 (1%)
The code expects to see a 'CONVERGED' line at this point, but none is given here; the alignments start right away. One possible fix would be to also look for a line beginning with '>', like so:
# Read the descriptions and the following blank lines.
read_and_call_while(uhandle, consumer.noevent, blank=1)
l = safe_peekline(uhandle)
if l[:9] != 'CONVERGED' and l[:1] != '>':
    read_and_call_until(uhandle, consumer.description, blank=1)
    read_and_call_while(uhandle, consumer.noevent, blank=1)
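To make the intent of the check concrete, here is a small standalone sketch of the same peek-then-decide logic. It does not use Biopython at all: `starts_new_section` and `split_descriptions` are hypothetical names made up for illustration, and the input is a plain list of lines rather than an undo handle.

```python
def starts_new_section(line):
    """Return True if `line` means there are no descriptions to read.

    Hypothetical helper: 'CONVERGED' is the marker the parser already
    checks for; '>' is the start of an alignment header, as in the
    output quoted above.
    """
    return line.startswith("CONVERGED") or line.startswith(">")


def split_descriptions(lines):
    """Split `lines` into (descriptions, rest).

    Mirrors the snippet above: skip blank lines, peek at the next line,
    and only read descriptions (plus trailing blank lines) when that
    line is neither 'CONVERGED' nor an alignment header.
    """
    lines = list(lines)
    i = 0
    # Skip leading blank lines.
    while i < len(lines) and not lines[i].strip():
        i += 1
    descriptions = []
    if i < len(lines) and not starts_new_section(lines[i]):
        # Read description lines up to the next blank line.
        while i < len(lines) and lines[i].strip():
            descriptions.append(lines[i])
            i += 1
        # Skip the blank lines that follow the descriptions.
        while i < len(lines) and not lines[i].strip():
            i += 1
    return descriptions, lines[i:]
```

With the lines that follow "Sequences not found previously..." in the output above (a blank line, then the '>' header), this returns an empty description list and leaves the alignment header unread, which is the behaviour the proposed change is after.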
Mike
Mike Coleman, Scientific Programmer, +1 816 926 4419
Stowers Institute for Biomedical Research
1000 E. 50th St., Kansas City, MO 64110