[Biopython-dev] RE: blastpgp parsing buglet
Coleman, Michael
MKC at Stowers-Institute.org
Fri Jun 6 16:53:46 EDT 2003
Hi,
It looks like one more change is needed here. When the blank lines following 'CONVERGED' (and perhaps in other cases) are not consumed, _scan_alignments sees them and its tests do not work properly.
Mike
--- NCBIStandalone.py~ 2003-05-08 13:36:06.000000000 -0500
+++ NCBIStandalone.py 2003-06-06 15:38:31.000000000 -0500
@@ -247,6 +247,7 @@
read_and_call_while(uhandle, consumer.noevent, blank=1)
attempt_read_and_call(uhandle, consumer.converged, start='CONVERGED')
+ read_and_call_while(uhandle, consumer.noevent, blank=1)
consumer.end_descriptions()
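A minimal, self-contained sketch of why the extra blank-line pass matters. It does not use the Biopython scanner: `PeekableLines`, `skip_blank_lines`, and `read_converged` are hypothetical stand-ins written for illustration, and the sample report lines are made up apart from the '>' header taken from the quoted output below.

```python
class PeekableLines:
    """Hypothetical stand-in for a handle that supports peeking."""

    def __init__(self, lines):
        self._lines = list(lines)
        self._pos = 0

    def peekline(self):
        # Return the next line without consuming it; '' signals end of input.
        return self._lines[self._pos] if self._pos < len(self._lines) else ""

    def readline(self):
        line = self.peekline()
        if line:
            self._pos += 1
        return line


def skip_blank_lines(handle):
    """Consume consecutive blank lines and return how many were skipped."""
    skipped = 0
    while handle.peekline() and not handle.peekline().strip():
        handle.readline()
        skipped += 1
    return skipped


def read_converged(handle, consume_trailing_blanks):
    """Read an optional CONVERGED line and return the line the next stage sees."""
    skip_blank_lines(handle)
    if handle.peekline().startswith("CONVERGED"):
        handle.readline()
    if consume_trailing_blanks:
        # This corresponds to the line added by the patch above.
        skip_blank_lines(handle)
    return handle.peekline()


# Illustrative input: blank, CONVERGED, blank, then an alignment header.
report = [
    "\n",
    "CONVERGED\n",
    "\n",
    ">gi|23466947|gb|ZP_00122533.1| hypothetical protein\n",
]

next_without_fix = read_converged(PeekableLines(report), consume_trailing_blanks=False)
next_with_fix = read_converged(PeekableLines(report), consume_trailing_blanks=True)
```

Without the extra pass, the next stage is handed a blank line; with it, the next stage starts on the '>' header, which is presumably what the tests in _scan_alignments rely on.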
> -----Original Message-----
> From: Coleman, Michael
> Sent: Thursday, May 08, 2003 1:45 PM
> To: biopython-dev at biopython.org
> Subject: blastpgp parsing buglet
>
>
> Parsing by NCBIStandalone.py fails for BLASTP 2.2.5 output.
> This is the partial output that trips the problem:
>
> gi|23099742|ref|NP_693208.1| ornithine aminotransferase [Oceanob... 430 e-119
> gi|16081241|ref|NP_393547.1| L-2, 4-diaminobutyrate:2-ketoglutar... 430 e-119
>
> Sequences not found previously or not previously below threshold:
>
> >gi|23466947|gb|ZP_00122533.1| hypothetical protein [Haemophilus somnus 129PT]
> Length = 432
>
> Score = 591 bits (1524), Expect = e-167
> Identities = 191/420 (45%), Positives = 291/420 (69%), Gaps = 7/420 (1%)
>
> The code expects to see a 'CONVERGED' but none is given here.
> One possible fix would be to also look for a line beginning
> with '>', like so
>
> # Read the descriptions and the following blank lines.
> read_and_call_while(uhandle, consumer.noevent, blank=1)
> l = safe_peekline(uhandle)
> if l[:9] != 'CONVERGED' and l[:1] != '>':
>     read_and_call_until(uhandle, consumer.description, blank=1)
>     read_and_call_while(uhandle, consumer.noevent, blank=1)
>
> Mike
>
> Mike Coleman, Scientific Programmer, +1 816 926 4419
> Stowers Institute for Biomedical Research
> 1000 E. 50th St., Kansas City, MO 64110
>
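The peek-before-parse guard proposed in the quoted message can be sketched on a plain list of lines. This is an illustration under stated assumptions, not the NCBIStandalone code: `read_descriptions` is a hypothetical helper, and the input reuses lines from the partial output quoted above.

```python
def read_descriptions(lines, pos):
    """Return (descriptions, new_pos), skipping the section when it is absent.

    Hypothetical helper operating on a list of lines, not the Biopython scanner.
    """
    # Skip blank lines before the section.
    while pos < len(lines) and not lines[pos].strip():
        pos += 1
    descriptions = []
    # Peek: a CONVERGED line or an alignment header means no descriptions follow.
    if pos < len(lines) and not lines[pos].startswith(("CONVERGED", ">")):
        while pos < len(lines) and lines[pos].strip():
            descriptions.append(lines[pos])
            pos += 1
        # Consume the blank lines that terminate the section.
        while pos < len(lines) and not lines[pos].strip():
            pos += 1
    return descriptions, pos


# Case from the report: the header is followed directly by an alignment.
no_hits = [
    "Sequences not found previously or not previously below threshold:",
    "",
    ">gi|23466947|gb|ZP_00122533.1| hypothetical protein",
    "Length = 432",
]
descs_empty, pos_empty = read_descriptions(no_hits, 1)

# Ordinary case: one description line, then a blank, then an alignment.
with_hits = [
    "",
    "gi|23099742|ref|NP_693208.1| ornithine aminotransferase [Oceanob... 430 e-119",
    "",
    ">gi|23466947|gb|ZP_00122533.1| hypothetical protein",
]
descs_found, pos_found = read_descriptions(with_hits, 0)
```

In the first case the guard returns no descriptions and leaves the position on the '>' line; in the second it collects the single description and also stops on the '>' line, so the alignment stage starts from the same place either way.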