[Biopython-dev] RE: blastpgp parsing buglet

Jeffrey Chang jchang at jeffchang.com
Sun Jun 8 22:11:35 EDT 2003


Thanks very much for the patch.  I've committed it to the CVS tree.

Jeff


On Friday, June 6, 2003, at 01:53  PM, Coleman, Michael wrote:

> Hi,
>
> It looks like a further change is required on this.  The problem is 
> that when blank lines following 'CONVERGED' (and perhaps in other 
> cases) are not consumed, _scan_alignments will see them and its tests 
> will not work properly.
>
> Mike
>
> --- NCBIStandalone.py~  2003-05-08 13:36:06.000000000 -0500
> +++ NCBIStandalone.py   2003-06-06 15:38:31.000000000 -0500
> @@ -247,6 +247,7 @@
>                  read_and_call_while(uhandle, consumer.noevent, blank=1)
>
>          attempt_read_and_call(uhandle, consumer.converged, start='CONVERGED')
> +       read_and_call_while(uhandle, consumer.noevent, blank=1)
>
>          consumer.end_descriptions()
>
>> -----Original Message-----
>> From: Coleman, Michael
>> Sent: Thursday, May 08, 2003 1:45 PM
>> To: biopython-dev at biopython.org
>> Subject: blastpgp parsing buglet
>>
>>
>> Parsing by NCBIStandalone.py fails for BLASTP 2.2.5 output.
>> This is the partial output that trips the problem:
>>
>> gi|23099742|ref|NP_693208.1| ornithine aminotransferase [Oceanob...   430   e-119
>> gi|16081241|ref|NP_393547.1| L-2, 4-diaminobutyrate:2-ketoglutar...   430   e-119
>>
>> Sequences not found previously or not previously below threshold:
>>
>> >gi|23466947|gb|ZP_00122533.1| hypothetical protein [Haemophilus somnus 129PT]
>>           Length = 432
>>
>>  Score =  591 bits (1524), Expect = e-167
>>  Identities = 191/420 (45%), Positives = 291/420 (69%), Gaps = 7/420 (1%)
>>
>> The code expects to see a 'CONVERGED' line, but none is given here.
>> One possible fix would be to also look for a line beginning
>> with '>', like so:
>>
>>             # Read the descriptions and the following blank lines.
>>             read_and_call_while(uhandle, consumer.noevent, blank=1)
>>             l = safe_peekline(uhandle)
>>             if l[:9] != 'CONVERGED' and l[:1] != '>':
>>                 read_and_call_until(uhandle, consumer.description, blank=1)
>>                 read_and_call_while(uhandle, consumer.noevent, blank=1)
>>
>> Mike
>>
>> Mike Coleman, Scientific Programmer, +1 816 926 4419
>> Stowers Institute for Biomedical Research
>> 1000 E. 50th St., Kansas City, MO  64110
>>
>
> _______________________________________________
> Biopython-dev mailing list
> Biopython-dev at biopython.org
> http://biopython.org/mailman/listinfo/biopython-dev
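
For anyone reading this thread later, the peek-then-consume pattern
discussed in both messages can be sketched in isolation. This is only an
illustration under stated assumptions, not the code in NCBIStandalone.py:
PeekableLines, skip_blank_lines, read_if_startswith,
should_read_descriptions and finish_descriptions are hypothetical
stand-ins written for this example, and the input strings are made up.

```python
import io


class PeekableLines:
    """Hypothetical stand-in for a handle that can peek one line ahead."""

    def __init__(self, text):
        self._handle = io.StringIO(text)
        self._saved = []

    def readline(self):
        # Return a previously peeked line first, if there is one.
        if self._saved:
            return self._saved.pop()
        return self._handle.readline()

    def peekline(self):
        # Read one line ahead and remember it for the next readline().
        if not self._saved:
            self._saved.append(self._handle.readline())
        return self._saved[-1]


def skip_blank_lines(handle):
    """Consume consecutive blank lines; return how many were skipped."""
    count = 0
    while True:
        line = handle.peekline()
        if line == "" or line.strip():  # end of input, or a non-blank line
            return count
        handle.readline()
        count += 1


def read_if_startswith(handle, prefix):
    """Consume the next line only if it starts with prefix."""
    if handle.peekline().startswith(prefix):
        handle.readline()
        return True
    return False


def should_read_descriptions(handle):
    """Mirror of the proposed test: no descriptions if the next line
    is a CONVERGED marker or an alignment header starting with '>'."""
    line = handle.peekline()
    return not (line.startswith("CONVERGED") or line.startswith(">"))


def finish_descriptions(handle):
    """Read an optional CONVERGED line, then consume trailing blank lines
    so that the next reader starts at the alignment header."""
    converged = read_if_startswith(handle, "CONVERGED")
    skip_blank_lines(handle)
    return converged
```

With input such as "CONVERGED!" followed by blank lines and a ">gi|..."
header, finish_descriptions leaves the handle positioned at the header,
which is the situation the one-line patch is meant to guarantee before
the alignment scanner runs.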



