Bioperl: Yet another question about parsing blast results.

Heil, Jeremy Jeremy.Heil@celera.com
Fri, 17 Dec 1999 15:25:25 -0500


I have run into the same problem under the conditions you describe.  I
*think* the following kludge fixes it; let me know if it works for you...

Around line 1900 of Blast.pm, in _parse_header:

...

$data =~ /WARNING: (.+?)$Newline$Newline/so and $self->warn("$1") if $self->strict;
$data =~ /FATAL: (.+?)$Newline$Newline/so and $self->throw("FATAL BLAST ERROR = $1");
# No longer throwing exception when no hits were found. Still reporting it.
$data =~ /No hits? found/i and $self->warn("No hits were found.") if $self->strict;

#****FIX : Problem : exception thrown if the last report is a 'no hitter'
# Bail out of header parsing when the report has no hits and no hit table.
if ( $data =~ /No hits found/ && $data !~ /Sequences producing significant/ ) {
	return 0;
}
#END FIX

# If this is the first Blast, the program, version, and database info
# pertain to it. Otherwise, they are for the previous report and have
# already been parsed out.
# Data is stored in the static Blast object. Data for subsequent reports

..     
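
To try the check outside of Blast.pm, here is a minimal standalone sketch
of the same test. The sub name is_no_hit_report and the two report
fragments are made up for illustration; they are not part of Bioperl, and
real BLAST output may differ in detail.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Blast.pm): returns 1 when a report
# chunk says "No hits found" and contains no "Sequences producing
# significant" hit table, 0 otherwise.
sub is_no_hit_report {
    my ($data) = @_;
    return ( $data =~ /No hits found/
          && $data !~ /Sequences producing significant/ ) ? 1 : 0;
}

# Made-up report fragments, for illustration only.
my $hit_report    = "Query= seq1\n\nSequences producing significant alignments:\n";
my $no_hit_report = "Query= seq2\n\n ***** No hits found ******\n";

# A caller could use the check to skip a 'no hit' report instead of
# letting the parser throw on it.
for my $report ( $hit_report, $no_hit_report ) {
    print is_no_hit_report($report) ? "skip\n" : "parse\n";
}
```

This only isolates the pattern test. Whether returning 0 from
_parse_header is handled cleanly by the rest of the parser is something I
have not verified beyond my own files.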

Take Care, 

	Jeremy Heil  
	SNP Discovery 
	Celera Genomics

>Oh, and there's one more thing.
>The program has no trouble skipping over 'no hit' reports at the beginning or
>middle. The problem is if the last report(s) in the file are 'no hit' reports.
>That's when it croaks and throws an exception, and also ends up missing the
>previous 'hit' that was successful.
>
>
>carl

=========== Bioperl Project Mailing List Message Footer =======
Project URL: http://bio.perl.org/
For info about how to (un)subscribe, where messages are archived, etc:
http://www.techfak.uni-bielefeld.de/bcd/Perl/Bio/vsns-bcd-perl.html
====================================================================