[Bioperl-l] Bio::Tools::Blast::HTML questions

Zhao, David [PRI] DZhao1@prius.jnj.com
Mon, 18 Sep 2000 13:20:40 -0400


First of all, thank you all very much for your reply.
I've looked at the code: it was "\w+" in my HTML.pm, but I didn't realize I
had 0.6.0 instead of 0.6.1.
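
To illustrate the difference, here is a minimal sketch rather than the actual HTML.pm code. The names $word_old, $word_new and accession() are made up for this example, the pattern is a cut-down version of the GenBank line regexp, and the "+" quantifier on the new class is an assumption. It only shows why a bare "\w+" cannot match an accession with a version suffix, while a class that includes "." can:

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the $Word pattern in HTML.pm; the real
# definitions may differ in detail.
my $word_old = '\w+';       # what my 0.6.0 copy contained
my $word_new = '[\w_.]+';   # assumed post-fix form, with '.' allowed

# Return the accession captured from a summary-table line, or undef
# if the (simplified) GenBank-format pattern does not match.
sub accession {
    my ($line, $word) = @_;
    return $line =~ /^ ?(?:gb|emb|dbj)\|($word)\|/ ? $1 : undef;
}

my $line = 'dbj|AU027194.1|AU027194 Rattus norvegicus, OTSUKA clone, OT17.21...    52  4e-06';

for my $case ([old => $word_old], [new => $word_new]) {
    my ($label, $word) = @$case;
    my $acc = accession($line, $word);
    printf "%s: %s\n", $label, defined $acc ? $acc : 'no match';
}
```

With "\w+" the match stops at the "." and never reaches the next "|", so the line is left unmarked; with "." in the class the whole "AU027194.1" is captured.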
I'll try to write a better bug report next time.
Hope this won't affect my future bug reports.
Thanks again
David

 
> -----Original Message-----
> From:	Andrew Dalke [SMTP:dalke@acm.org]
> Sent:	Friday, September 15, 2000 10:08 PM
> To:	Zhao, David  [PRI]
> Cc:	'Bio-Perl'
> Subject:	Re: [Bioperl-l] Bio::Tools::Blast::HTML questions
> 
> Zhao, David [PRI] <DZhao1@prius.jnj.com> said:
> > It seems that nobody has had the same problem, or you guys think
> > this is just not significant enough to be answered.
> 
> Actually, there could be several other reasons.  For example, almost
> the only time I get HTML formatted mail is from junk/spam mail, so
> I have an almost instinctual urge to delete those mails when I see
> them.  Since your message in no way needed the extra abilities of
> HTML, you should have used ASCII instead.
> 
> Second, it was hard to figure out what the problem was from your
> description alone.  A more helpful report might have been
> 
> ] Hi there,
> ]   It seems the HTML module doesn't recognize the genbank format in
> ] the summary table.  The lines look like:
> ]
> ] dbj|AU027194.1|AU027194 Rattus norvegicus, OTSUKA clone, OT17.21...    52  4e-06
> ]
> ] When I remove the ".1" from "AU027194.1", leaving "AU027194", it works.
> ] Here's the full table:
> ] ...
> 
> It isn't much harder to write than your original email, but it is much
> easier for someone else to understand.
> 
> It would also be helpful to know what "doesn't recognize" means.  Does
> it stop with an error?  Is the line just ignored?  Is the rest of the
> input file ignored?
> 
> The better a bug report is, the more likely it will be answered.  But
> it takes effort and practice to learn how to write a good report.
> 
> Third, given that it's been over 24 hours, you could have messed around
> with the code yourself.  A good bug report can almost guide you to where
> to look in the code.
> 
> In this case, that's the code for parsing the summary table, most likely
> related to parsing genbank lines.  From a quick perusal, the problem would
> likely be in the section:
> 
> ## REGEXPS FOR SUMMARY TABLE LINES AT TOP OF REPORT (a.k.a. 'descriptions')
> ## (table of sequence id, description, score, P/Expect value, n)
> ##
> ## Not using bold face to highlight the sequence id's since this can throw
> ## off formatting of the line when the IDs are different lengths. This leads
> ## to the scores and P/Expect values not lining up properly.
> 
>     ### NCBI-specific markups for description lines:
> 
>   # GenBank/EMBL, DDBJ hits (GenBank Format):
>   s@^ ?(gb|emb|dbj)\|($Word)(\|$Word)?($Descrip)($Int +)($Signif)(.*)$@$1:<a href="$DbUrl{'gb_n'}$2">$2$3</a>$4$5<A href="\#$2_A">$6</a>$7<a name="$2_H"></a>@o;
> 
> It wasn't very hard to find this code.
> 
> If you look at the definition of "Word" you'll see it is defined as
> "[\w_.]", so this pattern *should* match the data line you give.
> 
> So in a followup email you could describe your hypothesis of the problem
> and what you've done to track it down.
> 
> 
> Fourth, given that the code is correct, you could check the CVS logs,
> available even to anonymous external users, and see
> 
> revision 1.3.2.1
> date: 2000/05/18 20:53:31;  author: sac;  state: Exp;  lines: +6 -4
> - The $Word and $Acc strings now include '.' to accomodate accessions with
>   version number. Word also allows '_'  to work with ref seq accessions.
> - Silencing warnings during _markup_report.
> 
> Checking bioperl-0.6.1.tar.gz (with the file datestamp on the ftp site of
> May 19, 2000, so the day after Steve's fix) you'll see that the code
> contains the fix mentioned in the CVS log.
> 
> So the answer to your statement:
> > It seems that nobody has had the same problem, or you guys think
> > this is just not significant enough to be answered.
> 
> is that it has been seen, corrected, and distributed almost 4 months
> ago, so nobody has the problem.  You need to update your distribution.
> Also, I'll bet that only 2 or 3 people ever saw the bug before it was
> fixed, so most of the people on the list really have not ever seen
> the problem and could not answer your email without spending non-trivial
> time digging through the back logs.
> 
> This also means that if you are submitting a bug report, you do need
> to include the version number in which you found the problem.
> 
>                     Andrew Dalke
>                     dalke@acm.org
> 
> 
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@bioperl.org
> http://bioperl.org/mailman/listinfo/bioperl-l
