[Bioperl-l] Some more troubles with HTML module?

Carl Virtanen carl@cimmed.com
Wed, 25 Oct 2000 18:17:51 +1000


Hi folks,

I'm a little new at checking out some of this stuff, so please bear with
me. I'm using bioperl 6.2.
The problem I'm having is that the output from the Blast->to_html
routine is not picking up all of the correct references and 'htmlifying'
them (see my example near the bottom).  I'm just using the standard
kind of usage:
use Bio::Tools::Blast qw(:obj);
$Blast->to_html(file=>$ARGV[0]);

I've narrowed the problem down to the HTML.pm module.  Now, call me a
bonehead (if you wish, but that wouldn't be really nice now would it?)
but the regexps in there are some real bad ass ones (if you'll excuse my
colourful explanation)!
So tracking down where the problem is is not so easy for me.  Actually,
if somebody would explain to me at least one of the regexps, for example:

s@^ ?(gb|emb|dbj)\|($Word)(\|$Word)?($Descrip)($Int +)($Signif)(.*)$@$1:<a href="$DbUrl{'gb_n'}$2">$2$3</a>$4$5<A href="\#$2_A">$6</a>$7<a name="$2_H"></a>@o;


then I would be very grateful and would even try to track down the
problem myself and possibly contribute a little to all of this. I'm
familiar with basic regexps/substitution and so on, but yikes!

Anyways, here's the output, and you can see that it's missing a bunch of
gi's. The search was just a routine peek at some proteins in the nr
database:

Sequences producing significant alignments:                        (bits)  Value

emb|CAB55683.1| (AL035427) dJ769N13.1 (KIAA0443 protein.) [Homo ...   214  2e-54
ref|NP_055525.1| KIAA0443 gene product >gi|7512985|pir||T00068 h...   214  2e-54
dbj|BAB14367.1| (AK023031) unnamed protein product [Homo sapiens]     181  2e-44
gb:AAF64273.1|AF208859_1 (AF208859) BM-017 [Homo sapiens] >gi|82...   123  6e-27
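
For anyone who wants to poke at this, here is a minimal sketch that
re-runs the substitution quoted above on the four hit lines, with the /x
modifier added so each capture group can carry a comment. Everything not
in the original line is an assumption: the definitions of $Word,
$Descrip, $Int and $Signif are guessed stand-ins (the real ones are in
HTML.pm and may well differ), the %DbUrl value is a placeholder, the
htmlify() wrapper is a hypothetical helper, and the gb line is fed in as
"gb|..." on the guess that this is what it looked like before the
rewrite. Under those assumptions only the gb line is rewritten, which is
at least consistent with the output above: the emb and dbj lines have
nothing between the second pipe and the space, so the optional
(\|$Word) group cannot use that pipe, and "ref" is not in the
(gb|emb|dbj) alternation at all.

```perl
use strict;
use warnings;

# ASSUMED stand-ins for the sub-patterns; the real ones are in HTML.pm.
my $Word    = '[\w.]+';      # accession-like token, e.g. AAF64273.1
my $Descrip = '[ ]+.*?';     # description, assumed to begin with a space
my $Int     = '\d+';         # bit score
my $Signif  = '[\de.-]+';    # E-value, e.g. 6e-27

# Placeholder URL prefix; the real value is whatever HTML.pm stores here.
my %DbUrl = (gb_n => 'http://example.org/query?uid=');

# Hypothetical helper: applies the quoted substitution to one line and
# returns (matched?, resulting line). The /o flag is dropped because it
# makes no difference in this sketch.
sub htmlify {
    my ($line) = @_;
    my $changed = $line =~ s@
        ^\ ?(gb|emb|dbj)\|   # group 1: database tag, then a literal pipe
        ($Word)              # group 2: accession
        (\|$Word)?           # group 3: optional pipe plus a second name
        ($Descrip)           # group 4: description text
        ($Int\ +)            # group 5: bit score and the spaces after it
        ($Signif)            # group 6: E-value
        (.*)$                # group 7: whatever is left on the line
    @$1:<a href="$DbUrl{'gb_n'}$2">$2$3</a>$4$5<A href="\#$2_A">$6</a>$7<a name="$2_H"></a>@x;
    return ($changed ? 1 : 0, $line);
}

# The four hit lines from the report; the gb line is shown in its
# assumed pre-substitution form ("gb|" instead of "gb:").
my @hits = (
    'emb|CAB55683.1| (AL035427) dJ769N13.1 (KIAA0443 protein.) [Homo ...   214  2e-54',
    'ref|NP_055525.1| KIAA0443 gene product >gi|7512985|pir||T00068 h...   214  2e-54',
    'dbj|BAB14367.1| (AK023031) unnamed protein product [Homo sapiens]     181  2e-44',
    'gb|AAF64273.1|AF208859_1 (AF208859) BM-017 [Homo sapiens] >gi|82...   123  6e-27',
);

for my $hit (@hits) {
    my ($matched, $html) = htmlify($hit);
    printf "%-10s %s\n", ($matched ? 'linked' : 'unchanged'), $html;
}
```

If the real $Descrip also has to start with a space, then letting the
third group match a bare pipe, for example (\|$Word?)?, and adding ref
to the alternation would be the first things to try, but that is a guess
until someone checks it against HTML.pm.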



Thanks!

Carl Virtanen


