[Biopython-dev] NCBIStandalone Blast HSP parsing
Yair Benita
y.benita at wanadoo.nl
Mon Oct 17 19:45:47 EDT 2005
Hi Mark,
This issue has already been fixed. In the last review of NCBIStandalone that
I did with Jeff Chang, the query_end and sbjct_end attributes were added.
Just grab the latest NCBIStandalone version from CVS.
Yair
> From: Mark Hoebeke <Mark.Hoebeke at jouy.inra.fr>
> Organization: INRA - MIA
> Date: Mon, 17 Oct 2005 16:07:13 +0200
> To: <biopython-dev at biopython.org>
> Subject: [Biopython-dev] NCBIStandalone Blast HSP parsing
>
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hi all,
>
> I wanted a quick and easy way to determine the endpoints of HSPs extracted from
> Blast reports parsed with NCBIStandalone. Unfortunately, the HSP class lacks the
> query_end and sbjct_end attributes. Googling around led me to a recipe
> describing how to compute the endpoints using the total length, gap length and
> other niceties. Not exactly intuitive to me.
>
> Hence I dove into the NCBIStandalone and HSP modules and made some slight
> modifications. Basically I added the two attributes to HSP and the following
> snippets to NCBIStandalone (release 1.4b):
>
> 972c972
> < _query_re = re.compile(r"Query: (\d+)\s*(.+) (\d+)")
> ---
> > _query_re = re.compile(r"Query: (\d+)\s*(.+) \d")
> 977,978c977
> < start, seq, end = m.groups()
> < self._hsp.query_end=string.atoi(end);
> ---
> > start, seq = m.groups()
> 997,998c996,997
> < start, seq, end = _re_search(
> < r"Sbjct: (\d+)\s*(.+) (\d+)", line,
> ---
> > start, seq = _re_search(
> > r"Sbjct: (\d+)\s*(.+) \d", line,
> 1014c1013
> < self._hsp.sbjct_end=string.atoi(end)
> ---
> >
>
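For readers who want to see the effect of the regular-expression change in the
diff above, here is a minimal standalone sketch. It is not the NCBIStandalone
code itself: the function name parse_query_line and the sample line are
hypothetical, and int() is used where the patch calls string.atoi, which no
longer exists in Python 3. Only the pattern is taken from the patch.

```python
import re

# Pattern from the quoted patch: start coordinate, aligned sequence,
# and (newly captured) end coordinate of one "Query:" alignment line.
_query_re = re.compile(r"Query: (\d+)\s*(.+) (\d+)")


def parse_query_line(line):
    """Return (start, seq, end) for one "Query:" line (hypothetical helper)."""
    m = _query_re.search(line)
    if m is None:
        raise ValueError("not a Query alignment line: %r" % line)
    start, seq, end = m.groups()
    # strip() drops any extra padding between the sequence and the end
    # coordinate that the greedy (.+) group may have swallowed.
    return int(start), seq.strip(), int(end)


# Hypothetical sample line in the layout the pattern expects.
sample = "Query: 1   MKVLAAGIVG-LLLAQ 15"
start, seq, end = parse_query_line(sample)
print(start, seq, end)
```

The same idea applies to the "Sbjct:" lines by swapping the prefix in the
pattern. For an HSP that spans several alignment lines, the end coordinate of
the last line would be the one to keep.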
> Looks too easy to be true, I thought. Now sorry if I'm missing some important
> issues here (I'm quite new to BioPython), but is there a reason no one has
> made this patch yet?
>
> Thanks for any comments (flames and others.)
>
> Cheers,
>
> Mark
>
>
> - --
> - ----------------------------Mark.Hoebeke at jouy.inra.fr-----------------------
> Unité Statistique & Génome _/_/_/ _/_/_/ http://stat.genopole.cnrs.fr
> Tél : +33 (0)1 60 87 38 03 _/ _/ Fax : +33 (0)1 60 87 38 09
> Tour Evry 2, _/_/ _/ _/_/ 523, pl. des Terrasses
> F-91000, _/ _/ _/ Evry
> PGP : A2AD52E3 _/_/_/ _/_/_/
>
>
>
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.2 (GNU/Linux)
> Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org
>
> iD8DBQFDU7ARa3nTV6KtUuMRArBqAKC/m4i+VpVaU3clvOkMuYkfRrZQ+QCfbRKg
> gBBW5wNKS3sb/Uqr31eumx8=
> =vSWV
> -----END PGP SIGNATURE-----
> _______________________________________________
> Biopython-dev mailing list
> Biopython-dev at biopython.org
> http://biopython.org/mailman/listinfo/biopython-dev
>