[Biopython] how to get the hit length from Bio.Blast.NCBIXML?
Ann Loraine
aloraine at gmail.com
Sun Mar 7 14:55:19 UTC 2010
Hello,
I'm using Bio.Blast.NCBIXML to parse blastx results for an annotation
project. I'm searching contig consensus sequences (assembled from 454
reads) against a protein database.
Since these are assembled ESTs and may be incomplete, I need to know
how much of a matched sequence was included in the alignment so that I
can compute the percent coverage of both the hit and query.
How do I retrieve the "hit length" from the objects returned by the parser?
I couldn't find anything in the record and alignment objects that
contains this information -- if it is not there, should it be added?
The hit length appears in the XML:
*cut*
<Iteration>
<Iteration_iter-num>3</Iteration_iter-num>
<Iteration_query-ID>lcl|3_0</Iteration_query-ID>
<Iteration_query-def>Both_1_c25003</Iteration_query-def>
<Iteration_query-len>422</Iteration_query-len>
<Iteration_hits>
<Hit>
<Hit_num>1</Hit_num>
<Hit_id>gnl|BL_ORD_ID|12864</Hit_id>
<Hit_def>gi|255551002|ref|XP_002516549.1| catalytic, putative [Ricinus communis]</Hit_def>
<Hit_accession>12864</Hit_accession>
<Hit_len>431</Hit_len>
<Hit_hsps>
<Hsp>
<Hsp_num>1</Hsp_num>
<Hsp_bit-score>112.079</Hsp_bit-score>
*paste*
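To make the goal concrete, here is a rough sketch of the workaround I would otherwise fall back on: reading Hit_len straight from the XML with the standard library's xml.etree.ElementTree and computing the percent of the hit covered by an HSP. It does not use Bio.Blast.NCBIXML at all. The sample document is a trimmed, well-formed stand-in for the excerpt above; the Hsp_hit-from and Hsp_hit-to values in it are made-up placeholders, and the function name hit_coverage is hypothetical.

```python
import xml.etree.ElementTree as ET

# Minimal, well-formed stand-in for one <Hit> of BLAST XML output.
# Hit_id and Hit_len are taken from the excerpt above; the
# Hsp_hit-from / Hsp_hit-to values are invented for illustration only.
SAMPLE = """<Iteration_hits>
  <Hit>
    <Hit_id>gnl|BL_ORD_ID|12864</Hit_id>
    <Hit_len>431</Hit_len>
    <Hit_hsps>
      <Hsp>
        <Hsp_hit-from>10</Hsp_hit-from>
        <Hsp_hit-to>140</Hsp_hit-to>
      </Hsp>
    </Hit_hsps>
  </Hit>
</Iteration_hits>"""


def hit_coverage(xml_text):
    """Map each hit ID to (hit length, percent of the hit covered by its first HSP)."""
    root = ET.fromstring(xml_text)
    result = {}
    for hit in root.iter("Hit"):
        hit_id = hit.findtext("Hit_id")
        hit_len = int(hit.findtext("Hit_len"))
        # Only the first HSP is considered in this sketch.
        hsp = hit.find("Hit_hsps/Hsp")
        start = int(hsp.findtext("Hsp_hit-from"))
        end = int(hsp.findtext("Hsp_hit-to"))
        # Coordinates are 1-based and inclusive, hence the +1.
        aligned = abs(end - start) + 1
        result[hit_id] = (hit_len, 100.0 * aligned / hit_len)
    return result


coverage = hit_coverage(SAMPLE)
for hit_id, (hit_len, percent) in coverage.items():
    print(f"{hit_id}: length {hit_len}, {percent:.1f}% covered")
```

Query coverage would work the same way using Iteration_query-len with Hsp_query-from and Hsp_query-to, keeping in mind that for blastx the query coordinates are in nucleotides while the hit coordinates are in amino acids. I would still prefer to get these values from the parser objects rather than re-reading the XML.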
Best,
Ann Loraine