[Biopython] hsp.identities

Mon Aug 10 20:43:30 UTC 2009

On Mon, Aug 10, 2009 at 5:59 PM, Mike Williams <dmikewilliams at gmail.com> wrote:
> Hi there.  Been using perl since 1996, but I am new to python.  I am
> working on some python code that was last modified in March of 2007.
>
> The code used to use NCBIStandalone, I've modified it to use NCBIXML
> because the Standalone package died with an exception, which I assumed
> was due to changes in the blast report format since the code was
> originally written.

Quite likely. The NCBI keeps changing the plain text output, so we have
more or less given up on that losing battle. Following their advice, we
now recommend the XML output and parser instead.

> <snippet>
> blastToleranceNT = 2
> blast_out = open(blast_report, "r")
> b_parse = NCBIXML.parse(blast_out, debug)
> for b_record in b_parse :
>    for al in b_record.alignments:
>        al.hsps = filter(lambda x:
>                         abs(x.identities[0]-x.identities[1]) <= blastToleranceNT,
>                         al.hsps)
> </snippet>
>
> This code generates the following error:
> TypeError: 'int' object is unsubscriptable
> ...
> I've looked at various sites with examples of how to deal with tuples,
> but nothing seems to work, and
> the error messages always imply that identities is an int.
>
> I'm hoping my spinning my wheels on this is just the result of being
> new to python.  I know the original version of the code *used* to
> work, and the rest of the program seems to work fine, if I comment out
> the filter line.
>
> Any help would be appreciated, this one line of code is a show stopper
> and I have multiple deadlines this week which depend on getting this
> working.

This is one of the quirks of the XML parser versus the plain text
parser. With the XML parser, hsp.identities is a single integer; with
the plain text parser it is a tuple of two integers (the number of
identities and the alignment length). In general the two parsers are
interchangeable, but there are a couple of accidental differences like
this which we've left in place rather than break existing scripts.
See Bug 2176 for more details.
http://bugzilla.open-bio.org/show_bug.cgi?id=2176

For plain text, from memory, you needed one of these:

    abs(x.identities[0] - x.identities[1])
    abs(x.identities[0] - x.align_length)

For XML you'll need:

    abs(x.identities - x.align_length)

(I think, without testing it)
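
To make that concrete, here is a rough sketch of a filter that copes with
either form of hsp.identities. It does not import Biopython: FakeHSP is a
hypothetical stand-in carrying only the two attributes discussed above, and
the helper names mismatch_count and filter_hsps are made up for
illustration. The tolerance of 2 mirrors blastToleranceNT in the snippet.

```python
from collections import namedtuple

# Hypothetical stand-in for a Biopython HSP object; it carries only the
# two attributes discussed in this thread.
FakeHSP = namedtuple("FakeHSP", ["identities", "align_length"])


def mismatch_count(hsp):
    """Return the alignment length minus the number of identities."""
    identities = hsp.identities
    if isinstance(identities, tuple):
        # Plain text parser style: (number of identities, alignment length)
        matched, length = identities
    else:
        # XML parser style: a single integer, length stored separately
        matched, length = identities, hsp.align_length
    return abs(length - matched)


def filter_hsps(hsps, tolerance=2):
    """Keep only HSPs whose mismatch count is within the tolerance."""
    return [hsp for hsp in hsps if mismatch_count(hsp) <= tolerance]


hsps = [
    FakeHSP(98, 100),         # XML style, 2 mismatches: kept
    FakeHSP(90, 100),         # XML style, 10 mismatches: dropped
    FakeHSP((99, 100), 100),  # plain text style, 1 mismatch: kept
]
kept = filter_hsps(hsps)
```

In the real script the same idea would be applied as
al.hsps = filter_hsps(al.hsps, blastToleranceNT) inside the alignment
loop, assuming the HSP objects expose identities and align_length as
described above.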

Peter