[BioPython] More objects on BLAST parser
Jeffrey Chang
jeffrey_chang at stanfordalumni.org
Tue Jun 8 00:20:25 EDT 2004
On Jun 7, 2004, at 11:54 PM, Sebastian Bassi wrote:
> Hello,
>
> I am working with this code:
>
> from Bio.Blast import NCBIStandalone
> b_parser = NCBIStandalone.BlastParser()
> bl = open("C:\\bioinfo-adv\\primervsclones\\MS1248-For", "r")
> b_record = b_parser.parse(bl)
> for alignment in b_record.alignments:
>     for hsp in alignment.hsps:
>         print alignment.title
>         print alignment.length
>         print hsp.expect
> bl.close()
>
> Works great. Now I need to know what other objects I can retrieve
> with this parser. For example, I'd like to retrieve:
> The length of the input/query sequence.
b_record.query_letters
> The length of the hit sequence.
This one is hard because it does not show up in the BLAST record. BLAST
prints out only the piece of the subject that has high sequence
identity with the query sequence. For BLAST2, you may be able to
calculate the length of the aligned region from the subject sequence
(sbjct) or the starting residue position (sbjct_start) of the HSPs. Are
you sure you really need the length of this sequence?
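As a rough sketch of that calculation: count the non-gap letters in the aligned subject string and offset from the start position. The helper name aligned_subject_span is made up for illustration, and it assumes that sbjct is the aligned string with "-" marking gaps, that sbjct_start is 1-based, and that the hit runs in the forward direction. This gives the span covered by one HSP, not the length of the whole hit sequence.

```python
def aligned_subject_span(sbjct, sbjct_start, gap_char="-"):
    """Return (residue_count, last_position) for the subject side of one HSP.

    Hypothetical helper, not part of Biopython. Assumes a 1-based
    sbjct_start and a forward-strand hit.
    """
    # Gap characters are alignment padding, not subject residues.
    residues = sum(1 for ch in sbjct if ch != gap_char)
    if residues == 0:
        return 0, None
    return residues, sbjct_start + residues - 1


# With the parsed objects this would be called as
# aligned_subject_span(hsp.sbjct, hsp.sbjct_start).
print(aligned_subject_span("ACG-TA", 10))  # (5, 14)
```

For a nucleotide hit on the minus strand the coordinates run downward, so the end position would need to be computed the other way round.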
> The identities (like 20/21).
hsp.identities
> The name of the query sequence.
b_record.query
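Putting the attributes above together, here is a minimal sketch that collects them into one row per HSP. It runs on stand-in objects built with SimpleNamespace rather than on a real parsed record, so the query name, title, and numbers are made-up sample values. It also assumes that hsp.identities is a (matches, aligned) pair such as (20, 21), which is how I would expect the plain-text parser to report it.

```python
from types import SimpleNamespace


def summarize(record):
    """Collect one tuple per HSP from a parsed BLAST record (sketch)."""
    rows = []
    for alignment in record.alignments:
        for hsp in alignment.hsps:
            # Assumption: identities is a (matches, aligned) pair.
            matches, aligned = hsp.identities
            rows.append((record.query, record.query_letters,
                         alignment.title, matches, aligned, hsp.expect))
    return rows


# Stand-in objects with made-up values; a real record would come from
# b_parser.parse(bl) as in the question.
hsp = SimpleNamespace(identities=(20, 21), expect=0.002)
alignment = SimpleNamespace(title=">clone_1 (hypothetical)", hsps=[hsp])
record = SimpleNamespace(query="MS1248-For", query_letters=21,
                         alignments=[alignment])

for row in summarize(record):
    print(row)
```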
> Another question: Is there a way (looking at the source code) to know
> all the available objects?
One way, without looking at the source code, is to read the docstrings
for the objects, e.g. help(b_record), help(alignment). In the source,
these classes are in Bio/Blast/Record.py. In that file, each of the
member variables is documented in the docstring. I believe that for
the standalone version of blast, we're parsing out every bit of
information. So if it's in the output, it's in the object somewhere!
;)
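If you want the same information programmatically instead of through help(), a small sketch like the following lists an object's instance attributes alongside its class docstring. The describe function and the FakeHSP class are stand-ins invented for this example (the real classes live in Bio/Blast/Record.py), and it assumes the attributes are set on the instance so that vars() can see them.

```python
import inspect


def describe(obj):
    """Return (class docstring, sorted public attribute names) for obj."""
    doc = inspect.getdoc(type(obj)) or ""
    names = sorted(name for name in vars(obj) if not name.startswith("_"))
    return doc, names


class FakeHSP:
    """Stand-in for a parsed HSP (hypothetical, not the Biopython class).

    Members:
    expect      Expectation value.
    identities  Identity counts.
    """

    def __init__(self):
        self.expect = 0.002
        self.identities = (20, 21)


doc, names = describe(FakeHSP())
print(names)  # ['expect', 'identities']
print(doc)
```

On a real record you would call describe(b_record), describe(alignment), or describe(hsp).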
Jeff