[BioPython] BLASTParser
Suraj Peri
suraj_peri at yahoo.com
Wed Jun 18 18:55:52 EDT 2003
Dear Iddo,
Thanks for welcoming me. I was able to parse my data, and
now I am encouraged and want to do more with it.
However, I am unable to get the name of the query that I
blasted. I tried instantiating the Header class and
reading its query attribute. I also tried hsp.query, but
that gave me the aligned sequence and not the name of
the sequence. I tried several approaches.
==> One of them is:
from Bio.Blast.Record import Header
header = Header()
query = header.query
Could anyone please help me pull out the query
name?
Also, please excuse me if I am messing something up
here. I am still reading books, and the object-oriented
concepts have not yet sunk in.
Thanks,
-Suraj.
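
On the query name: as far as the layout of Bio.Blast.Record goes, the record that the parser returns inherits the header fields, so the name would be read from the record itself (b_record.query). A freshly created Header starts out empty, and hsp.query holds the aligned sequence. The sketch below only illustrates that relationship. Header, HSP, Alignment, BlastRecord and all the placeholder values are hypothetical stand-ins written for this example, not the Biopython source, so the block runs without Biopython installed.

```python
class Header(object):
    """Hypothetical stand-in for the header fields of a parsed BLAST report."""

    def __init__(self):
        self.query = None          # name/description of the query sequence
        self.query_letters = None  # length of the query sequence


class HSP(object):
    """Hypothetical stand-in for one high-scoring pair."""

    def __init__(self, query=''):
        self.query = query  # aligned query *sequence*, not the query name


class Alignment(object):
    """Hypothetical stand-in for one database hit."""

    def __init__(self, title, hsps):
        self.title = title
        self.hsps = hsps


class BlastRecord(Header):
    """Stand-in for the parsed record: it inherits the header attributes."""

    def __init__(self):
        Header.__init__(self)
        self.alignments = []


def make_example_record():
    """Build a record by hand, the way a parser would fill one in."""
    record = BlastRecord()
    record.query = 'example_query_1'  # placeholder name, not real data
    record.query_letters = 3
    record.alignments.append(Alignment('example_subject_1', [HSP('MKV')]))
    return record


record = make_example_record()
fresh_header = Header()

query_name = record.query                          # the name of the query
hsp_sequence = record.alignments[0].hsps[0].query  # the aligned sequence
empty_name = fresh_header.query                    # None: nothing was parsed
```

With a real parsed record, the equivalent step would presumably be reading b_record.query inside the loop over the iterator.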
--- Iddo Friedberg <idoerg at burnham.org> wrote:
>
>
> Suraj Peri wrote:
> > Hi group,
> >
> > I started learning Python recently. I am very much
> > excited to parse my BLAST output using the Biopython
> > modules.
>
>
> Welcome aboard! The more the merrier!
>
> You need to install mxTextTools, available from:
>
> http://www.lemburg.com/files/python/mxTextTools.html
>
> Then try again.
>
> Best,
>
> Iddo
>
> >
> > I used the code provided in the Biopython tutorial to
> > parse my BLAST results.
> > I have 50 FASTA-formatted sequences and I blasted
> > them against the RefSeq database locally, so I used the
> > Iterator class (please correct me if this is not
> > correct).
> >
> > Now I have two problems:
> >
> > 1. I cannot execute my testparser.py script with the
> > following content:
> >
> > import os
> > from Bio.Blast import NCBIStandalone
> > b_out = open('kinasesrefseqout','r')
> > b_parser = NCBIStandalone.BlastParser()
> > b_iterator = NCBIStandalone.Iterator(b_out, b_parser)
> > b_record = b_iterator.next()
> >
> > while 1:
> >     b_record = b_iterator.next()
> >     if b_record is None:
> >         break
> >     E_VALUE_THRESH = 0.00
> >     for alignment in b_record.alignments:
> >         for hsp in alignment.hsps:
> >             if hsp.expect < E_VALUE_THRESH:
> >                 print 'Sequence:', alignment.title
> >                 print 'e value:', hsp.expect
> >                 if len(hsp.query) > 75:
> >                     dots = '...'
> >                 else:
> >                     dots = ''
> >                 print hsp.query[0:75] + dots
> >                 print hsp.match[0:75] + dots
> >                 print hsp.sbjct[0:75] + dots
> >
> >
> > I get the following error:
> >
> > Traceback (most recent call last):
> >   File "ptpparser.py", line 3, in ?
> >     from Bio.Blast import NCBIStandalone
> >   File "Bio/__init__.py", line 65, in ?
> >     _load_registries()
> >   File "Bio/__init__.py", line 57, in _load_registries
> >     module = __import__("Bio.config.%s" % module, {}, {}, ["Bio","config"])
> >   File "Bio/config/DBRegistry.py", line 33, in ?
> >     from Martel import Parser
> >   File "Martel/__init__.py", line 6, in ?
> >     import Expression
> >   File "Martel/Expression.py", line 33, in ?
> >     import Parser
> >   File "Martel/Parser.py", line 33, in ?
> >     import TextTools
> > ImportError: No module named TextTools
> >
> >
> > 2. When I try this directly in interactive mode,
> > instead of getting a complete list of what I asked it
> > to 'print', I get the following:
> >
> > seqience: >ref|NG_001337.1| Homo sapiens T cell
> > receptor beta variable orphans on chromosome
> > 9(TRBVOR9@) on chromosome 9
> >
> >
> > That's it. I expected a list of all sequences with an
> > E-value < 0.00, but I get only one. It seems the Iterator
> > simply failed in my case.
> >
> > Could anyone please help me get the desired
> > output?
> >
> > Thanks.
> >
> >
> >
> >
> > =====
> > PIL/BMB/SDU/DK
> >
> > _______________________________________________
> > BioPython mailing list - BioPython at biopython.org
> > http://biopython.org/mailman/listinfo/biopython
> >
> >
>
> --
> Iddo Friedberg, Ph.D.
> The Burnham Institute
> 10901 N. Torrey Pines Rd.
> La Jolla, CA 92037
> USA
> Tel: +1 (858) 646 3100 x3516
> Fax: +1 (858) 646 3171
> http://ffas.ljcrf.edu/~iddo
>
=====
Suraj Peri
School of Medicine
Johns Hopkins University
Baltimore MD 21287
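
Two details of the loop in the quoted script may explain why so little is printed: the extra next() call before the while loop discards the first record, and a cutoff of 0.00 can never be met, because no E-value is below zero. Here is a minimal sketch of the loop without those two issues. StubIterator, the namedtuple record types, the placeholder names and the 0.04 cutoff are all assumptions made for illustration. The stub returns None when it runs out of records, which is what the quoted loop expects of the real iterator.

```python
from collections import namedtuple

# Hypothetical stand-ins for the parsed objects; only the fields used here.
HSP = namedtuple('HSP', ['expect'])
Alignment = namedtuple('Alignment', ['title', 'hsps'])
Record = namedtuple('Record', ['query', 'alignments'])

E_VALUE_THRESH = 0.04  # assumed cutoff; it has to be greater than zero


class StubIterator(object):
    """Hypothetical iterator: next() yields records, then None at the end."""

    def __init__(self, records):
        self._records = list(records)

    def next(self):
        if self._records:
            return self._records.pop(0)
        return None


def significant_hits(iterator, threshold=E_VALUE_THRESH):
    """Collect (query, subject title, E-value) for every HSP under threshold."""
    hits = []
    while True:
        record = iterator.next()  # the only next() call, so no record is skipped
        if record is None:
            break
        for alignment in record.alignments:
            for hsp in alignment.hsps:
                if hsp.expect < threshold:
                    hits.append((record.query, alignment.title, hsp.expect))
    return hits


# Placeholder records, not real BLAST output.
example_records = [
    Record('query_A', [Alignment('subject_1', [HSP(0.0)]),
                       Alignment('subject_2', [HSP(2.5)])]),
    Record('query_B', [Alignment('subject_3', [HSP(1e-30)])]),
]

hits = significant_hits(StubIterator(example_records))
no_hits = significant_hits(StubIterator(example_records), threshold=0.0)
```

Here hits holds one entry for query_A and one for query_B, while no_hits is empty: even an E-value of exactly 0.0 is not less than a cutoff of 0.0.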