[BioPython] BLASTParser

Suraj Peri suraj_peri at yahoo.com
Wed Jun 18 18:55:52 EDT 2003


Dear Iddo, 
Thanks for welcoming me.  I can now parse my data, and
that has encouraged me to try for more.

Now I am unable to get the name of the query that I
blasted.  I tried instantiating the Header class and
then reading the value of its query attribute.  I also
tried hsp.query, but that gave me the sequence and not
the name of the sequence.  I tried different ways:

==> one such way is:

from Bio.Blast.Record import Header
query = header()

Could anyone please help me pull out the query
name?
Also, please excuse me if I am getting something wrong
here. I am still reading books, and the object-oriented
concepts have not yet sunk in.

thanks
-Suraj.
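
A note on the query name, offered as a sketch and not a definitive answer. As far as I recall, in the Biopython of this period the record returned by the parser inherits from Header, so the name would be read from the parsed record itself (b_record.query). A freshly created Header object has never seen the BLAST output, so it holds nothing. The classes and the parse_query_line helper below are hypothetical stand-ins, written so the example runs without Biopython; only the attribute name `query` comes from the message above.

```python
# Hypothetical stand-ins for the Biopython record classes, so that the
# example runs without Biopython installed.

class Header:
    """Holds report-level fields such as the query name."""
    def __init__(self):
        self.query = None  # filled in by the parser, not by the user


class BlastRecord(Header):
    """Stand-in for a parsed record: it is a Header, plus alignments."""
    def __init__(self):
        Header.__init__(self)
        self.alignments = []


def parse_query_line(line):
    """Toy parser step: copy the text after 'Query=' into a new record."""
    record = BlastRecord()
    if line.startswith("Query="):
        record.query = line[len("Query="):].strip()
    return record


empty = Header()  # never saw any BLAST output
b_record = parse_query_line("Query= kinase_01 example sequence")

print(empty.query)     # None
print(b_record.query)  # kinase_01 example sequence
```

With the real parser, the equivalent step would presumably be to print b_record.query inside the loop over records. hsp.query holds the aligned query sequence, which matches what was observed above.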




--- Iddo Friedberg <idoerg at burnham.org> wrote:
> 
> 
> Suraj Peri wrote:
> > Hi group, 
> >  
> >  I started learning python recently. I am very much
> > excited to parse my blast output using biopython
> > modules. 
> 
> 
> Welcome aboard! The more the merrier!
> 
> You need to install mxTextTools, available from:
> 
> http://www.lemburg.com/files/python/mxTextTools.html
> 
> Then try again.
> 
> Best,
> 
> Iddo
> 
> > 
> > I used the code provided in the Biopython tutorial to
> > parse my BLAST results.
> > I have 50 FASTA-formatted sequences and I blasted
> > them against the RefSeq database locally.  So I used the
> > Iterator function (please correct me if this is not
> > correct).
> > 
> > Now I have two problems:
> > 
> > 1. I cannot execute my testparser.py script with the
> > following content:
> > 
> > import os
> > from Bio.Blast import NCBIStandalone
> > b_out = open('kinasesrefseqout','r')
> > b_parser = NCBIStandalone.BlastParser()
> > b_iterator = NCBIStandalone.Iterator (b_out, b_parser)
> > b_record = b_iterator.next()
> > 
> > while 1:
> >          b_record = b_iterator.next()
> >          if b_record is None:
> >                  break
> >          E_VALUE_THRESH = 0.00
> >          for alignment in b_record.alignments:
> >                  for hsp in alignment.hsps:
> >                          if hsp.expect < E_VALUE_THRESH:
> >                                  print 'Sequence:', alignment.title
> >                                  print 'e value:', hsp.expect
> >                                  if len(hsp.query) > 75:
> >                                          dots = '...'
> >                                  else:
> >                                          dots = ''
> >                                  print hsp.query[0:75] + dots
> >                                  print hsp.match[0:75] + dots
> >                                  print hsp.sbjct[0:75] + dots
> > 
> > 
> > I get the following error:
> > 
> > Traceback (most recent call last):
> >   File "ptpparser.py", line 3, in ?
> >     from Bio.Blast import NCBIStandalone
> >   File "Bio/__init__.py", line 65, in ?
> >     _load_registries()
> >   File "Bio/__init__.py", line 57, in _load_registries
> >     module = __import__("Bio.config.%s" % module, {}, {}, ["Bio","config"])
> >   File "Bio/config/DBRegistry.py", line 33, in ?
> >     from Martel import Parser
> >   File "Martel/__init__.py", line 6, in ?
> >     import Expression
> >   File "Martel/Expression.py", line 33, in ?
> >     import Parser
> >   File "Martel/Parser.py", line 33, in ?
> >     import TextTools
> > ImportError: No module named TextTools
> > 
> > 
> > 2. When I try this directly in the interactive mode,
> > instead of getting a complete list of what I asked it
> > to 'print', I get the following:
> > 
> > seqience: >ref|NG_001337.1| Homo sapiens T cell
> > receptor beta variable orphans on chromosome
> > 9(TRBVOR9@) on chromosome 9
> > 
> > 
> > That's it. I expected a list of all sequences with an
> > E-value < 0.00, but I get only one. It seems the iterator
> > simply failed in my case.
> > 
> > Could anyone please help me get the desired
> > output? 
> > 
> > Thanks.
> > 
> > 
> > 
> > 
> > =====
> > PIL/BMB/SDU/DK
> > 
> > 
> > 
> 
> -- 
> Iddo Friedberg, Ph.D.
> The Burnham Institute
> 10901 N. Torrey Pines Rd.
> La Jolla, CA 92037
> USA
> Tel: +1 (858) 646 3100 x3516
> Fax: +1 (858) 646 3171
> http://ffas.ljcrf.edu/~iddo
> 
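
On the second problem in the quoted script, two details may explain the short output, though this is a reading of the code and has not been checked against the actual BLAST file. First, the script calls b_iterator.next() once before the loop and again at the top of the loop, so the first record is read and discarded. Second, with E_VALUE_THRESH = 0.00 the strict test hsp.expect < E_VALUE_THRESH cannot be true for any non-negative E-value. The sketch below uses invented stand-in data and an ordinary Python iterator instead of the Biopython Iterator (which, per the quoted script, returns None at the end), and the 0.04 cutoff is an assumed value.

```python
# Stand-in data: each "record" is a list of (title, e-value) hits.
# These values are invented for illustration only.
records = [
    [("hit A", 0.0), ("hit B", 2e-30)],
    [("hit C", 1e-5), ("hit D", 3.0)],
    [("hit E", 0.0)],
]

E_VALUE_THRESH = 0.04  # assumed cutoff; the quoted script used 0.00


def significant_hits(record_iter, threshold):
    """Visit every record, including the first, and keep hits below threshold."""
    kept = []
    for record in record_iter:  # no next() call before the loop
        for title, expect in record:
            if expect < threshold:
                kept.append(title)
    return kept


def skipping_first(record_iter, threshold):
    """Mimic the quoted script: one next() before the loop drops record 1."""
    next(record_iter)
    return significant_hits(record_iter, threshold)


print(significant_hits(iter(records), E_VALUE_THRESH))  # ['hit A', 'hit B', 'hit C', 'hit E']
print(skipping_first(iter(records), E_VALUE_THRESH))    # ['hit C', 'hit E']
print(significant_hits(iter(records), 0.00))            # []
```

The third call shows why a cutoff of exactly 0.00 with a strict comparison reports nothing, even for hits whose E-value is printed as 0.0.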


=====
Suraj Peri
School of Medicine
Johns Hopkins University
Baltimore MD 21287


