[BioPython] Parsing bl2seq output

Ying Li cyli at MIT.EDU
Tue Jun 17 16:01:30 EDT 2003


Hi!

I'm just starting out with Biopython, and I was wondering whether there
is a parser for bl2seq output.  I thought BlastParser would work, but
when I try it I get an "Unexpected end of stream" error.  If someone
could tell me whether I'm doing something wrong or, if I'm not, where a
bl2seq parser can be found and whether one is included in Biopython, I'd
really appreciate it.

Thanks so much!
-Ying

The code I used, the error I received, and the output I tried to parse
are below:

code:
---------
from Bio.Blast import NCBIStandalone
parser = NCBIStandalone.BlastParser()
handle = open("output", 'r')
rec = parser.parse(handle)
---------


error:
---------
Traceback (most recent call last):
   File "<stdin>", line 1, in ?
   File "/sw/lib/python2.2/site-packages/Bio/Blast/NCBIStandalone.py", line 611, in parse
     self._scanner.feed(handle, self._consumer)
   File "/sw/lib/python2.2/site-packages/Bio/Blast/NCBIStandalone.py", line 81, in feed
     read_and_call_until(uhandle, consumer.noevent, contains='BLAST')
   File "/sw/lib/python2.2/site-packages/Bio/ParserSupport.py", line 366, in read_and_call_until
     line = safe_readline(uhandle)
   File "/sw/lib/python2.2/site-packages/Bio/ParserSupport.py", line 442, in safe_readline
     raise SyntaxError, "Unexpected end of stream."
SyntaxError: Unexpected end of stream.
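
The traceback suggests that the scanner skips lines until it finds one
containing 'BLAST' (the contains='BLAST' argument to
read_and_call_until) and reaches the end of the file first.  A minimal
sketch for checking that reading against the output is shown here; the
helper name has_blast_header is hypothetical and not part of Biopython,
and the sample lines are copied from the bl2seq output in this message.

```python
def has_blast_header(lines):
    """Return True if any line contains the substring 'BLAST'.

    Hypothetical helper: it only mirrors the contains='BLAST' test
    visible in the traceback, not the scanner's full behaviour.
    """
    return any("BLAST" in line for line in lines)


# A few lines copied from the bl2seq output in this message.
sample = [
    "Query=",
    "          (443 letters)",
    "  Score = 20.0 bits (40), Expect = 0.13",
    "Matrix: BLOSUM62",
]

print(has_blast_header(sample))  # prints False
```

If the same check on the whole file also gives False, the missing
header line is a plausible cause of the error.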



output from bl2seq:
----------
Query=
          (443 letters)

 >
           Length = 373

  Score = 20.0 bits (40), Expect = 0.13
  Identities = 22/61 (36%), Positives = 29/61 (47%), Gaps = 17/61 (27%)

Query: 32  SELIKIS----NTEFVILVRSNLGVTILN--EFKEV-FV----------YEFKSVLNSYV 74
            SE + +S    N   +I  +SNL V+ILN  EF  + FV           EFKS L   +
Sbjct: 88  SEFVTLSTFAENELEIITEKSNLKVSILNVEEFPLIGFVENGLELSIDSQEFKSTLTQTI 147

Query: 75  S 75
            S
Sbjct: 148 S 148



  Score = 18.5 bits (36), Expect = 0.38
  Identities = 33/165 (20%), Positives = 76/165 (46%), Gaps = 29/165 (17%)

Query: 257 FNSGLSTPINALDIPTAKLIIEAEIKKQGLKQKIKEDAVVYLAQN-FSDDVRKIKGLVNR 315
            F S L+  I++++    K+++       G+  KIK++ + ++  + F   +++I
Sbjct: 139 FKSTLTQTISSINEWNQKVVLA------GMNLKIKDNKISFVTTDLFRVSLKEI------ 186

Query: 316 LLFFGIQNDLGHIIDLEDVIDLFKDTPSANLGLLNVKKIKEVVAKKYDVTIKAIDGKART 375
            +L      ++  II  + +I+L       NL + NVK+ K ++    + T K +D
Sbjct: 187 ILNEATNQEVDIIIPYKTLIEL------RNL-IENVKEFK-IIIHDTNATFK-LDNDLLQ 237

Query: 376 TAIKNARHLSMYFAKIILNHTSTQIGAEFGGRDHSTVLSAISRIE 420
            + + + R+ +++ A    +    ++ A+       T+L  +SR E
Sbjct: 238 STLIDGRYPNVHSAFPTTHEIKLELKAK-------TLLKVLSRFE 275



  Score = 15.4 bits (28), Expect = 3.3
  Identities = 5/11 (45%), Positives = 7/11 (63%)

Query: 423 IYKEKEFKKIV 433
            +Y E  F K+V
Sbjct: 33  VYMEVSFDKLV 43


Lambda     K      H
    0.317    0.136    0.361

Gapped
Lambda     K      H
    0.267   0.0410    0.140


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Hits to DB: 459
Number of Sequences: 0
Number of extensions: 18
Number of successful extensions: 4
Number of sequences better than 10.0: 1
Number of HSP's better than 10.0 without gapping: 1
Number of HSP's successfully gapped in prelim test: 0
Number of HSP's that attempted gapping in prelim test: 0
Number of HSP's gapped (non-prelim): 4
length of query: 443
length of database: 373
effective HSP length: 32
effective length of query: 411
effective length of database: 341
effective search space:   140151
effective search space used:   140151
T: 11
A: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 24 (13.8 bits)
S2: 24 (13.9 bits)
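
For reference, the per-HSP "Score = ... bits (...), Expect = ..." lines
in output like the above can also be pulled out directly with a regular
expression.  This is only a rough sketch under the assumption that every
HSP header has exactly that layout; HSP_SCORE and parse_hsp_scores are
hypothetical names and not part of Biopython.

```python
import re

# Matches lines such as "Score = 20.0 bits (40), Expect = 0.13".
HSP_SCORE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*([\d.eE+-]+)"
)


def parse_hsp_scores(text):
    """Return one dict per HSP header found in text (hypothetical helper)."""
    hsps = []
    for match in HSP_SCORE.finditer(text):
        expect = match.group(3)
        # BLAST sometimes prints values like "e-100" with no leading digit.
        if expect[0] in "eE":
            expect = "1" + expect
        hsps.append({
            "bits": float(match.group(1)),
            "score": int(match.group(2)),
            "expect": float(expect),
        })
    return hsps


# The three HSP header lines from the output in this message.
sample_output = """
  Score = 20.0 bits (40), Expect = 0.13
  Score = 18.5 bits (36), Expect = 0.38
  Score = 15.4 bits (28), Expect = 3.3
"""

for hsp in parse_hsp_scores(sample_output):
    print(hsp["bits"], hsp["score"], hsp["expect"])
```

This extracts only the scores and ignores the alignments themselves.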


