[BioPython] Parsing bl2seq output
Ying Li
cyli at MIT.EDU
Tue Jun 17 16:01:30 EDT 2003
Hi!
I'm just starting out with biopython, and I was wondering if there is
a parser that parses bl2seq output. I thought BlastParser would work,
but when I try it I get an "Unexpected end of stream" error. Could
someone tell me whether I'm doing something wrong or, if not, whether a
bl2seq parser exists in biopython and where to find it? I'd really
appreciate it.
Thanks so much!
-Ying
The code I used, the error I received, and the output I tried to parse
are below:
code:
---------
from Bio.Blast import NCBIStandalone
parser = NCBIStandalone.BlastParser()
handle = open("output", 'r')
rec = parser.parse(handle)
---------
error:
---------
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/sw/lib/python2.2/site-packages/Bio/Blast/NCBIStandalone.py", line 611, in parse
    self._scanner.feed(handle, self._consumer)
  File "/sw/lib/python2.2/site-packages/Bio/Blast/NCBIStandalone.py", line 81, in feed
    read_and_call_until(uhandle, consumer.noevent, contains='BLAST')
  File "/sw/lib/python2.2/site-packages/Bio/ParserSupport.py", line 366, in read_and_call_until
    line = safe_readline(uhandle)
  File "/sw/lib/python2.2/site-packages/Bio/ParserSupport.py", line 442, in safe_readline
    raise SyntaxError, "Unexpected end of stream."
SyntaxError: Unexpected end of stream.
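Reading the traceback, the failure comes from the call
read_and_call_until(uhandle, consumer.noevent, contains='BLAST'): the
scanner appears to keep reading until it sees a line containing "BLAST"
and reaches the end of the file first. The bl2seq output in this post
starts at "Query=" and has no such line. A minimal sketch of a pre-check
for that marker, using only the standard library (the name
has_blast_header is hypothetical and not part of biopython):

```python
import io


def has_blast_header(handle, marker="BLAST"):
    """Return True if any line read from handle contains marker.

    Hypothetical helper, not a biopython function. It only mirrors the
    contains='BLAST' condition visible in the traceback.
    """
    return any(marker in line for line in handle)


# Sample text that starts at "Query=", like the output in this post.
sample = io.StringIO("Query=\n(443 letters)\n>\nLength = 373\n")
print(has_blast_header(sample))
```

If the check fails for a file, that would be consistent with the parser
running off the end of the stream while looking for the header.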
output from bl2seq:
----------
Query=
(443 letters)
>
Length = 373
Score = 20.0 bits (40), Expect = 0.13
Identities = 22/61 (36%), Positives = 29/61 (47%), Gaps = 17/61 (27%)
Query: 32 SELIKIS----NTEFVILVRSNLGVTILN--EFKEV-FV----------YEFKSVLNSYV 74
SE + +S N +I +SNL V+ILN EF + FV EFKS L +
Sbjct: 88 SEFVTLSTFAENELEIITEKSNLKVSILNVEEFPLIGFVENGLELSIDSQEFKSTLTQTI 147
Query: 75 S 75
S
Sbjct: 148 S 148
Score = 18.5 bits (36), Expect = 0.38
Identities = 33/165 (20%), Positives = 76/165 (46%), Gaps = 29/165 (17%)
Query: 257 FNSGLSTPINALDIPTAKLIIEAEIKKQGLKQKIKEDAVVYLAQN-FSDDVRKIKGLVNR 315
F S L+ I++++ K+++ G+ KIK++ + ++ + F +++I
Sbjct: 139 FKSTLTQTISSINEWNQKVVLA------GMNLKIKDNKISFVTTDLFRVSLKEI------ 186
Query: 316 LLFFGIQNDLGHIIDLEDVIDLFKDTPSANLGLLNVKKIKEVVAKKYDVTIKAIDGKART 375
+L ++ II + +I+L NL + NVK+ K ++ + T K +D
Sbjct: 187 ILNEATNQEVDIIIPYKTLIEL------RNL-IENVKEFK-IIIHDTNATFK-LDNDLLQ 237
Query: 376 TAIKNARHLSMYFAKIILNHTSTQIGAEFGGRDHSTVLSAISRIE 420
+ + + R+ +++ A + ++ A+ T+L +SR E
Sbjct: 238 STLIDGRYPNVHSAFPTTHEIKLELKAK-------TLLKVLSRFE 275
Score = 15.4 bits (28), Expect = 3.3
Identities = 5/11 (45%), Positives = 7/11 (63%)
Query: 423 IYKEKEFKKIV 433
+Y E F K+V
Sbjct: 33 VYMEVSFDKLV 43
Lambda K H
0.317 0.136 0.361
Gapped
Lambda K H
0.267 0.0410 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Hits to DB: 459
Number of Sequences: 0
Number of extensions: 18
Number of successful extensions: 4
Number of sequences better than 10.0: 1
Number of HSP's better than 10.0 without gapping: 1
Number of HSP's successfully gapped in prelim test: 0
Number of HSP's that attempted gapping in prelim test: 0
Number of HSP's gapped (non-prelim): 4
length of query: 443
length of database: 373
effective HSP length: 32
effective length of query: 411
effective length of database: 341
effective search space: 140151
effective search space used: 140151
T: 11
A: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 24 (13.8 bits)
S2: 24 (13.9 bits)
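
As a stopgap when no ready-made parser handles this format, the per-HSP
summary lines in output like the above can be pulled out with regular
expressions. This is only a sketch under assumptions: it does not use
biopython, the names SCORE_RE, IDENT_RE and parse_hsp_summaries are
hypothetical, and the patterns are based solely on the "Score = ..." and
"Identities = ..." lines shown in this post.

```python
import re

# Patterns modelled on the summary lines in this post (an assumption,
# not a full description of the bl2seq format).
SCORE_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*([\d.eE+-]+)"
)
IDENT_RE = re.compile(r"Identities\s*=\s*(\d+)/(\d+)")


def parse_hsp_summaries(text):
    """Return one dict per HSP with bits, raw score, expect and identities."""
    hsps = []
    for line in text.splitlines():
        match = SCORE_RE.search(line)
        if match:
            expect = match.group(3)
            # Some BLAST programs print values such as "e-100"; float()
            # needs a leading digit for those.
            if expect.lower().startswith("e"):
                expect = "1" + expect
            hsps.append({
                "bits": float(match.group(1)),
                "score": int(match.group(2)),
                "expect": float(expect),
            })
            continue
        match = IDENT_RE.search(line)
        if match and hsps:
            # Attach the identity counts to the most recent HSP.
            hsps[-1]["identities"] = (int(match.group(1)), int(match.group(2)))
    return hsps


sample = (
    "Score = 20.0 bits (40), Expect = 0.13\n"
    "Identities = 22/61 (36%), Positives = 29/61 (47%), Gaps = 17/61 (27%)\n"
    "Score = 15.4 bits (28), Expect = 3.3\n"
    "Identities = 5/11 (45%), Positives = 7/11 (63%)\n"
)
for hsp in parse_hsp_summaries(sample):
    print(hsp)
```

The sketch ignores the alignment rows and the statistics footer, so it
is not a substitute for a real parser.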