blastn
thomas at cbs.dtu.dk
Sat Aug 5 05:05:11 EDT 2000
Full_Name: thomas sichertiz-ponten
Module: Blast/NCBIStandalone
Version:
OS: linux, IRIX
Submission from: molev106.ebc.uu.se (130.238.82.106)
Problem:
Cannot parse a BLASTN output file that contains multiple results.
Is this because the parser expects a hardcoded amount of whitespace?
#script .....
import sys, os
sys.path.insert(0, os.path.expanduser('~thomas/cbs/python/biopython'))
from Bio.Blast import NCBIStandalone
from Bio.Data import IUPACData
file = 'blasttest.blastn'
parser = NCBIStandalone.BlastParser()
iter = NCBIStandalone.Iterator(handle = open(file), parser = parser)
while 1:
    res = iter.next()
---- SNIP ----- SNIP ------
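To find out which of the concatenated results trips the parser, it may help to split the file into one chunk per query first and feed the chunks to the parser one at a time. The following is only a sketch in modern Python: `split_blast_reports` is a hypothetical helper, not part of Bio.Blast.NCBIStandalone, and it assumes every result starts with a banner line such as "BLASTN 2.0.14 [Jun-29-2000]".

```python
def split_blast_reports(text, banner="BLASTN"):
    """Split concatenated BLAST text output into one string per result.

    Hypothetical helper, not a Biopython function. Assumes each result
    begins with a line that starts with `banner`. Any text before the
    first banner line is returned as its own leading chunk.
    """
    reports = []
    current = []
    for line in text.splitlines(keepends=True):
        # A banner line closes the chunk collected so far.
        if line.startswith(banner) and current:
            reports.append("".join(current))
            current = []
        current.append(line)
    if current:
        reports.append("".join(current))
    return reports


if __name__ == "__main__":
    sample = (
        "BLASTN 2.0.14 [Jun-29-2000]\nQuery= M15353\n"
        "BLASTN 2.0.14 [Jun-29-2000]\nQuery= X76013\n"
    )
    for number, report in enumerate(split_blast_reports(sample), start=1):
        print(number, report.splitlines()[1])
```

Joining the chunks again reproduces the input unchanged, so no whitespace is altered by the split.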
# result
Traceback (innermost last):
  File "<stdin>", line 1, in ?
  File "/usr/tmp/python-Oq3ztf", line 18, in ?
    res = iter.next()
  File "/home/genome6/thomas/cbs/python/biopython/Bio/Blast/NCBIStandalone.py", line 1199, in next
    return self._parser.parse(File.StringHandle(data))
  File "/home/genome6/thomas/cbs/python/biopython/Bio/Blast/NCBIStandalone.py", line 463, in parse
    self._scanner.feed(handle, self._consumer)
  File "/home/genome6/thomas/cbs/python/biopython/Bio/Blast/NCBIStandalone.py", line 68, in feed
    self._scan_rounds(uhandle, consumer)
  File "/home/genome6/thomas/cbs/python/biopython/Bio/Blast/NCBIStandalone.py", line 121, in _scan_rounds
    self._scan_alignments(uhandle, consumer)
  File "/home/genome6/thomas/cbs/python/biopython/Bio/Blast/NCBIStandalone.py", line 226, in _scan_alignments
    self._scan_pairwise_alignments(uhandle, consumer)
  File "/home/genome6/thomas/cbs/python/biopython/Bio/Blast/NCBIStandalone.py", line 236, in _scan_pairwise_alignments
    self._scan_one_pairwise_alignment(uhandle, consumer)
  File "/home/genome6/thomas/cbs/python/biopython/Bio/Blast/NCBIStandalone.py", line 241, in _scan_one_pairwise_alignment
    self._scan_alignment_header(uhandle, consumer)
  File "/home/genome6/thomas/cbs/python/biopython/Bio/Blast/NCBIStandalone.py", line 267, in _scan_alignment_header
    read_and_call(uhandle, consumer.noevent, start=' ')
  File "/home/genome6/thomas/cbs/python/biopython/Bio/ParserSupport.py", line 140, in read_and_call
    raise SyntaxError, errmsg
SyntaxError: Line does not start with ' ':
--- SNIP --- SNIP -----
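The traceback above ends in a check that a line in the alignment header starts with a space, so the exact leading whitespace of the lines after each ">" line matters. The sketch below prints those lines with repr() so the whitespace becomes visible. It is a hypothetical diagnostic, not Biopython code, and it assumes an alignment header runs from a line starting with ">" to the first line containing "Length =". It has to be run on the original output file, since the copy in this message may not preserve the original spacing.

```python
def show_header_whitespace(text):
    """Return (line_number, repr(line)) for every alignment header line.

    Hypothetical diagnostic helper. A header is assumed to start at a
    line beginning with '>' and to end at the first line that contains
    'Length ='.
    """
    found = []
    in_header = False
    for number, line in enumerate(text.splitlines(), start=1):
        if line.startswith(">"):
            in_header = True
        if in_header:
            # repr() makes leading spaces, tabs and empty lines explicit.
            found.append((number, repr(line)))
            if "Length =" in line:
                in_header = False
    return found


if __name__ == "__main__":
    import sys

    # Usage (file name taken from the script in this report):
    #   python show_header.py blasttest.blastn
    with open(sys.argv[1]) as handle:
        for number, shown in show_header_whitespace(handle.read()):
            print(number, shown)
```

Any header line whose repr does not begin with a quote followed by a space is a candidate for the line the parser rejected.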
#blasttest.blastn
BLASTN 2.0.14 [Jun-29-2000]
Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs", Nucleic Acids Res. 25:3389-3402.
Query= M15353
(100 letters)
Database: ensembl.cdna
37,720 sequences; 24,543,038 total letters
Searching..................................................done
Score E
Sequences producing significant alignments: (bits) Value N
ENST00000044731 Gene:ENSG00000041402 Clone:AC060233 Cont... 182 4e-46 1
ENST00000041234 Gene:ENSG00000038511 Clone:AC015993 Cont... 163 3e-40 1
>ENST00000044731 Gene:ENSG00000041402 Clone:AC060233 Contig:AC060233.00036
Length = 654
Score = 182 bits (92), Expect = 4e-46
Identities = 98/100 (98%)
Strand = Plus / Plus
Query: 1 atggcgactgtcgaaccggaaaccacccctactcctaatcccccgactacagaagaggag 60
|||||||| ||||||||||||||||||||||||||||||||||||||||||||| |||||
Sbjct: 1 atggcgaccgtcgaaccggaaaccacccctactcctaatcccccgactacagaaaaggag 60
Query: 61 aaaacggaatctaatcaggaggttgctaacccagaacact 100
||||||||||||||||||||||||||||||||||||||||
Sbjct: 61 aaaacggaatctaatcaggaggttgctaacccagaacact 100
>ENST00000041234 Gene:ENSG00000038511 Clone:AC015993 Contig:AC015993.00011
Length = 361
Score = 163 bits (82), Expect = 3e-40
Identities = 82/82 (100%)
Strand = Plus / Plus
Query: 19 gaaaccacccctactcctaatcccccgactacagaagaggagaaaacggaatctaatcag 78
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 1 gaaaccacccctactcctaatcccccgactacagaagaggagaaaacggaatctaatcag 60
Query: 79 gaggttgctaacccagaacact 100
||||||||||||||||||||||
Sbjct: 61 gaggttgctaacccagaacact 82
Database: ensembl.cdna
Posted date: Aug 3, 2000 1:07 PM
Number of letters in database: 24,543,038
Number of sequences in database: 37,720
Lambda K H
1.37 0.711 1.31
Matrix: blastn matrix:1 -3
Number of Hits to DB: 2
Number of Sequences: 37720
Number of extensions: 2
Number of successful extensions: 2
Number of sequences better than 10.0: 2
length of query: 100
length of database: 24,543,038
effective HSP length: 16
effective length of query: 84
effective length of database: 23,939,518
effective search space: 2010919512
effective search space used: 2010919512
T: 0
A: 0
X1: 6 (11.9 bits)
X2: 10 (19.8 bits)
S1: 12 (24.3 bits)
S2: 14 (28.2 bits)
BLASTN 2.0.14 [Jun-29-2000]
Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs", Nucleic Acids Res. 25:3389-3402.
Query= X76013
(100 letters)
Database: ensembl.cdna
37,720 sequences; 24,543,038 total letters
Searching..................................................done
Score E
Sequences producing significant alignments: (bits) Value N
ENST00000040999 Gene:ENSG00000038136 Clone:AC016581 Cont... 34 0.20 1
>ENST00000040999 Gene:ENSG00000038136 Clone:AC016581 Contig:AC016581.00002
Length = 438
Score = 34.2 bits (17), Expect = 0.20
Identities = 17/17 (100%)
Strand = Plus / Plus
Query: 38 tcggcctgagcgagcag 54
|||||||||||||||||
Sbjct: 29 tcggcctgagcgagcag 45
Database: ensembl.cdna
Posted date: Aug 3, 2000 1:07 PM
Number of letters in database: 24,543,038
Number of sequences in database: 37,720
Lambda K H
1.37 0.711 1.31
Matrix: blastn matrix:1 -3
Number of Hits to DB: 2
Number of Sequences: 37720
Number of extensions: 2
Number of successful extensions: 2
Number of sequences better than 10.0: 1
length of query: 100
length of database: 24,543,038
effective HSP length: 16
effective length of query: 84
effective length of database: 23,939,518
effective search space: 2010919512
effective search space used: 2010919512
T: 0
A: 0
X1: 6 (11.9 bits)
X2: 10 (19.8 bits)
S1: 12 (24.3 bits)
S2: 14 (28.2 bits)
BLASTN 2.0.14 [Jun-29-2000]
Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs", Nucleic Acids Res. 25:3389-3402.
Query= U66617
(100 letters)
Database: ensembl.cdna
37,720 sequences; 24,543,038 total letters
Searching..................................................done
Score E
Sequences producing significant alignments: (bits) Value N
ENST00000038861 Gene:ENSG00000036360 Clone:AC025361 Cont... 198 6e-51 1
ENST00000010117 Gene:ENSG00000007819 Clone:AL031228 Cont... 32 0.81 1
>ENST00000038861 Gene:ENSG00000036360 Clone:AC025361 Contig:AC025361.00005
Length = 605
Score = 198 bits (100), Expect = 6e-51
Identities = 100/100 (100%)
Strand = Plus / Plus
====> MESSAGE TRUNCATED AT 8192 <====