[BioPython] Problems with NCBIXML.py
Bruno Santos
bsantos at biocant.pt
Tue Oct 23 15:59:50 UTC 2007
I am trying to build a simple script that, given a multi-FASTA sequence file,
performs a web BLAST and replaces the name of each sequence with the hit that
has the lowest E-value.
But now I'm getting an exception and I don't know why it's happening:
Traceback (most recent call last):
  File "C:\Python25\Lib\site-packages\pythonwin\pywin\framework\scriptutils.py", line 310, in RunScript
    exec codeObject in __main__.__dict__
  File "C:\Documents and Settings\POSTO_21\Os meus documentos\Meta Genómica\BLAST.py", line 16, in <module>
    for blast_record in blast_records:
  File "C:\Python25\lib\site-packages\Bio\Blast\NCBIXML.py", line 592, in parse
    expat_parser.Parse(text, False)
ExpatError: mismatched tag: line 2823, column 362
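A "mismatched tag" error from expat means the XML being parsed is not well formed at the reported position. One possible way to narrow this down is to check each XML document in the saved file on its own. The following is only a minimal sketch using the standard library: the helper names split_xml_documents and check_documents are hypothetical, and it assumes every saved BLAST result starts with its own "<?xml" declaration. It does not identify the cause by itself; it only reports which piece fails to parse.

```python
import xml.parsers.expat


def split_xml_documents(text):
    """Split text into pieces, each starting at an '<?xml' declaration.

    Hypothetical helper: assumes every document begins with '<?xml'.
    """
    marker = "<?xml"
    starts = []
    pos = text.find(marker)
    while pos != -1:
        starts.append(pos)
        pos = text.find(marker, pos + 1)
    ends = starts[1:] + [len(text)]
    return [text[s:e] for s, e in zip(starts, ends)]


def check_documents(text):
    """Return a list of (index, error message or None), one per piece."""
    report = []
    for index, piece in enumerate(split_xml_documents(text)):
        parser = xml.parsers.expat.ParserCreate()
        try:
            # True marks this piece as the final chunk for this parser
            parser.Parse(piece.strip(), True)
            report.append((index, None))
        except xml.parsers.expat.ExpatError as err:
            report.append((index, str(err)))
    return report
```

It could be applied to the contents of the saved results file, for example check_documents(open(path).read()), where path is whatever file is later passed to NCBIXML.parse.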
And here is my script:
from Bio import SeqIO
from Bio.Blast import NCBIWWW
import cStringIO
from Bio.Blast import NCBIXML

#for file in dir
file_handle = open(r'C:/FASTASeq/Results/Well9/assembled_file_well9_Dt_DIST.fna') #Open the file as a handle
records = SeqIO.parse(file_handle, format="fasta") #Parse the file into sequence records
save_file = open(r'C:/FASTASeq/Results/Well9/D1_Blast.xml', "w")
for record in records:
    sequence = record.seq.data #Converts record to plain text
    result_handle = NCBIWWW.qblast("blastn", "nr", sequence) #Performs a blastn against the database nr
    blast_results = result_handle.read() #Catch the results
    save_file.write(blast_results) #Write all the information to an XML file
    result_handle = open(r'C:/FASTASeq/Results/Well9/D1_Blast.xml')
    blast_records = NCBIXML.parse(result_handle)
    for blast_record in blast_records:
        alignment = blast_record.alignments
        nIdent = (alignment[0].hsps[0].positives/float(alignment[0].hsps[0].align_length))*100.0
        if nIdent >= 97:
            record.name = alignment[0].hit_def
for record in records:
    print('>description_%s length_%d\n' % (record.name, len(record.seq)))
    print('%s\n' % record.seq)
save_file.close()
file_handle.close()
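The script above takes the first alignment and its first HSP, which relies on the hits arriving sorted best first. As an illustration only, the renaming rule described at the top (use the hit with the lowest E-value, and only if its percentage of positives reaches 97) could be sketched independently of Biopython as below. The Hit tuple and the functions percent_positives and best_name are hypothetical stand-ins, not part of any library; with parsed BLAST records the fields would presumably be filled from each alignment's hit_def and its first HSP's expect, positives and align_length.

```python
from collections import namedtuple

# Hypothetical stand-in for the values read from one parsed BLAST hit
Hit = namedtuple("Hit", ["hit_def", "expect", "positives", "align_length"])


def percent_positives(hit):
    """Percentage of positive positions over the alignment length."""
    return 100.0 * hit.positives / float(hit.align_length)


def best_name(hits, current_name, threshold=97.0):
    """Return the hit_def of the lowest-E-value hit if it meets the
    threshold; otherwise keep current_name."""
    if not hits:
        return current_name
    best = min(hits, key=lambda h: h.expect)
    if percent_positives(best) >= threshold:
        return best.hit_def
    return current_name
```

A record with no hits, or whose best hit falls below the threshold, keeps its original name.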
Thank you,
Bruno Santos