[BioPython] blast parsing errors
Peter
biopython at maubp.freeserve.co.uk
Mon Mar 5 15:12:25 UTC 2007
Julius Lucks wrote:
> Hi all,
>
> I am trying to parse a bunch of blast results that I gather via
> NCBIWWW.qblast(). I have the following code snippet:
I am wondering if your trivial example triggered some "unusual" error
page from the NCBI...
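
One quick way to test that theory would be to look at the first few characters of whatever qblast() returned before giving it to the parser. The following is only a rough sketch: looks_like_xml is a hypothetical helper (not part of Biopython), and it assumes a genuine BLAST XML result starts with an XML declaration while an error page would start with HTML.

```python
def looks_like_xml(text):
    """Return True if text appears to start with an XML declaration."""
    # Assumption: real BLAST XML output begins with "<?xml";
    # an HTML error page from the server would not.
    return text.lstrip().startswith("<?xml")


# Hypothetical usage with the handle returned by qblast():
#     text = result_handle.read()
#     if not looks_like_xml(text):
#         print(text[:200])   # show the start of the unexpected page
```

Note that reading the handle consumes it, so the text would have to be wrapped in a new file-like object (a StringIO, for example) before passing it to the parser.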
I would suggest you update to CVS, as we have made a lot of changes to
the Blast XML support. You would probably be safe just updating the
following Bio.Blast files, located here on your machine:
/sw/lib/python2.5/site-packages/Bio/Blast/NCBIStandalone.py
/sw/lib/python2.5/site-packages/Bio/Blast/NCBIWWW.py
/sw/lib/python2.5/site-packages/Bio/Blast/NCBIXML.py
/sw/lib/python2.5/site-packages/Bio/Blast/Record.py
If you don't know how to use CVS, then just back up the originals and
replace them with the new files, downloaded one by one from here:
http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/biopython/Bio/Blast/?cvsroot=biopython
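
If you would rather script the backup step, something along these lines should do. It is a minimal sketch using only the standard library; backup_file and the ".orig" suffix are my own hypothetical choices, and the directory in the usage comment is simply the one from the file list in this message.

```python
import os
import shutil


def backup_file(path, suffix=".orig"):
    """Copy path to path + suffix, unless that backup already exists."""
    backup_path = path + suffix
    if not os.path.exists(backup_path):
        # copy2 also keeps the modification time of the original file
        shutil.copy2(path, backup_path)
    return backup_path


# Hypothetical usage, run before overwriting the installed files:
#     blast_dir = "/sw/lib/python2.5/site-packages/Bio/Blast"
#     for name in ["NCBIStandalone.py", "NCBIWWW.py", "NCBIXML.py", "Record.py"]:
#         backup_file(os.path.join(blast_dir, name))
```

Skipping files that already have a backup means running it a second time (after the new files are in place) will not overwrite the saved originals.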
----------------------------------------------------------------------
This works for me using the CVS version of BioPython. I have just used
a string for the query, rather than messing about with a fasta record
object, to keep the code short:
# Protein example, BLASTP
from Bio.Blast import NCBIWWW
from Bio.Blast import NCBIXML

# BLAST cutoff
cutoff = 1e-4
fasta_rec = ">GI:121308427\nrslgmevmhernahnfpldlaavevpsing"
b_parser = NCBIXML.BlastParser()
result_handle = NCBIWWW.qblast('blastp', 'nr', fasta_rec, ncbi_gi=1,
                               expect=cutoff, format_type="XML",
                               entrez_query="Viruses [ORGN]")

# This returns a record iterator, changed after release of BioPython 1.42
b_records = b_parser.parse(result_handle)
for b_record in b_records:
    print "%s found %i results" % (b_record.query,
                                   len(b_record.alignments))
    for alignment in b_record.alignments:
        titles = alignment.title.split('>')
        print titles
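
In case the split on '>' looks odd: as far as I know, when several database entries share an identical sequence, their descriptions can come back merged into one title string with '>' between them. Here is a small illustration using a made-up title (the identifiers and descriptions are invented for the example, not real results):

```python
# Hypothetical merged title, invented purely for illustration
title = ("gi|111|ref|XX_000001.1| example protein A "
         ">gi|222|gb|YY00002.1| example protein B")

# Split on '>' and strip the whitespace left around each piece
titles = [part.strip() for part in title.split(">")]
```

The strip() call is an extra step not in the example above; without it the pieces keep the spaces that surrounded each '>'.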
Or, if you wanted to do a nucleotide BLASTN search, try:
fasta_rec = '>GI:121308427\nttagccatttatagatggaacttcaacagcagctaagtc' \
          + 'tagagggaaattgtgagcattacgctcgtgcatgacctccataccaagagatct'
and replace 'blastp' with 'blastn' in the call to qblast().
Peter