[BioPython] Need help parsing Blastoutput
Halima Rabiu
halima at mancala.cbio.uct.ac.za
Thu Apr 20 11:57:20 UTC 2006
Thanks. I tried using the XML parser, but I am still getting errors that I
don't understand. Please see the attached copy of my script and my Blast XML
output.
Here is the error:
Traceback (most recent call last):
  File "Bioperser.py", line 11, in ?
    b_record = b_parser.parse(b_out)
  File "/usr/local/lib/python2.4/site-packages/Bio/Blast/NCBIXML.py", line 112, in parse
    self._parser.parse(handler)
  File "/usr/local//lib/python2.4/xml/sax/expatreader.py", line 107, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "/usr/local//lib/python2.4/xml/sax/xmlreader.py", line 123, in parse
    self.feed(buffer)
  File "/usr/local//lib/python2.4/xml/sax/expatreader.py", line 211, in feed
    self._err_handler.fatalError(exc)
  File "/usr/local//lib/python2.4/xml/sax/handler.py", line 38, in fatalError
    raise exception
thanks
Halimah
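The traceback above stops at `raise exception` and does not show the final message line, so the exact cause is not visible here. A minimal sketch, assuming only the standard-library xml.sax module and written for current Python rather than the 2.4 used in this thread, of one way to find out where a file stops being well-formed XML. The helper name check_well_formed is hypothetical and is not part of Biopython:

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler


def check_well_formed(source):
    """Return None if source parses as XML, else 'line:column: message'.

    source may be a file name or a binary file-like object.
    (Hypothetical helper for illustration, not a Biopython function.)
    """
    try:
        # A bare ContentHandler ignores all content; only syntax is checked.
        xml.sax.parse(source, ContentHandler())
    except xml.sax.SAXParseException as err:
        return "%s:%s: %s" % (
            err.getLineNumber(), err.getColumnNumber(), err.getMessage())
    return None


if __name__ == "__main__":
    # Two root elements in one stream are not well-formed XML.
    print(check_well_formed(io.BytesIO(b"<a></a>\n<a></a>")))
    # A single root element parses cleanly.
    print(check_well_formed(io.BytesIO(b"<a><b/></a>")))
```

Calling check_well_formed('blast2.xml') on the attached file would then give the line and column at which the parser gives up, which is usually enough to see what is wrong with the file.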
On Wed, 19 Apr 2006, Michiel De Hoon wrote:
> The Blast parser fails to read your file because the format of Blast output
> has changed. If I edit the data file so that it corresponds to the old format
> (add a space here, remove a blank line there, etc.), the Blast parser reads
> the file without problems. The easiest solution is to repeat the Blast run,
> using XML for the output format, and use the Blast XML parser in Biopython to
> parse the results.
>
> A general question is if anybody still needs the parser for Blast text
> output. Currently, we are confusing our users by having a Blast text parser
> that tends to break. A broken parser may be worse than no parser.
>
> --Michiel.
>
> Michiel de Hoon
> Center for Computational Biology and Bioinformatics
> Columbia University
> 1150 St Nicholas Avenue
> New York, NY 10032
>
>
>
> -----Original Message-----
> From: Halima Rabiu [mailto:halima at cbio.uct.ac.za]
> Sent: Wed 4/19/2006 6:15 AM
> To: Michiel De Hoon
> Cc: biopython at lists.open-bio.org
> Subject: RE: [BioPython] Need help parsing Blastoutput
>
> Hi,
> Please see the attachment; it is part of my Blast output.
> Yes, I am trying to parse text output from Blast. I used another script to
> run the local Blast whose output I am trying to parse.
> NCBIStandalone.BlastParser was working fine without hsp.sbject_end, which
> is one of the values I need to print out.
> On checking the class diagrams in the cookbook, I found that sbject_end is
> not included. I just need another way of printing the int(subject end).
> Thanks for your help
> Halimah
>
> On Tue, 18 Apr 2006, Michiel De Hoon wrote:
>
> > Could you also send us the file Enterococcus_out so we can run the script?
> >
> > From the script, it looks like you're trying to parse text output from
> > Blast. While this is possible (in theory), the format of Blast text
> > output tends to change a lot, thereby breaking the parser in Biopython.
> > It is more reliable to have Blast generate output in XML format, and use
> > the XML parser:
> >
> > blast_out = open('my_blast.xml', 'r')
> >
> > from Bio.Blast import NCBIXML
> >
> > b_parser = NCBIXML.BlastParser()
> > b_record = b_parser.parse(blast_out)
> >
> > See section 3.1.2 in the Biopython cookbook, and section 3.1.4 on how to
> > generate Blast output in XML.
> >
> > --Michiel.
> >
> >
> >
> > Michiel de Hoon
> > Center for Computational Biology and Bioinformatics
> > Columbia University
> > 1150 St Nicholas Avenue
> > New York, NY 10032
> >
> >
> >
> > -----Original Message-----
> > From: Halima Rabiu [mailto:halima at cbio.uct.ac.za]
> > Sent: Tue 4/18/2006 11:06 AM
> > To: Michiel De Hoon
> > Cc: biopython at lists.open-bio.org
> > Subject: RE: [BioPython] Need help parsing Blastoutput
> >
> > Thanks.
> > Please see the attachment: a copy of my script and a copy of my Blast
> > output.
> > Thanks
> >
> >
> > On Thu, 13 Apr 2006, Michiel De Hoon wrote:
> >
> > > Could you send us the script you were using?
> > >
> > > --Michiel.
> > >
> > > Michiel de Hoon
> > > Center for Computational Biology and Bioinformatics
> > > Columbia University
> > > 1150 St Nicholas Avenue
> > > New York, NY 10032
> > >
> > >
> > >
> > > -----Original Message-----
> > > From: biopython-bounces at lists.open-bio.org on behalf of Halima Rabiu
> > > Sent: Thu 4/13/2006 11:07 AM
> > > To: biopython at lists.open-bio.org
> > > Subject: [BioPython] Need help parsing Blastoutput
> > >
> > > Hi All,
> > > I have BLAST output from a local Blast run.
> > > I need to calculate the % alignment coverage with respect to my subject.
> > > I tried parsing the Blast output and wanted to print the
> > > Sbjct start and Sbjct end, but I could not. Is there any way I could
> > > get the match coverage between my query and subject? I don't need
> > > identities, but the total % alignment for the query or subject.
> > > Thanks
> > > Halimah
> > >
> > > _______________________________________________
> > > BioPython mailing list - BioPython at lists.open-bio.org
> > > http://lists.open-bio.org/mailman/listinfo/biopython
> > >
> > >
> >
> >
>
>
-------------- next part --------------
#! /usr/local/bin/python2.4
#halimah
#16-04-2006
from Bio.Blast import NCBIXML
#from Bio.Blast import NCBIStandalone

# Only report HSPs with an expect value at or below this threshold
E_VALUE_THRESH = 1.0

b_out = open('blast2.xml', 'r')
b_parser = NCBIXML.BlastParser()
# As in the example quoted earlier in this thread, parse() is taken to
# return one Blast record, so no iterator is involved.
b_record = b_parser.parse(b_out)
b_out.close()

print "The following results are for query " + b_record.query
print 'len of query:', b_record.query_letters
for alignment in b_record.alignments:
    for hsp in alignment.hsps:
        if hsp.expect <= E_VALUE_THRESH:
            print '****Alignment****'
            print 'title:', alignment.title
            print 'length:', alignment.length
            print 'e value:', hsp.expect
            print 'subject start:', hsp.sbjct_start
            # Assumes the installed Biopython HSP record provides
            # sbjct_end, spelled like sbjct_start.
            print 'subject end:', hsp.sbjct_end
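For the coverage question that opened this thread, here is a minimal sketch using only the standard library, written for current Python rather than 2.4. It assumes the NCBI BLAST XML element names Hit_def, Hit_len, Hsp_hit-from and Hsp_hit-to, and it assumes that "coverage" means the percentage of subject positions covered by at least one HSP. The function names are hypothetical and are not part of Biopython:

```python
import xml.etree.ElementTree as ET


def covered_length(ranges):
    """Count positions covered by 1-based inclusive (start, end) ranges.

    Overlapping ranges are merged so that no position is counted twice.
    """
    total = 0
    current_end = 0
    # Normalise so start <= end (minus-strand hits are reported reversed).
    for start, end in sorted((min(a, b), max(a, b)) for a, b in ranges):
        if end <= current_end:
            continue  # fully inside what has already been counted
        total += end - max(start, current_end + 1) + 1
        current_end = end
    return total


def subject_coverage(xml_text):
    """Map each hit definition to the percent of the subject its HSPs cover."""
    result = {}
    for hit in ET.fromstring(xml_text).iter("Hit"):
        hit_len = int(hit.findtext("Hit_len"))
        ranges = [
            (int(hsp.findtext("Hsp_hit-from")), int(hsp.findtext("Hsp_hit-to")))
            for hsp in hit.iter("Hsp")
        ]
        result[hit.findtext("Hit_def")] = 100.0 * covered_length(ranges) / hit_len
    return result


# Tiny hand-written sample (not real BLAST output): two overlapping HSPs
# on a subject of length 200.
SAMPLE = """<BlastOutput><BlastOutput_iterations><Iteration><Iteration_hits>
<Hit><Hit_def>subj1</Hit_def><Hit_len>200</Hit_len><Hit_hsps>
<Hsp><Hsp_hit-from>1</Hsp_hit-from><Hsp_hit-to>50</Hsp_hit-to></Hsp>
<Hsp><Hsp_hit-from>41</Hsp_hit-from><Hsp_hit-to>100</Hsp_hit-to></Hsp>
</Hit_hsps></Hit></Iteration_hits></Iteration></BlastOutput_iterations></BlastOutput>"""

if __name__ == "__main__":
    print(subject_coverage(SAMPLE))
```

Query coverage could be computed the same way from Hsp_query-from and Hsp_query-to divided by the query length. Note that this sketch keys results by hit definition only, so a file with several queries hitting the same subject would need the query added to the key.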
-------------- next part --------------
A non-text attachment was scrubbed...
Name: blast2.xml
Type: text/xml
Size: 151659 bytes
Desc:
URL: <http://lists.open-bio.org/pipermail/biopython/attachments/20060420/391af520/attachment-0002.xml>