[BioPython] Need help parsing Blastoutput

Michiel De Hoon mdehoon at c2b2.columbia.edu
Wed Apr 19 16:54:33 UTC 2006


The Blast parser fails to read your file because the format of Blast output
has changed. If I edit the data file so that it corresponds to the old format
(add a space here, remove a blank line there, etc.), the Blast parser reads
the file without problems. The easiest solution is to repeat the Blast run,
using XML for the output format, and use the Blast XML parser in Biopython to
parse the results.
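
For the coverage calculation asked about in this thread, a minimal sketch
along the following lines may help. It assumes a Biopython release that
provides NCBIXML.parse (newer than the NCBIXML.BlastParser class quoted
below) and a placeholder file name my_blast.xml. The helper names
merge_intervals, percent_coverage and subject_coverage are made up for
illustration and are not part of Biopython.

```python
def merge_intervals(intervals):
    """Merge overlapping or adjacent 1-based inclusive (start, end) intervals."""
    # On a minus-strand hit the start can be larger than the end,
    # so normalise each pair before sorting.
    ordered = sorted((min(s, e), max(s, e)) for s, e in intervals)
    merged = []
    for start, end in ordered:
        if merged and start <= merged[-1][1] + 1:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]


def percent_coverage(intervals, sequence_length):
    """Percentage of a sequence covered by at least one interval."""
    covered = sum(end - start + 1 for start, end in merge_intervals(intervals))
    return 100.0 * covered / sequence_length


def subject_coverage(xml_path):
    """Yield (query, hit title, % of subject covered) for each hit."""
    # Imported here so the two helpers above work without Biopython installed.
    # Assumption: the installed Biopython provides NCBIXML.parse.
    from Bio.Blast import NCBIXML

    with open(xml_path) as handle:
        for record in NCBIXML.parse(handle):
            for alignment in record.alignments:
                spans = [(hsp.sbjct_start, hsp.sbjct_end)
                         for hsp in alignment.hsps]
                yield (record.query, alignment.title,
                       percent_coverage(spans, alignment.length))


# Possible usage, with 'my_blast.xml' standing in for your own file:
# for query, title, pct in subject_coverage('my_blast.xml'):
#     print(query, title, round(pct, 1))
```

Merging the HSP intervals first keeps overlapping HSPs from being counted
twice. Query coverage should work the same way, using hsp.query_start and
hsp.query_end together with the query length instead of the subject length.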

A more general question is whether anybody still needs the parser for Blast
text output. Currently, we are confusing our users by having a Blast text
parser that tends to break. A broken parser may be worse than no parser.

--Michiel.

Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032



-----Original Message-----
From: Halima Rabiu [mailto:halima at cbio.uct.ac.za]
Sent: Wed 4/19/2006 6:15 AM
To: Michiel De Hoon
Cc: biopython at lists.open-bio.org
Subject: RE: [BioPython] Need help parsing Blastoutput
 
Hi,
Please see the attachment; it is part of my Blast output.
Yes, I am trying to parse text output from Blast. I used another script to
run my local Blast, and its output is what I am trying to parse.
NCBIStandalone.BlastParser was working fine without hsp.sbject_end, which is
one of the values I need to print out.
On checking the class diagrams in the cookbook, I found that sbject_end is
not included. I just need another way of printing int(subject end).
Thanks for your help,
Halimah

On Tue, 18 Apr 2006, Michiel De Hoon wrote:

> Could you also send us the file Enterococcus_out so we can run the script?
> 
> From the script, it looks like you're trying to parse text output from Blast.
> While this is possible (in theory), the format of Blast text output tends to
> change a lot, thereby breaking the parser in Biopython. It is more reliable
> to have Blast generate output in XML format, and use the XML parser:
> 
> blast_out = open('my_blast.xml', 'r')
> 
> from Bio.Blast import NCBIXML
> 
> b_parser = NCBIXML.BlastParser()
> b_record = b_parser.parse(blast_out)
> 
> See section 3.1.2 in the Biopython cookbook, and section 3.1.4 on how to
> generate Blast output in XML.
> 
> --Michiel.
> 
> 
> 
> -----Original Message-----
> From: Halima Rabiu [mailto:halima at cbio.uct.ac.za]
> Sent: Tue 4/18/2006 11:06 AM
> To: Michiel De Hoon
> Cc: biopython at lists.open-bio.org
> Subject: RE: [BioPython] Need help parsing Blastoutput
>  
> Thanks.
> Please see the attachment: a copy of my script and a copy of my Blast output.
> Thanks
> 
> 
> On Thu, 13 Apr 2006, Michiel De Hoon wrote:
> 
> > Could you send us the script you were using?
> > 
> > --Michiel.
> > 
> > -----Original Message-----
> > From: biopython-bounces at lists.open-bio.org on behalf of Halima Rabiu
> > Sent: Thu 4/13/2006 11:07 AM
> > To: biopython at lists.open-bio.org
> > Subject: [BioPython] Need help parsing Blastoutput
> >  
> > Hi All,
> > I have BLAST output from a local Blast run.
> > I need to calculate my % alignment coverage with respect to my subject.
> > I tried parsing the Blast output and wanted to print the
> > Sbjct start and Sbjct end, but I could not. Is there any way I could
> > get the match coverage between my query and subject? I don't need
> > identities, only the total % alignment for the query or subject.
> > Thanks
> > Halimah
> > 
> > _______________________________________________
> > BioPython mailing list  -  BioPython at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/biopython
> > 
> > 
> 
> 




