[BioPython] Problems with the standalone blast parser.

humberto@hpcf.upr.edu
Fri, 05 Apr 2002 14:53:02 -0400



> [Bad nucleotide input into BLAST causes errors and malformed BLAST
> output]

> I had this problem myself with bad input sequences (in my case I was
> just BLASTing a bunch of ESTs, some of which were junky), and added a
> BlastErrorParser class to NCBIStandalone which allows you to detect
> these errors, record them if necessary, and then skip on to the next
> record. 

Thanks Brad, this is exactly what I needed. I guess I missed the 
BlastErrorParser section the first time I read the cookbook.

Here's a full parsing script using this technique, for the cookbook.

# standard library
import sys

# biopython
from Bio.Blast import NCBIStandalone

my_blast_file = sys.argv[1]

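def justparse(blastfile):
  """Report, for each record in a BLAST output file, whether it had hits."""

  blast_out = open(blastfile, "r")

  # BlastErrorParser raises LowQualityBlastError for junk query sequences
  # instead of choking on the malformed output.
  b_parser = NCBIStandalone.BlastErrorParser()

  b_iterator = NCBIStandalone.Iterator(blast_out, b_parser)

  while 1:
    try:
      b_record = b_iterator.next()
    except NCBIStandalone.LowQualityBlastError, info:
      # info[1] is the id of the offending query; report it and move on
      print "LowQualityBlastError detected in id %s" % info[1]
    else:
      # the iterator returns None once the file is exhausted
      if b_record is None:
        break

      # Do whatever with the blast record
      if len(b_record.alignments) == 0:
        print "No hits found for %s" % b_record.query
      else:
        print "Hits found for", b_record.query

  blast_out.close()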

# Main
justparse(my_blast_file)
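
The same catch-and-continue loop can also record the bad ids instead of
only printing them. Below is a minimal sketch of that idea which runs
without any BLAST output. FakeLowQualityError, FakeIterator and collect
are hypothetical stand-ins I made up for illustration; they are not part
of Biopython, and the real exception and iterator may behave differently.

```python
# Stand-ins for illustration only; none of these names come from Biopython.
class FakeLowQualityError(Exception):
    """Hypothetical substitute for NCBIStandalone.LowQualityBlastError."""


class FakeIterator(object):
    """Hypothetical iterator: returns records, raises on bad input,
    and returns None once everything has been consumed."""

    def __init__(self, items):
        # each item is (query id, number of hits, parsed cleanly?)
        self._items = list(items)

    def next(self):
        if not self._items:
            return None
        query, n_hits, ok = self._items.pop(0)
        if not ok:
            # mimic the (message, id) layout the script above relies on
            raise FakeLowQualityError("low quality sequence", query)
        return (query, n_hits)


def collect(iterator):
    """Sort query ids into three lists: with hits, without hits, errors."""
    with_hits, no_hits, errors = [], [], []
    while True:
        try:
            record = iterator.next()
        except FakeLowQualityError as err:
            errors.append(err.args[1])  # record the id, then keep going
            continue
        if record is None:
            break
        query, n_hits = record
        if n_hits:
            with_hits.append(query)
        else:
            no_hits.append(query)
    return with_hits, no_hits, errors
```

Calling collect(FakeIterator([("est1", 3, True), ("est2", 0, False),
("est3", 0, True)])) puts est1 in the first list, est3 in the second and
est2 in the error list, so one junky EST does not stop the run.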


-- 
Humberto Ortiz Zuazaga
Programmer-Archaeologist
High Performance Computing facility
University of Puerto Rico
http://www.hpcf.upr.edu/~humberto/


