[Biopython-dev] Blast records

Michiel de Hoon mjldehoon at yahoo.com
Tue Sep 22 06:12:37 EDT 2009


Hi everybody,

I was looking at an older bug report about the plain-text and XML Blast parsers in Biopython:

http://bugzilla.open-bio.org/show_bug.cgi?id=2176

When I checked the current behavior of Biopython's Blast parsers, I noticed that the plain-text parser and the XML parser give different results for Psi-Blast output. The plain-text parser returns a Blast.Record.PSIBlast object, whereas the XML parser returns Blast.Record.Blast objects. In addition, the XML parser misinterprets Psi-Blast XML output, creating a separate Blast record for each Psi-Blast iteration, and the plain-text parser fails on Psi-Blast output of the current blast program.

To fix this, I guess the first step is to decide whether a Psi-Blast parser should return a Blast.Record.Blast object or a Blast.Record.PSIBlast object. In theory, a Blast.Record.PSIBlast record seems more appropriate. However, this complicates the parser: it is not clear until halfway through the Blast output whether it is Blast or Psi-Blast, so the user has to tell the parser which one it is. Moreover, the XML output generated for Blast and Psi-Blast has the same format. I would therefore suggest having one Blast.Record class that can contain both Blast and Psi-Blast output.
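
To make the suggestion a bit more concrete, here is a rough sketch of what such a combined record might look like. This is only an illustration: the names UnifiedRecord, Round and group_iterations are made up and are not existing Biopython classes, and alignments are plain strings to keep the example short. The idea is that a plain Blast result is simply a record with one round, while Psi-Blast iterations are collected as several rounds inside the same record rather than as separate records.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Round:
    """One search iteration and the alignments it reported (hypothetical)."""
    number: int
    alignments: List[str] = field(default_factory=list)


@dataclass
class UnifiedRecord:
    """Hypothetical single record type for both Blast and Psi-Blast output."""
    query: str
    rounds: List[Round] = field(default_factory=list)

    @property
    def is_psiblast(self) -> bool:
        # Assumption: more than one round means an iterated search.
        # A Psi-Blast run with a single iteration looks like plain Blast here.
        return len(self.rounds) > 1

    @property
    def alignments(self) -> List[str]:
        # Callers that only care about plain Blast see the final (or only) round.
        return self.rounds[-1].alignments if self.rounds else []


def group_iterations(query, iterations):
    """Collect per-iteration hit lists into one record, not one record each."""
    record = UnifiedRecord(query=query)
    for number, hits in enumerate(iterations, start=1):
        record.rounds.append(Round(number=number, alignments=list(hits)))
    return record
```

With something along these lines the parser would not need to know in advance which program produced the output; it just appends rounds as it encounters them.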

Any other opinions, comments, suggestions?

--Michiel.


      

