[Biopython-dev] Parsing RPS-Blast output with BioPython

Peter biopython-dev at maubp.freeserve.co.uk
Mon Dec 6 16:57:05 EST 2004


Peter wrote:

> Has anyone looked at using BioPython with the NCBI's (standalone) 
> RPS-Blast program?
> 
> RPSBLAST = Reverse Position Specific BLAST, used to query a protein 
> sequence against the Conserved Domain Database, Pfam, SMART etc.
> 
> I've had a little look at the NCBIStandalone.py, and can see how a 
> function rpsblast could be added, based on the existing blastall or 
> blastpgp functions.
> 
> (i.e. make a copy of blastpgp and call it rpsblast)
> 
> However, it would appear that the output parser will need some 
> additional work to understand the RPS-BLAST output...

It looks like it's not as much work as I had feared.  With BLAST 
2.2.9 at least, the RPS-BLAST output is just a slightly reduced 
version of the BLASTP output.

[I know BLAST 2.2.10 has been released.  I should really update my 
machine]

I think I have got the existing code to work, by making the
parser aware that some sections of the "header" and "database 
report" sections are now "optional".
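To illustrate the idea only (this is not the patch attached to the bug, 
and it does not use the NCBIStandalone.py classes), here is a minimal 
sketch of a header reader that tolerates missing sections.  The section 
names, the line prefixes, the choice of which section is optional, and 
the function name parse_header are all assumptions made up for the 
example:

```python
from typing import Dict, List, Tuple

# (section name, line prefix that identifies it, required?)
# Hypothetical layout: which sections RPS-BLAST really omits is an
# assumption here, chosen only to demonstrate the technique.
SECTIONS: List[Tuple[str, str, bool]] = [
    ("reference", "Reference:", False),  # treated as optional
    ("query", "Query=", True),
    ("database", "Database:", True),
]


def parse_header(lines: List[str]) -> Dict[str, str]:
    """Read sections in order, skipping optional ones that are absent."""
    found: Dict[str, str] = {}
    pos = 0
    for name, prefix, required in SECTIONS:
        # Skip blank lines between sections.
        while pos < len(lines) and not lines[pos].strip():
            pos += 1
        if pos < len(lines) and lines[pos].startswith(prefix):
            # Section is present: store its text and move on.
            found[name] = lines[pos][len(prefix):].strip()
            pos += 1
        elif required:
            raise ValueError("missing required section %r" % name)
        # Otherwise the section is optional and absent: do not consume
        # the line, so the next section gets a chance to match it.
    return found


full = ["Reference: Altschul et al.", "", "Query= seq1", "Database: nr"]
reduced = ["Query= seq1", "Database: cdd"]

print(parse_header(full))
print(parse_header(reduced))
```

The key point is that an absent optional section leaves the read 
position untouched, so the same code path handles both the full and the 
reduced output.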

Bug logged and rough solution submitted:

http://bugzilla.open-bio.org/show_bug.cgi?id=1715

Please note that my changes have received only minimal testing.  In
particular, the only check I have made on the classic BLAST support 
is that a simple blastp query still parses.

I'm hoping that a Biopython developer will now volunteer to take a
look at this, make sure the style etc. is acceptable, and hopefully 
merge it into CVS.  Do you guys have semi-automatic test 
scripts/unit tests?

Thanks

Peter
MOAC Doctoral Training Centre
University of Warwick, UK


