[Biojava-dev] [Bug 2404] New: PSI-Blast flat file parsing for 2.2.17

bugzilla-daemon at portal.open-bio.org bugzilla-daemon at portal.open-bio.org
Tue Nov 20 11:38:11 UTC 2007


http://bugzilla.open-bio.org/show_bug.cgi?id=2404

           Summary: PSI-Blast flat file parsing for 2.2.17
           Product: BioJava
           Version: live (CVS source)
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: bio
        AssignedTo: biojava-dev at biojava.org
        ReportedBy: jimp at compbio.dundee.ac.uk


PSI-Blast flat-file output (2.2.17) is not properly supported. The attached file
breaks the parser by sending it into an infinite loop.

Andreas Prlic has outlined the steps required for patching/enhancing the
existing parser:

* add PSI-Blast to BlastLikeVersionSupport
* add a PSIBlastSummaryLineHelper that implements SummaryLineHelperIF
* hook it into BlastSAXParser in the IN_SUMMARY section
* check for "Sequences with E-value WORSE" lines and ignore them (or do
something with them)
* check for hitSectionReached and call this method
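
To illustrate the summary-line steps above, here is a minimal, self-contained
sketch. It does not use the BioJava classes: PsiBlastSummaryFilter, SummaryHit
and parseSummary are hypothetical names, and the assumed line layout (id first,
score and E-value as the last two whitespace-separated tokens) is a guess that
would need checking against the attached file. Only the marker prefix is taken
from this report, and the caller is assumed to pass in just the lines that
follow the summary header.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: not part of BioJava. */
class PsiBlastSummaryFilter {

    /** Marker prefix as quoted in the report; the rest of the line is not assumed. */
    static final String WORSE_MARKER = "Sequences with E-value WORSE";

    /** Hypothetical holder for the fields of one summary line. */
    static final class SummaryHit {
        final String id;
        final String score;
        final String eValue;

        SummaryHit(String id, String score, String eValue) {
            this.id = id;
            this.score = score;
            this.eValue = eValue;
        }
    }

    /**
     * Collects hits from summary-section lines. The for-each loop consumes
     * exactly one line per iteration, so an unrecognised line can be skipped
     * but can never stall the parser.
     */
    static List<SummaryHit> parseSummary(List<String> lines) {
        List<SummaryHit> hits = new ArrayList<SummaryHit>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith(WORSE_MARKER)) {
                continue; // blank line or the PSI-Blast marker: ignore it
            }
            String[] tokens = line.split("\\s+");
            if (tokens.length < 3) {
                continue; // too short to hold id, score and E-value
            }
            // Assumed layout: id first, score and E-value as the last two tokens.
            hits.add(new SummaryHit(tokens[0],
                                    tokens[tokens.length - 2],
                                    tokens[tokens.length - 1]));
        }
        return hits;
    }
}
```

In the real parser the same check would presumably live in the new
PSIBlastSummaryLineHelper or in the summary-section branch of BlastSAXParser.
The point of the sketch is only that the marker line is consumed and skipped
rather than re-examined.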

Everything up to here should be quick to implement. Some steps may be
implementable as independent filters (note added by me, JBP). The part that
looks a bit scarier at the moment is the next step:

* hitSectionReached delegates the parsing to HitSectionSAXParser; this
needs to be extended to support PSI-Blast, or an independent
implementation for PSI-Blast needs to be written.

Finally:
* run the JUnit tests to make sure the other Blast files are still parsed
correctly
* write a JUnit test to make sure other changes in the parser won't break the
support for this one
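
Since the reported failure is an infinite loop, a regression test for this file
probably wants a time limit so that a hang fails the build instead of blocking
it. With JUnit 4 this can be done with @Test(timeout = ...). The sketch below
shows the same idea in plain Java. ParserTimeoutGuard and runWithTimeout are
hypothetical names, and the task passed in stands in for a call to the real
parser on the attached file.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Illustrative only: not part of BioJava or JUnit. */
class ParserTimeoutGuard {

    /**
     * Runs the task on a daemon thread and fails with an AssertionError if it
     * has not finished within the given number of milliseconds.
     */
    static <T> T runWithTimeout(Callable<T> task, long millis) {
        // Daemon thread: a task stuck in a loop cannot keep the JVM alive.
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "parser-under-test");
            t.setDaemon(true);
            return t;
        });
        Future<T> future = executor.submit(task);
        try {
            return future.get(millis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true);
            throw new AssertionError("task did not finish within " + millis + " ms");
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        } catch (ExecutionException e) {
            // Surface the failure raised inside the task itself.
            throw new RuntimeException(e.getCause());
        } finally {
            executor.shutdownNow();
        }
    }
}
```

A test would wrap the parse call in runWithTimeout and then assert on the
parsed hits, so that both a hang and a wrong result are caught.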




