[Bioperl-l] PSI-BLAST Matrix Parser?

James Thompson tex at biosysadmin.com
Tue Sep 7 21:19:54 EDT 2004


Stefan,

Thanks for the response. For reading in the actual alignment I would use
Bio::AlignIO to read the PSI-BLAST output as it's just another alignment file,
but the matrix file that I'm talking about is slightly different. Now that
I've perused CVS more and learned more about how the Bio::Matrix::PSM modules
work, I think I have a clearer picture of what I'd like to do.

If you run PSI-BLAST with the -Q option, it will take the matrix that it
used for the position-specific search and output it to a file. I've put up a
link to one of my matrix files here if you'd like to look at it:

http://bioinformatics.rit.edu/~tex/atp1.matrix

Basically I'd like to make some Bio::Matrix::PSM::Psm objects (or at least
a PsmI-compliant object), and I think that the correct way to do this would
be to add a file format parser to Bio::Matrix::PSM::IO. Currently in Bioperl
there are three format parsers:
   - mast
   - meme
   - transfac

None of these work with the PSI-BLAST matrix files.  I'd like to write a new
matrix file parser (perhaps called psi-blast?) in the spirit of the three other
parsers.
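
To make the parsing task concrete, here is a minimal sketch of the row-level
work such a parser would do. It is written in Python rather than Perl purely
for illustration, and it is not Bioperl code. The layout it assumes is my
reading of the -Q output, not a documented specification: each data row is
whitespace-separated and holds a position number, a residue letter, 20
integer scores, 20 integer percentages, then the information per position
and the relative weight. The column order in AMINO_ACIDS and the name
parse_pssm are likewise assumptions.

```python
# Assumed residue column order in the matrix header (not taken from a spec).
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"

# position + residue + 20 scores + 20 percentages + information + weight
FIELDS_PER_ROW = 2 + 20 + 20 + 2


def parse_pssm(lines):
    """Return one dict per matrix row; skip title, header and footer lines."""
    rows = []
    for line in lines:
        fields = line.split()
        # Only data rows start with a position number and have 44 fields.
        if len(fields) != FIELDS_PER_ROW or not fields[0].isdigit():
            continue
        scores = [int(x) for x in fields[2:22]]
        percentages = [int(x) for x in fields[22:42]]
        rows.append({
            "position": int(fields[0]),
            "residue": fields[1],
            "scores": dict(zip(AMINO_ACIDS, scores)),
            "percentages": dict(zip(AMINO_ACIDS, percentages)),
            "information": float(fields[42]),
            "relative_weight": float(fields[43]),
        })
    return rows
```

Passing an open file handle (or any list of lines) to parse_pssm returns the
rows in file order. Splitting on whitespace would break if adjacent columns
ever ran together, so a real parser might need fixed-width slicing instead.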

If I were to write this, could someone commit it for me? 

James Thompson

On Tue, 7 Sep 2004, Stefan A Kirov wrote:

> I am not sure what object you are going to store your data in... Are you
> going to develop your own class to hold the data or use an existing one?
> Also is there any reason not to use Bio::AlignIO (it reads PSI-Blast as
> far as I know)?
> Stefan
> 
> 
> On Tue, 7 Sep 2004, James Thompson wrote:
> 
> >Dear Bioperl-ers,
> >
> >I'd like to parse the output of a PSI-BLAST matrix, and I was wondering if
> >there was a Bioperl way of parsing these files. If not, I'd like to make my
> >code general enough to be committed, and I'd like some advice on where exactly
> >to put such a module. From my cursory knowledge of Bioperl, I think that adding
> >another format parser to Bio::Matrix::PSM::IO would be a good way to go.
> >
> >I have a couple of questions:
> >- Does anyone know what the PSI-BLAST matrix format is called?
> >- Is this the correct place in which to put code for parsing this type of file?
> >
> >The file format represents a position-specific scoring matrix with some added
> >statistical information. Here's a general overview of the information available
> >from the matrix file:
> >
> >Last position-specific scoring matrix computed, weighted observed percentages
> >rounded down, information per position, and relative weight of gapless real
> >matches to pseudocounts.
> >
> >Any help is greatly appreciated.
> >
> >James Thompson
> >
> >_______________________________________________
> >Bioperl-l mailing list
> >Bioperl-l at portal.open-bio.org
> >http://portal.open-bio.org/mailman/listinfo/bioperl-l
> >
> 
