[Bioperl-l] protein identification program for proteomics

Peter J Ulintz pulintz@umich.edu
Thu, 14 Sep 2000 08:30:13 -0400 (EDT)


Other packages have been written that do similar, MOWSE-type things, such
as Protein Prospector
(http://prospector.ucsf.edu/) and PeptideSearch
(http://www.mann.embl-heidelberg.de/Services/PeptideSearch/PeptideSearchIntro.html).

These programs are also a bit more specific to sequence data generated by
MS/MS experiments, and are geared more toward peptide mass mapping data
than toward simple N- or C-terminal sequence data.  Prospector has an
"MS-Edman" program that can be used without peptide masses: you enter a
regular expression for the sequence and search the databases for
matches.
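
To illustrate the kind of pattern search described above, here is a
minimal sketch in Python.  It is not how MS-Edman is implemented; the
function names (parse_fasta, search_pattern) and the two example entries
are made up for illustration, and it assumes the database is a
FASTA-formatted flat file of amino acid sequences.

```python
import re


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def search_pattern(fasta_text, pattern):
    """Return (header, start, matched_text) for each sequence that matches.

    Only the first match in each sequence is reported.
    """
    regex = re.compile(pattern)
    hits = []
    for header, seq in parse_fasta(fasta_text):
        match = regex.search(seq)
        if match:
            hits.append((header, match.start(), match.group(0)))
    return hits


# Hypothetical two-entry database, for demonstration only.
EXAMPLE_DB = """>prot1 made-up entry
MKTAYIAKQRQISFVKSHFSRQ
>prot2 made-up entry
MSTNPKPQRKTKRNTNRRPQDV
"""

if __name__ == "__main__":
    # Ambiguous Edman calls become character classes; ^ anchors the
    # pattern to the N-terminus of the database sequence.
    for hit in search_pattern(EXAMPLE_DB, r"^MKT[AS]Y[IL]"):
        print(hit)
```

Anchoring with ^ (or $ for C-terminal data) only makes sense if the
mature protein starts where the database entry does; drop the anchor to
allow for signal peptides or other processing.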

In my experience you can usually get a pretty good match from a typical 10
residues of N-terminal Edman sequence plus a MW, at least for smaller
genomes.  I'd say a routine to do this could be written using the existing
objects in BioPerl, although how well it works would depend on the data.
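
A rough sketch of such a routine follows, in plain Python rather than
BioPerl so that it stands alone.  Everything in it is an assumption made
for illustration: the function names, the 5% mass tolerance, and the
simplification that the MW is the sum of standard average residue masses
plus one water, with no modifications and no N-terminal processing.

```python
# Average residue masses in daltons (standard textbook values).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594,
    "N": 114.1038, "D": 115.0886, "Q": 128.1307, "K": 128.1741,
    "E": 129.1155, "M": 131.1926, "H": 137.1411, "F": 147.1766,
    "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water per chain for the free termini


def molecular_weight(seq):
    """Average MW of an unmodified peptide chain, in daltons."""
    try:
        return sum(AVERAGE_RESIDUE_MASS[aa] for aa in seq.upper()) + WATER
    except KeyError as err:
        raise ValueError(f"non-standard residue {err.args[0]!r}") from None


def identify(candidates, nterm, observed_mw, tolerance=0.05):
    """Return (name, mw) for candidates matching both criteria.

    candidates  -- iterable of (name, sequence) pairs
    nterm       -- N-terminal residues from Edman sequencing
    observed_mw -- measured molecular weight in daltons
    tolerance   -- allowed relative mass error (0.05 means 5%)

    Hits are sorted by absolute mass error, best first.  Sequences with
    non-standard residue codes (X, B, Z, ...) are skipped.
    """
    nterm = nterm.upper()
    hits = []
    for name, seq in candidates:
        if not seq.upper().startswith(nterm):
            continue
        try:
            mw = molecular_weight(seq)
        except ValueError:
            continue
        if abs(mw - observed_mw) <= tolerance * observed_mw:
            hits.append((name, mw))
    hits.sort(key=lambda hit: abs(hit[1] - observed_mw))
    return hits
```

The prefix test is deliberately strict.  Real data would probably need
some allowance for miscalled residues and for a mature N-terminus that
does not coincide with the start of the database entry.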

--Pete


   

On Thu, 14 Sep 2000, Val Curwen wrote:

> 
> >>>> 
> >>>> I am looking for a stand alone protein identification program for
> >>>> proteomics written in Perl to run on my Linux box.  This program will
> >>>> identify proteins based on their molecular weight, and their N-terminal,
> >>>> or C-terminal sequence (a short one that you get from micro sequencing).
> >>>> The program will search flat files of amino acids sequences from the
> >>>> generic databases.
> 
> >>>
> >>>I think Alan Bleasby once wrote something like this, so there might be
> >>>something in EMBOSS...
> 
> It isn't in Perl, but this does sound rather like mowse (the EMBOSS
> version is emowse), which uses peptide mass fingerprints following
> protease digestion to identify proteins; a protein can be identified
> from very few experimentally determined fragment masses. Whether a
> single N/C terminal peptide with sequence information plus the
> whole-sequence MW will be sufficient I don't know. I've forwarded the
> message on to Alan but he's in a meeting right now.
> 
> We have a mowse server at HGMP (you don't need to be registered to 
> use it):
> 
> http://www.hgmp.mrc.ac.uk/Bioinformatics/Webapp/mowse/
> 
> At the moment that only searches a database of fragments derived from
> OWL sequences. The version Alan has just checked into EMBOSS may well
> be able to search others - but I'll have to check with him. EMBOSS is 
> happy under Linux.
> 
> Hope this helps,
> 
> Val
> 
> Val Curwen
> Bioinformatics, MRC HGMP-Resource Centre, 
> Hinxton, Cambridge, CB10 1SB