[Biopython] Proteomics tools for biopython

Wed Feb 15 18:49:30 UTC 2012

Hi, 

I could possibly contribute a few snippets for an X!Tandem parser
and a few other small tools. I am very busy at the moment, but can
take a fuller part in the discussion in March.

Another interesting toolset that has been released today by Mike
Gorshkov's research group is pyteomics 1.0.0
http://pypi.python.org/pypi/pyteomics/1.0.0 

Best wishes, 
Achim

-----Original Message-----
From: biopython-bounces at lists.open-bio.org
[mailto:biopython-bounces at lists.open-bio.org] On Behalf Of David Martin
Sent: 15 February 2012 17:23
To: 'biopython at lists.open-bio.org'
Subject: [Biopython] Proteomics tools for biopython

> Hey David,
>
> What sort of tools do you have in mind for proteomics?  I have quite a
> few stashed away (3/6 frame translations, GFF files->proteins, X!Tandem
> parsers/FDR calculators, GFF parsers, etc.)
>
> Chris

At present we are wrapping the OpenMS outputs (featureXML etc.) so that we
can interrogate the detail of how the runs behave. It is insightful to
see (for example) how many of the MS/MS spectra are on overlapping
peptides, and the distribution of MS/MS selections per feature (vs
intensity). This is just the first stage. Having these data (which up
till now have been difficult to access) allows us to build smarter tools
(custom delta mass thresholds for each MS/MS spectrum, second peptide
searching, checking whether all the peptide IDs for a feature agree,
correlating IDs from different search engines for the same spectra).
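
As a rough illustration of two of the per-feature checks mentioned above
(MS/MS selections per feature, and whether all peptide IDs for a feature
agree), here is a minimal Python sketch. It is not our wrapper code: the
record layout, the identifiers and the function name summarise_features
are all hypothetical, and it assumes the feature-to-spectrum mapping has
already been extracted from the OpenMS output and the search results.

```python
from collections import Counter, defaultdict

# Hypothetical, hand-made records: (feature_id, spectrum_id, engine, peptide).
# Real data would come from parsed feature files and search engine results.
identifications = [
    ("f1", "s1", "xtandem", "PEPTIDEK"),
    ("f1", "s2", "mascot", "PEPTIDEK"),
    ("f2", "s3", "xtandem", "AAAGK"),
    ("f2", "s4", "mascot", "AAGAK"),
    ("f3", "s5", "xtandem", "LLSR"),
]


def summarise_features(records):
    """Return {feature_id: (number_of_msms, ids_agree)}.

    ids_agree is True when every identification attached to the
    feature names the same peptide sequence.
    """
    spectra = defaultdict(set)
    peptides = defaultdict(set)
    for feature_id, spectrum_id, _engine, peptide in records:
        spectra[feature_id].add(spectrum_id)
        peptides[feature_id].add(peptide)
    return {
        feature_id: (len(spectra[feature_id]), len(peptides[feature_id]) == 1)
        for feature_id in spectra
    }


summary = summarise_features(identifications)

# Distribution of MS/MS selections per feature:
# maps "number of MS/MS spectra" to "number of features".
msms_distribution = Counter(n_msms for n_msms, _agree in summary.values())

for feature_id, (n_msms, agree) in sorted(summary.items()):
    print(feature_id, n_msms, "agree" if agree else "conflict")
```

Keeping the summary as a plain dictionary means a later step (for example
one that sets per-spectrum mass thresholds) can consume it without knowing
which parser produced the records.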

There are outstanding questions from our users such as 'is it
really necessary to do duplicate runs?' or, in other words, can we get
the machine to treat duplicate runs differently to optimise
identification (on the principle that madness is doing the same thing
repeatedly but expecting different results)?

Parsers for X!Tandem would be really useful, as that is something we'd
like to have in our tool chain. A Mascot one would be good too. I am
looking into that (it is on my list of things to do, just not near the
top right now).

I very much favour a modular approach where each class/object does one
thing really well and can feed its output to another class, and where
everything can be represented using open formats. It might be a good
idea to arrange a telecon or Skype group chat for people who are
interested in contributing to this and building a comprehensive set of
tools into Biopython. I can't promise too much from our end, but we are
making good progress and we have a strong commitment to open software
and algorithms, with a heavy Python development presence.

..d
