[Biopython-dev] Interested in a Phenotype Microarray parser?

Peter Cock p.j.a.cock at googlemail.com
Thu Aug 20 10:12:42 UTC 2015


Great. Marco, are you happy to make a pull request on GitHub?

I'm willing to look over the code for general issues & merge it.

Peter

On Thu, Aug 20, 2015 at 2:10 AM, Michiel de Hoon <mjldehoon at yahoo.com> wrote:
> Looks good to me. I think we should include it.
> Best,
> -Michiel.
>
> --------------------------------------------
> On Wed, 8/19/15, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>
>  Subject: Re: [Biopython-dev] Interested in a Phenotype Microarray parser?
>  To: "Marco Galardini" <marco.galardini at unifi.it>, "Michiel de Hoon" <mjldehoon at yahoo.com>
>  Cc: "Biopython-Dev Mailing List" <biopython-dev at lists.open-bio.org>
>  Date: Wednesday, August 19, 2015, 11:12 PM
>
>  What do you think Michiel? You're probably the most appropriate person
>  to look at this.
>
>  (I've never used Phenotype Microarray plates)
>
>  Or anyone else?
>
>  Peter
>
>  On Fri, Aug 14, 2015 at 3:13 PM, Marco Galardini
>  <marco.galardini at unifi.it> wrote:
>  >
>  > Hi all,
>  >
>  > I just realised that more than one year ago I made this pretty simple
>  > Phenotype Microarray data parser for Biopython
>  > (https://github.com/mgalardini/biopython/tree/phenomics).
>  > Are you still interested in including it? In any case I was thinking of
>  > releasing it as a stand-alone library.
>  >
>  > Any suggestion or feedback is very welcome.
>  >
>  > Best,
>  > Marco
>  >
>  >
>  > On Tuesday 20/05/2014 23:26, Marco Galardini wrote:
>  >>
>  >> Hi all,
>  >>
>  >> maybe not the best moment, as you are busy releasing the new version:
>  >> just wanted to inform you that the Bio.Phenotype module now supports
>  >> sigmoid curve fitting and parameter extraction, but only if scipy is
>  >> installed.
>  >>
>  >> The latest commits are here:
>  >> https://github.com/mgalardini/biopython/tree/phenomics
>  >>
>  >> The tests pass on both Python 2 and Python 3.
>  >>
>  >> Hope this is of interest,
>  >> Marco
>  >>
>  >> ----- Message from marco.galardini at unifi.it ---------
>  >>     Date: Wed, 16 Apr 2014 19:11:25 +0200
>  >>     From: Marco Galardini <marco.galardini at unifi.it>
>  >> Reply-To: Marco Galardini <marco.galardini at unifi.it>
>  >>  Subject: Re: [Biopython-dev] Interested in a Phenotype Microarray parser?
>  >>       To: Marco Galardini <marco.galardini at unifi.it>
>  >>       Cc: Peter Cock <p.j.a.cock at googlemail.com>, Biopython-Dev
>  >> Mailing List <biopython-dev at lists.open-bio.org>
>  >>
>  >>
>  >>> Hi,
>  >>>
>  >>> regarding further additions to the Bio.Phenotype module, I was
>  >>> considering the following solution to add support for sigmoid curve
>  >>> fitting and parameter extraction (which is of interest when analysing
>  >>> this kind of data). Since the easiest way to do the curve fitting is by
>  >>> using the scipy package, a solution may be to implement it as an
>  >>> "optional" feature, like the ability to draw trees with the Phylo
>  >>> module using matplotlib. An exception would be raised if the function
>  >>> is called with no scipy installed.
>  >>>
>  >>> Would that be OK? Alternatively, some other way to perform curve
>  >>> fitting may be found, but the maintenance may become very difficult.
>  >>>
>  >>> Marco
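
For illustration, the optional-dependency idea described above could look
roughly like the sketch below. This is only a sketch under assumptions: the
names `fit_sigmoid`, `logistic` and `MissingOptionalDependency`, and the
choice of a three-parameter logistic curve, are made up here and are not
taken from the branch.

```python
try:
    import numpy as np
    from scipy.optimize import curve_fit
except ImportError:
    # scipy (or numpy) is missing: remember that, but do not fail at import
    np = None
    curve_fit = None


class MissingOptionalDependency(ImportError):
    """Hypothetical error raised when scipy is needed but not installed."""


def logistic(t, plateau, rate, midpoint):
    """Three-parameter logistic (sigmoid) curve, assumed for this sketch."""
    return plateau / (1.0 + np.exp(-rate * (t - midpoint)))


def fit_sigmoid(times, signals):
    """Fit the logistic curve and return (plateau, rate, midpoint)."""
    if curve_fit is None:
        # Only complain when the optional feature is actually requested
        raise MissingOptionalDependency("Install scipy to fit sigmoid curves.")
    times = np.asarray(times, dtype=float)
    signals = np.asarray(signals, dtype=float)
    # Rough starting guess: highest signal, unit rate, middle of the time range
    guess = (signals.max(), 1.0, float(np.median(times)))
    params, _ = curve_fit(logistic, times, signals, p0=guess)
    return tuple(float(p) for p in params)
```

The rest of the module stays importable without scipy; only a call to
`fit_sigmoid` raises.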
>  >>>
>  >>> ----- Message from marco.galardini at unifi.it ---------
>  >>>     Date: Tue, 01 Apr 2014 00:59:32 +0100
>  >>>     From: Marco Galardini <marco.galardini at unifi.it>
>  >>> Reply-To: Marco Galardini <marco.galardini at unifi.it>
>  >>>  Subject: Re: [Biopython-dev] Interested in a Phenotype Microarray
>  >>> parser?
>  >>>       To: Peter Cock <p.j.a.cock at googlemail.com>
>  >>>       Cc: Biopython-Dev Mailing List <biopython-dev at lists.open-bio.org>
>  >>>
>  >>>
>  >>>> Hi,
>  >>>>
>  >>>> as suggested, I've made a few changes to the proposed Bio.Phenotype
>  >>>> module (apart from the less-omics name).
>  >>>>
>  >>>> The PlateRecord object can now be indexed in a similar fashion to
>  >>>> AlignIO multiple alignments: it is still possible to use the WellRecord
>  >>>> identifier as an index, but when integers or slices are used, new
>  >>>> sub-plates or single wells are returned. The system uses the well
>  >>>> identifier as a means to divide the plate into rows/columns. Thanks for
>  >>>> pointing out the AlignIO system, it has been very useful.
>  >>>> I've left the two getColumns and getRows functions, since for some
>  >>>> people it may still be useful to use the well identifiers. If you feel
>  >>>> they are too confusing I can remove them.
>  >>>>
>  >>>> The updated branch is here:
>  >>>> https://github.com/mgalardini/biopython/tree/phenomics
>  >>>>
>  >>>> Kind regards,
>  >>>> Marco
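
To make the indexing behaviour described above concrete, here is a minimal
sketch. It is not the PlateRecord code from the branch: `MiniPlate`, its
`ids` method and the assumption that well identifiers look like "A01" (one
row letter plus a zero-padded column number) are all hypothetical.

```python
class MiniPlate:
    """Toy plate keyed by well identifiers such as "A01" (assumed format)."""

    def __init__(self, wells):
        self._wells = dict(wells)
        # The identifier itself is used to split the plate into rows/columns
        self._rows = sorted({well_id[0] for well_id in self._wells})
        self._cols = sorted({well_id[1:] for well_id in self._wells})

    @staticmethod
    def _select(labels, index):
        """Return the labels picked by an integer or a slice, as a list."""
        return [labels[index]] if isinstance(index, int) else labels[index]

    def __getitem__(self, index):
        if isinstance(index, str):
            # Identifier access still works: plate["A01"]
            return self._wells[index]
        if isinstance(index, tuple):
            row_index, col_index = index
        else:
            # A single integer or slice selects whole rows
            row_index, col_index = index, slice(None)
        rows = self._select(self._rows, row_index)
        cols = self._select(self._cols, col_index)
        if isinstance(row_index, int) and isinstance(col_index, int):
            # Two integers address exactly one well
            return self._wells[rows[0] + cols[0]]
        chosen = [r + c for r in rows for c in cols if r + c in self._wells]
        return MiniPlate({well_id: self._wells[well_id] for well_id in chosen})

    def ids(self):
        """Return the sorted well identifiers on this (sub-)plate."""
        return sorted(self._wells)
```

With wells A01, A02, B01 and B02, `plate[0]` gives the sub-plate for row A,
`plate[:, 1]` the sub-plate for the second column, and `plate[1, 0]` the
single well B01.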
>  >>>>
>  >>>>
>  >>>> On 26/03/2014 13:26, Marco Galardini wrote:
>  >>>>>
>  >>>>> Hi,
>  >>>>>
>  >>>>> many thanks for your comments, some replies below:
>  >>>>>
>  >>>>> ----- Message from p.j.a.cock at googlemail.com ---------
>  >>>>>     Date: Wed, 26 Mar 2014 10:14:53 +0000
>  >>>>>     From: Peter Cock <p.j.a.cock at googlemail.com>
>  >>>>> Reply-To: Peter Cock <p.j.a.cock at googlemail.com>
>  >>>>>  Subject: Re: [Biopython-dev] Interested in a Phenotype Microarray
>  >>>>> parser?
>  >>>>>       To: Marco Galardini <marco.galardini at unifi.it>
>  >>>>>       Cc: Biopython-Dev Mailing List <biopython-dev at lists.open-bio.org>
>  >>>>>
>  >>>>>
>  >>>>>> On Tue, Mar 25, 2014 at 11:40 PM, Marco Galardini
>  >>>>>> <marco.galardini at unifi.it> wrote:
>  >>>>>>>
>  >>>>>>> Hi all,
>  >>>>>>>
>  >>>>>>> following your suggestions (as well as the other modules'
>  >>>>>>> implementations) I've just pushed a couple of commits to my
>  >>>>>>> Biopython fork, featuring the Bio.Phenomics module.
>  >>>>>>> The module's capabilities are limited to reading/writing Phenotype
>  >>>>>>> Microarray files and basic operations on the PlateRecord/WellRecord
>  >>>>>>> objects. The module requires numpy to interpolate the signal when
>  >>>>>>> the user requests a time point that wasn't in the input file (this
>  >>>>>>> way the WellRecord object can be queried with slices).
>  >>>>>>> I'm thinking about how to implement the parameter extraction from
>  >>>>>>> WellRecord objects without the use of scipy.
>  >>>>>>>
>  >>>>>>> Here's the link to my branch:
>  >>>>>>> https://github.com/mgalardini/biopython/tree/phenomics
>  >>>>>>> The module and functions have been documented taking inspiration from
>  >>>>>>> the other modules: hope they are clear enough for you to try it out.
>  >>>>>>> Some example files can be found in Tests/Phenomics.
>  >>>>>>>
>  >>>>>>> Marco
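
As a rough picture of the interpolation described above, the sketch below
answers queries for time points that were not measured, including slices.
Note the substitutions: it uses plain-Python linear interpolation instead of
numpy (numpy's `np.interp` performs the same calculation), and `MiniWell` is
a hypothetical stand-in, not the WellRecord class from the branch.

```python
import math
from bisect import bisect_left


class MiniWell:
    """Toy well holding (time, signal) pairs sorted by time."""

    def __init__(self, times, signals):
        self.times = [float(t) for t in times]
        self.signals = [float(s) for s in signals]

    def _at(self, t):
        """Linearly interpolate the signal at time t (clamped at the ends)."""
        times, signals = self.times, self.signals
        if t <= times[0]:
            return signals[0]
        if t >= times[-1]:
            return signals[-1]
        i = bisect_left(times, t)
        if times[i] == t:
            return signals[i]  # measured time point, no interpolation needed
        fraction = (t - times[i - 1]) / (times[i] - times[i - 1])
        return signals[i - 1] + fraction * (signals[i] - signals[i - 1])

    def __getitem__(self, key):
        if isinstance(key, slice):
            # well[start:stop:step] samples the curve at regular time steps
            start = self.times[0] if key.start is None else key.start
            stop = self.times[-1] if key.stop is None else key.stop
            step = 1 if key.step is None else key.step
            count = max(0, math.ceil((stop - start) / step))
            return [self._at(start + k * step) for k in range(count)]
        return self._at(key)
```

For a well measured at times 0, 1 and 2, `well[0.5]` returns a value halfway
between the first two readings, and `well[0:2:0.5]` returns four samples.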
>  >>>>>>
>  >>>>>>
>  >>>>>> Hi Marco,
>  >>>>>>
>  >>>>>> I've not worked with this kind of data so my comments are not on
>  >>>>>> the application specifics. But I'm pleased to see unit tests :)
>  >>>>>>
>  >>>>>> One thought was while you define (Java like?) getRow and getColumn
>  >>>>>> methods, your __getitem__ does not support (NumPy like) access,
>  >>>>>> which is something we do for multiple sequence alignments. I guess
>  >>>>>> while most plates are laid out in a grid, the row/column for each
>  >>>>>> sample is not the most important thing - the sample identifier is?
>  >>>>>>
>  >>>>>> Thinking out loud, would properties `rows` and `columns` etc be
>  >>>>>> nicer than `getRow` and `getColumn`, supporting iteration over
>  >>>>>> the rows/columns/etc and indexing?
>  >>>>>
>  >>>>>
>  >>>>> Yeah, absolutely: I'll work on some changes to have a more
>  >>>>> straightforward way to select multiple WellRecords on a row/column
>  >>>>> basis.
>  >>>>>
>  >>>>>>
>  >>>>>> Minor: Your longer function docstrings do not follow PEP257,
>  >>>>>> specifically starting with a one line summary, then a blank line,
>  >>>>>> then the details. Also you are using triple single-quotes, rather
>  >>>>>> than triple double-quotes (like the rest of Biopython).
>  >>>>>> http://legacy.python.org/dev/peps/pep-0257/
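
A small example of the docstring layout described above (one-line summary,
blank line, details, triple double-quotes). The function `mean_signal` is
invented purely for illustration and is not part of the module under review.

```python
def mean_signal(signals):
    """Return the arithmetic mean of a well's signal values.

    The summary above fits on one line and ends with a period. A blank
    line separates it from this longer description, and the docstring is
    delimited with triple double-quotes.
    """
    return sum(signals) / len(signals)
```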
>  >>>>>
>  >>>>>
>  >>>>> Whoops, I'll change it, thanks
>  >>>>>
>  >>>>>>
>  >>>>>> Peter
>  >>>>>>
>  >>>>>> P.S. Also, I'm not very keen on the module name, phenomics -
>  >>>>>> I wonder if it would earn Biopython a badomics award? ;)
>  >>>>>> http://dx.doi.org/10.1186/2047-217X-1-6
>  >>>>>
>  >>>>>
>  >>>>> That's meta-omics right? :p
>  >>>>> What about 'Phenotype' then? Maybe it's too general, but future
>  >>>>> extensions may include other phenotypic readouts.
>  >>>>>
>  >>>>> Marco
>  >>>>>>
>  >>>>>>
>  >>>>>> _______________________________________________
>  >>>>>> Biopython-dev mailing list
>  >>>>>> Biopython-dev at lists.open-bio.org
>  >>>>>> http://lists.open-bio.org/mailman/listinfo/biopython-dev
>  >>>>>>
>  >>>>>
>  >>>>>
>  >>>>> ----- End of message from p.j.a.cock at googlemail.com -----
>  >>>>>
>  >>>>>
>  >>>>>
>  >>>>> Marco Galardini
>  >>>>> Postdoctoral Fellow
>  >>>>> EMBL-EBI - European Bioinformatics Institute
>  >>>>> Wellcome Trust Genome Campus
>  >>>>> Hinxton, Cambridge CB10 1SD, UK
>  >>>>> Phone: +44 (0)1223 49 2547
>  >>>>>
>  >>>>>
>  >>>>>
>  >>>>
>  >>>>
>  >>>> --
>  >>>> -------------------------------------------------
>  >>>> Marco Galardini, PhD
>  >>>> Dipartimento di Biologia
>  >>>> Via Madonna del Piano, 6 - 50019 Sesto Fiorentino (FI)
>  >>>>
>  >>>> e-mail: marco.galardini at unifi.it
>  >>>> www: http://www.unifi.it/dblage/CMpro-v-p-51.html
>  >>>> phone:  +39 055 4574737
>  >>>> mobile: +39 340 2808041
>  >>>> -------------------------------------------------
>  >>>>
>  >>>>
>  >>>
>  >>>
>  >>>
>  >>> ----- End of message from marco.galardini at unifi.it -----
>  >>>
>  >>>
>  >>>
>  >>
>  >>
>  >>
>  >> ----- End of message from marco.galardini at unifi.it -----
>  >>
>  >>
>  >>


