[Biopython-dev] Interested in a Phenotype Microarray parser?

Marco Galardini marco.galardini at unifi.it
Thu Aug 20 11:09:42 UTC 2015


Hi Peter,

I've updated the previous pull request: 
https://github.com/biopython/biopython/pull/598

Thanks a lot for your support, please let me know if I can be of any 
help in writing a small tutorial/documentation.

Best,
Marco

On Thursday 20/08/2015 12:12 Peter Cock wrote:
> Great. Marco, are you happy to make a pull request on GitHub?
> 
> I'm willing to look over the code for general issues & merge it.
> 
> Peter
> 
> On Thu, Aug 20, 2015 at 2:10 AM, Michiel de Hoon <mjldehoon at yahoo.com> wrote:
>> Looks good to me. I think we should include it.
>> Best,
>> -Michiel.
>> 
>> --------------------------------------------
>> On Wed, 8/19/15, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>> 
>>  Subject: Re: [Biopython-dev] Interested in a Phenotype Microarray parser?
>>  To: "Marco Galardini" <marco.galardini at unifi.it>, "Michiel de Hoon" <mjldehoon at yahoo.com>
>>  Cc: "Biopython-Dev Mailing List" <biopython-dev at lists.open-bio.org>
>>  Date: Wednesday, August 19, 2015, 11:12 PM
>> 
>>  What do you think Michiel? You're probably the most appropriate person
>>  to look at this.
>> 
>>  (I've never used Phenotype Microarray plates)
>> 
>>  Or anyone else?
>> 
>>  Peter
>> 
>>  On Fri, Aug 14, 2015 at 3:13 PM, Marco Galardini
>>  <marco.galardini at unifi.it> wrote:
>>  >
>>  > Hi all,
>>  >
>>  > I just realised that more than one year ago I made this pretty simple
>>  > Phenotype Microarray data parser for BioPython
>>  > (https://github.com/mgalardini/biopython/tree/phenomics).
>>  > Are you still interested in including it? In any case I was thinking of
>>  > releasing it as a stand-alone library.
>>  >
>>  > Any suggestion or feedback is very welcome.
>>  >
>>  > Best,
>>  > Marco
>>  >
>>  >
>>  > On Tuesday 20/05/2014 23:26 Marco Galardini wrote:
>>  >>
>>  >> Hi all,
>>  >>
>>  >> maybe not the best moment, as you are busy releasing the new version:
>>  >> just wanted to inform you that the Bio.Phenotype module now supports
>>  >> sigmoid curve fitting and parameter extraction, but only if scipy is
>>  >> installed.
>>  >>
>>  >> The latest commits are here:
>>  >> https://github.com/mgalardini/biopython/tree/phenomics
>>  >>
>>  >> The tests pass on both Python 2 and Python 3.
>>  >>
>>  >> Hope this is of interest,
>>  >> Marco
>>  >>
>>  >> ----- Message from marco.galardini at unifi.it ---------
>>  >>     Date: Wed, 16 Apr 2014 19:11:25 +0200
>>  >>     From: Marco Galardini <marco.galardini at unifi.it>
>>  >> Reply-To: Marco Galardini <marco.galardini at unifi.it>
>>  >>  Subject: Re: [Biopython-dev] Interested in a Phenotype Microarray parser?
>>  >>       To: Marco Galardini <marco.galardini at unifi.it>
>>  >>       Cc: Peter Cock <p.j.a.cock at googlemail.com>, Biopython-Dev
>>  >> Mailing List <biopython-dev at lists.open-bio.org>
>>  >>
>>  >>
>>  >>> Hi,
>>  >>>
>>  >>>
>>  >>> regarding further additions to the Bio.Phenotype module, I was
>>  >>> considering the following solution to add support for sigmoid curve
>>  >>> fitting and parameter extraction (which is of interest when analysing
>>  >>> this kind of data). Since the easiest way to do the curve fitting is by
>>  >>> using the scipy package, a solution may be to implement it as an
>>  >>> "optional" feature, like the ability to draw trees with the Phylo
>>  >>> module using matplotlib. An exception would be raised if the "function"
>>  >>> is called with no scipy installed.
>>  >>>
>>  >>> Would that be ok? Alternatively some other way to perform curve
>>  >>> fitting may be found, but the maintenance may become very difficult.
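
The "optional feature" pattern described above might look roughly like the sketch below. It is only an illustration under stated assumptions, not the code in the branch: `MissingOptionalDependencyError`, `require_optional` and `fit_sigmoid` are hypothetical names, and the three-parameter logistic model is an assumed choice of sigmoid. The import is attempted only when the fitting function is called, so everything else keeps working without scipy.

```python
import importlib


class MissingOptionalDependencyError(ImportError):
    """Raised when an optional third-party package is not installed."""


def require_optional(module_name, feature):
    """Import and return an optional module, or fail with a clear message."""
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        raise MissingOptionalDependencyError(
            f"Install {module_name} if you want to use {feature}."
        ) from err


def fit_sigmoid(times, signals):
    """Hypothetical entry point: scipy is needed only when this is called."""
    optimize = require_optional("scipy.optimize", "sigmoid curve fitting")
    import numpy as np  # scipy depends on numpy, so it is present here

    def logistic(t, plateau, rate, midpoint):
        # Assumed model: a plain three-parameter logistic curve.
        return plateau / (1.0 + np.exp(-rate * (t - midpoint)))

    # Crude starting guess: highest signal, unit rate, middle time point.
    guess = (max(signals), 1.0, times[len(times) // 2])
    params, _covariance = optimize.curve_fit(logistic, times, signals, p0=guess)
    return dict(zip(("plateau", "rate", "midpoint"), params))
```

Callers that want to degrade gracefully can catch `ImportError`, since the custom exception subclasses it.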
>>  >>>
>>  >>> Marco
>>  >>>
>>  >>> ----- Message from marco.galardini at unifi.it ---------
>>  >>>     Date: Tue, 01 Apr 2014 00:59:32 +0100
>>  >>>     From: Marco Galardini <marco.galardini at unifi.it>
>>  >>> Reply-To: Marco Galardini <marco.galardini at unifi.it>
>>  >>>  Subject: Re: [Biopython-dev] Interested in a Phenotype Microarray parser?
>>  >>>       To: Peter Cock <p.j.a.cock at googlemail.com>
>>  >>>       Cc: Biopython-Dev Mailing List <biopython-dev at lists.open-bio.org>
>>  >>>
>>  >>>
>>  >>>> Hi,
>>  >>>>
>>  >>>> as suggested, I've made a few changes to the proposed Bio.Phenotype
>>  >>>> module (apart from the less-omics name).
>>  >>>>
>>  >>>> The PlateRecord object can now be indexed in a similar fashion to
>>  >>>> AlignIO multiple alignments: it is still possible to use the WellRecord
>>  >>>> identifier as an index, but when integers or slices are used, new
>>  >>>> sub-plates or single wells are returned. The system uses the well
>>  >>>> identifier as a means to divide the plate into rows/columns. Thanks for
>>  >>>> pointing out the AlignIO system, it has been very useful.
>>  >>>> I've left the two getColumns and getRows functions, since for some
>>  >>>> people it may still be useful to use the well identifiers. If you feel
>>  >>>> like they are too confusing I can remove them.
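
The indexing behaviour described above could be sketched as follows. This is a toy illustration, not the PlateRecord implementation: the `Plate` class is hypothetical, well identifiers are assumed to be a row letter followed by a zero-padded column number (such as "A01"), and plain floats stand in for WellRecord objects.

```python
class Plate:
    """Toy stand-in for a plate; wells are keyed by IDs such as 'A01'."""

    def __init__(self, wells):
        self._wells = dict(wells)
        # The well identifier itself defines the grid: letter = row,
        # remaining digits = column.
        self.rows = sorted({well_id[0] for well_id in self._wells})
        self.columns = sorted({well_id[1:] for well_id in self._wells})

    def __len__(self):
        return len(self._wells)

    def __getitem__(self, index):
        if isinstance(index, str):
            # Identifier lookup still works.
            return self._wells[index]
        if isinstance(index, tuple):
            row_index, column_index = index
        else:
            # A lone integer or slice selects whole rows.
            row_index, column_index = index, slice(None)
        one_row = isinstance(row_index, int)
        one_column = isinstance(column_index, int)
        rows = [self.rows[row_index]] if one_row else self.rows[row_index]
        if one_column:
            columns = [self.columns[column_index]]
        else:
            columns = self.columns[column_index]
        if one_row and one_column:
            return self._wells[rows[0] + columns[0]]
        selected = {
            r + c: self._wells[r + c]
            for r in rows
            for c in columns
            if r + c in self._wells
        }
        return Plate(selected)


plate = Plate({"A01": 1.0, "A02": 2.0, "B01": 3.0, "B02": 4.0})
```

With this sketch, `plate["A02"]` and `plate[1, 0]` return single wells, while `plate[0]` and `plate[:, 1]` return new two-well sub-plates.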
>>  >>>>
>>  >>>> The updated branch is here:
>>  >>>> https://github.com/mgalardini/biopython/tree/phenomics
>>  >>>>
>>  >>>> Kind regards,
>>  >>>> Marco
>>  >>>>
>>  >>>>
>>  >>>> On 26/03/2014 13:26, Marco Galardini wrote:
>>  >>>>>
>>  >>>>> Hi,
>>  >>>>>
>>  >>>>> many thanks for your comments, below some replies:
>>  >>>>>
>>  >>>>> ----- Message from p.j.a.cock at googlemail.com ---------
>>  >>>>>     Date: Wed, 26 Mar 2014 10:14:53 +0000
>>  >>>>>     From: Peter Cock <p.j.a.cock at googlemail.com>
>>  >>>>> Reply-To: Peter Cock <p.j.a.cock at googlemail.com>
>>  >>>>>  Subject: Re: [Biopython-dev] Interested in a Phenotype Microarray parser?
>>  >>>>>       To: Marco Galardini <marco.galardini at unifi.it>
>>  >>>>>       Cc: Biopython-Dev Mailing List <biopython-dev at lists.open-bio.org>
>>  >>>>>
>>  >>>>>
>>  >>>>>> On Tue, Mar 25, 2014 at 11:40 PM, Marco Galardini
>>  >>>>>> <marco.galardini at unifi.it> wrote:
>>  >>>>>>>
>>  >>>>>>> Hi all,
>>  >>>>>>>
>>  >>>>>>> following your suggestions (as well as the other modules'
>>  >>>>>>> implementations) I've just pushed a couple of commits to my
>>  >>>>>>> biopython fork, featuring the Bio.Phenomics module.
>>  >>>>>>> The module's capabilities are limited to reading/writing Phenotype
>>  >>>>>>> Microarray files and basic operations on the PlateRecord/WellRecord
>>  >>>>>>> objects. The module requires numpy to interpolate the signal when
>>  >>>>>>> the user requests a time point that wasn't in the input file (this
>>  >>>>>>> way the WellRecord object can be queried with slices).
>>  >>>>>>> I'm thinking about how to implement parameter extraction from
>>  >>>>>>> WellRecord objects without the use of scipy.
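
Interpolating a signal at a time point that was not measured, as described above, can be illustrated with the sketch below. The thread says the module relies on numpy for this; to stay dependency-free the sketch swaps in a plain-Python linear interpolation instead (linear interpolation is itself an assumption here), and `interpolate_signal` and `sample` are hypothetical helper names.

```python
from bisect import bisect_left


def interpolate_signal(times, signals, t):
    """Linearly interpolate the signal at time t from sorted measurements."""
    if not times[0] <= t <= times[-1]:
        raise ValueError("time point outside the measured range")
    i = bisect_left(times, t)
    if times[i] == t:
        # The requested time was actually measured.
        return signals[i]
    t0, t1 = times[i - 1], times[i]
    s0, s1 = signals[i - 1], signals[i]
    return s0 + (s1 - s0) * (t - t0) / (t1 - t0)


def sample(times, signals, start, stop, step):
    """Slice-like query: signals at start, start + step, ... below stop."""
    values = []
    n = 0
    while start + n * step < stop:
        values.append(interpolate_signal(times, signals, start + n * step))
        n += 1
    return values


times = [0.0, 1.0, 2.0]
signals = [10.0, 20.0, 40.0]
print(interpolate_signal(times, signals, 1.5))  # 30.0
print(sample(times, signals, 0.0, 2.0, 0.5))    # [10.0, 15.0, 20.0, 30.0]
```

A slice such as 0 to 2 in steps of 0.5 then maps onto `sample`, which is how interpolation makes slice queries possible even when the requested times fall between measurements.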
>>  >>>>>>>
>>  >>>>>>> Here's the link to my branch:
>>  >>>>>>> https://github.com/mgalardini/biopython/tree/phenomics
>>  >>>>>>> The module and functions have been documented taking inspiration
>>  >>>>>>> from the other modules: hope they are clear enough for you to try
>>  >>>>>>> it out. Some example files can be found in Tests/Phenomics.
>>  >>>>>>>
>>  >>>>>>> Marco
>>  >>>>>>
>>  >>>>>>
>>  >>>>>> Hi Marco,
>>  >>>>>>
>>  >>>>>> I've not worked with this kind of data so my comments are not on
>>  >>>>>> the application specifics. But I'm pleased to see unit tests :)
>>  >>>>>>
>>  >>>>>> One thought was while you define (Java like?) getRow and getColumn
>>  >>>>>> methods, your __getitem__ does not support (NumPy like) access,
>>  >>>>>> which is something we do for multiple sequence alignments. I guess
>>  >>>>>> while most plates are laid out in a grid, the row/column for each
>>  >>>>>> sample is not the most important thing - the sample identifier is?
>>  >>>>>>
>>  >>>>>> Thinking out loud, would properties `rows` and `columns` etc be
>>  >>>>>> nicer than `getRow` and `getColumn`, supporting iteration over
>>  >>>>>> the rows/columns/etc and indexing?
>>  >>>>>
>>  >>>>>
>>  >>>>> Yeah, absolutely: I'll work on some changes to have a more
>>  >>>>> straightforward way to select multiple WellRecords on a row/column
>>  >>>>> basis.
>>  >>>>>
>>  >>>>>>
>>  >>>>>> Minor: Your longer function docstrings do not follow PEP257,
>>  >>>>>> specifically starting with a one line summary, then a blank line,
>>  >>>>>> then the details. Also you are using triple single-quotes, rather
>>  >>>>>> than triple double-quotes (like the rest of Biopython).
>>  >>>>>> http://legacy.python.org/dev/peps/pep-0257/
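
The docstring layout described above (one-line summary, blank line, then details, in triple double-quotes) looks like this in practice. The function is a made-up example, not taken from the branch, and it assumes well identifiers start with the row letter.

```python
def get_row(wells, row):
    """Return the wells belonging to one plate row.

    Well identifiers are assumed to start with the row letter, so
    asking for row "A" keeps "A01", "A02" and so on. The input mapping
    is left untouched and a new dictionary is returned.
    """
    return {
        well_id: value
        for well_id, value in wells.items()
        if well_id.startswith(row)
    }
```

Tools that display only the first docstring line, such as summary tables in generated API documentation, then show a complete sentence.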
>>  >>>>>
>>  >>>>>
>>  >>>>> Whoops, I'll change it, thanks
>>  >>>>>
>>  >>>>>>
>>  >>>>>> Peter
>>  >>>>>>
>>  >>>>>> P.S. Also, I'm not very keen on the module name, phenomics -
>>  >>>>>> I wonder if it would earn Biopython a badomics award? ;)
>>  >>>>>> http://dx.doi.org/10.1186/2047-217X-1-6
>>  >>>>>
>>  >>>>>
>>  >>>>> That's meta-omics right? :p
>>  >>>>> What about 'Phenotype' then? Maybe it's too general, but future
>>  >>>>> extensions may include other phenotypic readouts.
>>  >>>>>
>>  >>>>> Marco
>>  >>>>>>
>>  >>>>>>
>>  >>>>>> _______________________________________________
>>  >>>>>> Biopython-dev mailing list
>>  >>>>>> Biopython-dev at lists.open-bio.org
>>  >>>>>> http://lists.open-bio.org/mailman/listinfo/biopython-dev
>>  >>>>>>
>>  >>>>>
>>  >>>>>
>>  >>>>> ----- End of message from p.j.a.cock at googlemail.com -----
>>  >>>>>
>>  >>>>>
>>  >>>>>
>>  >>>>> Marco Galardini
>>  >>>>> Postdoctoral Fellow
>>  >>>>> EMBL-EBI - European Bioinformatics Institute
>>  >>>>> Wellcome Trust Genome Campus
>>  >>>>> Hinxton, Cambridge CB10 1SD, UK
>>  >>>>> Phone: +44 (0)1223 49 2547
>>  >>>>>
>>  >>>>>
>>  >>>>>
>>  >>>>
>>  >>>>
>>  >>>> --
>>  >>>> -------------------------------------------------
>>  >>>> Marco Galardini, PhD
>>  >>>> Dipartimento di Biologia
>>  >>>> Via Madonna del Piano, 6 - 50019 Sesto Fiorentino (FI)
>>  >>>>
>>  >>>> e-mail: marco.galardini at unifi.it
>>  >>>> www: http://www.unifi.it/dblage/CMpro-v-p-51.html
>>  >>>> phone:  +39 055 4574737
>>  >>>> mobile: +39 340 2808041
>>  >>>> -------------------------------------------------
>>  >>>>
>>  >>>>
>>  >>>
>>  >>>
>>  >>>
>>  >>> -----
>>  Fine del messaggio da marco.galardini at unifi.it
>>  -----
>>  >>>
>>  >>>
>>  >>>
>>  >>
>>  >>
>>  >>
>>  >> ----- End of message from marco.galardini at unifi.it -----
>>  >>
>>  >>
>>  >>
> 
> _______________________________________________
> Biopython-dev mailing list
> Biopython-dev at mailman.open-bio.org
> http://mailman.open-bio.org/mailman/listinfo/biopython-dev

