[Biopython-dev] Interested in a Phenotype Microarray parser?

Marco Galardini marco.galardini at unifi.it
Wed Apr 16 13:11:25 EDT 2014


Hi,

regarding further additions to the Bio.Phenotype module, I was
considering the following solution to add support for sigmoid curve
fitting and parameter extraction (which is of interest when analysing
this kind of data). Since the easiest way to do the curve fitting is
with the scipy package, one solution would be to implement it as an
"optional" feature, like the ability to draw trees with the Phylo
module using matplotlib. An exception would be raised if the
fitting function is called when scipy is not installed.

Would that be OK? Alternatively, some other way to perform the curve
fitting could be found, but maintaining it could become very difficult.
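
To make the proposal concrete, here is a minimal sketch of the
optional-dependency pattern, not code from the branch. The names
fit_sigmoid and MissingOptionalDependencyError, the three-parameter
logistic model and the starting guess are all placeholders chosen for
illustration; the real module may pick a different model or exception.

```python
# Sketch only: scipy (and numpy, which it needs) are imported lazily so the
# rest of the module keeps working when they are not installed.
try:
    import numpy as np
    from scipy.optimize import curve_fit
except ImportError:
    np = None
    curve_fit = None


class MissingOptionalDependencyError(ImportError):
    """Hypothetical exception raised when an optional package is missing."""


def fit_sigmoid(times, signals):
    """Fit a three-parameter logistic curve to one well's signal.

    Returns (plateau, slope, midpoint). The model and the starting guess
    are assumptions made for this example.
    """
    if curve_fit is None:
        raise MissingOptionalDependencyError(
            "Install SciPy if you want to use sigmoid curve fitting."
        )

    def logistic(t, plateau, slope, midpoint):
        return plateau / (1.0 + np.exp(-slope * (t - midpoint)))

    times = np.asarray(times, dtype=float)
    signals = np.asarray(signals, dtype=float)
    # Crude starting point: highest signal, unit slope, middle of the run.
    guess = (signals.max(), 1.0, times.mean())
    params, _covariance = curve_fit(logistic, times, signals, p0=guess)
    return tuple(float(p) for p in params)
```

With this layout, importing the module never fails because of scipy;
only the call to the fitting function does, and the error message tells
the user what to install.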

Marco

----- Message from marco.galardini at unifi.it ---------
    Date: Tue, 01 Apr 2014 00:59:32 +0100
    From: Marco Galardini <marco.galardini at unifi.it>
Reply-To: Marco Galardini <marco.galardini at unifi.it>
 Subject: Re: [Biopython-dev] Interested in a Phenotype Microarray parser?
      To: Peter Cock <p.j.a.cock at googlemail.com>
      Cc: Biopython-Dev Mailing List <biopython-dev at lists.open-bio.org>


> Hi,
>
> as suggested, I've made a few changes to the proposed Bio.Phenotype
> module (apart from the less-omics name).
>
> The PlateRecord object can now be indexed in a similar fashion to
> AlignIO multiple alignments: it is still possible to use the WellRecord
> identifier as an index, but when integers or slices are used, new
> sub-plates or single wells are returned. The system uses the well
> identifier as a means to divide the plate into rows/columns. Thanks for
> pointing out the AlignIO system, it has been very useful.
> I've left the two getColumns and getRows functions, since for some
> people it may still be useful to use the well identifiers. If you feel
> they are too confusing I can remove them.
>
> The updated branch is here:
> https://github.com/mgalardini/biopython/tree/phenomics
>
> Kind regards,
> Marco
>
>
> On 26/03/2014 13:26, Marco Galardini wrote:
>> Hi,
>>
>> many thanks for your comments, below are some replies:
>>
>> ----- Message from p.j.a.cock at googlemail.com ---------
>>     Date: Wed, 26 Mar 2014 10:14:53 +0000
>>     From: Peter Cock <p.j.a.cock at googlemail.com>
>> Reply-To: Peter Cock <p.j.a.cock at googlemail.com>
>>  Subject: Re: [Biopython-dev] Interested in a Phenotype Microarray parser?
>>       To: Marco Galardini <marco.galardini at unifi.it>
>>       Cc: Biopython-Dev Mailing List <biopython-dev at lists.open-bio.org>
>>
>>
>>> On Tue, Mar 25, 2014 at 11:40 PM, Marco Galardini
>>> <marco.galardini at unifi.it> wrote:
>>>> Hi all,
>>>>
>>>> following your suggestions (as well as the other modules' implementations)
>>>> I've just pushed a couple of commits to my biopython fork, featuring the
>>>> Bio.Phenomics module.
>>>> The module's capabilities are limited to reading/writing Phenotype Microarray
>>>> files and basic operations on the PlateRecord/WellRecord objects. The module
>>>> requires numpy to interpolate the signal when the user requests a time point
>>>> that wasn't in the input file (this way the WellRecord object can be queried
>>>> with slices).
>>>> I'm thinking about how to implement the parameter extraction from WellRecord
>>>> objects without the use of scipy.
>>>>
>>>> Here's the link to my branch:
>>>> https://github.com/mgalardini/biopython/tree/phenomics
>>>> The module and functions have been documented taking inspiration from the
>>>> other modules: hope they are clear enough for you to try it out.
>>>> Some example files can be found in Tests/Phenomics.
>>>>
>>>> Marco
>>>
>>> Hi Marco,
>>>
>>> I've not worked with this kind of data so my comments are not on
>>> the application specifics. But I'm pleased to see unit tests :)
>>>
>>> One thought was while you define (Java like?) getRow and getColumn
>>> methods, your __getitem__ does not support (NumPy like) access,
>>> which is something we do for multiple sequence alignments. I guess
>>> while most plates are laid out in a grid, the row/column for each
>>> sample is not the most important thing - the sample identifier is?
>>>
>>> Thinking out loud, would properties `rows` and `columns` etc be
>>> nicer than `getRow` and `getColumn`, supporting iteration over
>>> the rows/columns/etc and indexing?
>>
>> Yeah, absolutely: I'll work on some changes to have a more
>> straightforward way to select multiple WellRecords on a row/column
>> basis.
>>
>>>
>>> Minor: Your longer function docstrings do not follow PEP257,
>>> specifically starting with a one line summary, then a blank line,
>>> then the details. Also you are using triple single-quotes, rather
>>> than triple double-quotes (like the rest of Biopython).
>>> http://legacy.python.org/dev/peps/pep-0257/
>>
>> Whoops, I'll change it, thanks
>>
>>>
>>> Peter
>>>
>>> P.S. Also, I'm not very keen on the module name, phenomics -
>>> I wonder if it would earn Biopython a badomics award? ;)
>>> http://dx.doi.org/10.1186/2047-217X-1-6
>>
>> That's meta-omics right? :p
>> What about 'Phenotype' then? Maybe it's too general, but future   
>> extensions may include other phenotypic readouts.
>>
>> Marco
>>> _______________________________________________
>>> Biopython-dev mailing list
>>> Biopython-dev at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/biopython-dev
>>>
>>
>>
>> ----- End of message from p.j.a.cock at googlemail.com -----
>>
>>
>>
>> Marco Galardini
>> Postdoctoral Fellow
>> EMBL-EBI - European Bioinformatics Institute
>> Wellcome Trust Genome Campus
>> Hinxton, Cambridge CB10 1SD, UK
>> Phone: +44 (0)1223 49 2547
>>
>>
>
> -- 
> -------------------------------------------------
> Marco Galardini, PhD
> Dipartimento di Biologia
> Via Madonna del Piano, 6 - 50019 Sesto Fiorentino (FI)
>
> e-mail: marco.galardini at unifi.it
> www: http://www.unifi.it/dblage/CMpro-v-p-51.html
> phone:  +39 055 4574737
> mobile: +39 340 2808041
> -------------------------------------------------
>


----- End of message from marco.galardini at unifi.it -----



Marco Galardini
Postdoctoral Fellow
EMBL-EBI - European Bioinformatics Institute
Wellcome Trust Genome Campus
Hinxton, Cambridge CB10 1SD, UK
Phone: +44 (0)1223 49 2547

