[Biopython-dev] Interested in a Phenotype Microarray parser?

Wed Jan 8 10:32:40 UTC 2014

Hi,

On 01/08/2014 06:53 AM, Michiel de Hoon wrote:
>> any specification on the style guide for the biopython parsers?
> There is no strict set of rules, but to get you started, many modules
> follow this format:
> - Assuming a PM data file contains only a single data set, the module
> should contain a function "read" that takes either a file name or a file
> handle as the argument.
Unfortunately, the situation is a bit mixed up: there are basically
three file formats for PM data: csv files (which can contain one or
more data sets, or 'plates') and yaml/json files, which can also
contain some metadata. I would therefore use an approach similar to
the SeqIO module: a parse() function that iterates over the records,
and a read() function that raises an exception if the file contains
more than one record.
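
A rough sketch of that SeqIO-style layout, just to make the idea
concrete. Everything specific here is an assumption made up for
illustration: the simplified CSV layout (a "Plate,<id>" line starting
each plate, followed by "well,signal,signal,..." rows), the Record
fields, and the choice of ValueError. It is not the real PM export
format.

```python
import csv
import io


class Record:
    """Hypothetical container for one plate: an ID plus well -> signals."""

    def __init__(self, plate_id):
        self.plate_id = plate_id
        self.wells = {}


def parse(handle):
    """Yield one Record per plate found in the (assumed) CSV layout."""
    record = None
    for row in csv.reader(handle):
        if not row:
            continue  # skip blank lines
        if row[0] == "Plate":
            # A new plate starts: hand back the previous one, if any.
            if record is not None:
                yield record
            record = Record(row[1])
        elif record is not None:
            record.wells[row[0]] = [float(x) for x in row[1:]]
    if record is not None:
        yield record


def read(handle):
    """Return the only Record in handle; raise ValueError otherwise."""
    records = parse(handle)
    try:
        record = next(records)
    except StopIteration:
        raise ValueError("No records found in handle") from None
    try:
        next(records)
    except StopIteration:
        return record
    raise ValueError("More than one record found in handle")


# Usage with an in-memory file holding a single plate.
single = io.StringIO("Plate,PM01\nA01,0.0,1.5\nA02,0.0,2.5\n")
plate = read(single)
```

Building read() on top of parse() keeps the format handling in one
place, so multi-plate csv files and single-plate files share the same
code path.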

> - The module should contain a class (typically called "Record") that
> can store the data in the data file. The "read" function returns an
> object of this class.
> - Try to avoid third-party dependencies if at all possible.
So far the dependencies would be PyYAML (for the yaml/json parsing, but
maybe I could use the stdlib json module) and numpy/scipy for the
extraction of curve parameters. Does this sound ok?
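
For the json case, the stdlib route could look roughly like the sketch
below. The key names ("measurements" and the metadata keys) and the
read_json() helper are hypothetical, not taken from the actual PM
files. Note that this only covers files that are valid JSON; yaml
files that are not also valid JSON would still need a YAML parser.

```python
import io
import json


def read_json(handle):
    """Load one plate from a JSON handle using only the standard library.

    Returns (metadata, measurements); "measurements" is an assumed key.
    """
    data = json.load(handle)
    # Treat every top-level key except the signal data as metadata.
    metadata = {k: v for k, v in data.items() if k != "measurements"}
    return metadata, data.get("measurements", {})


# Usage with an in-memory JSON document.
text = '{"plate": "PM01", "strain": "X", "measurements": {"A01": [0.0, 1.5]}}'
metadata, measurements = read_json(io.StringIO(text))
```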
>
> Would it make sense to have a single Bio.Microarray module that can
> house the various microarray parsers (PM, Affy, others)?
I don't know if that would be a good strategy: Phenotype Microarrays
are very different from proper microarrays; how about a "phenomics"
module instead?

>
> Best,
> -Michiel.
Kind regards,
Marco