[Open-bio-l] Common Sample Data Collection, was: SCF files (Staden)

Peter Rice pmr at ebi.ac.uk
Wed Nov 30 06:04:49 EST 2011


On 11/30/2011 10:41 AM, Peter Cock wrote:
> On Wed, Nov 30, 2011 at 10:30 AM, Peter Rice<pmr at ebi.ac.uk>  wrote:
>>
>> BioLib is just swig wrappers around the existing Bio* interfaces and
>> code, so it will not help in this case if the projects are too divergent.
>>
>> Could we set up a Bio* collection of data formats with examples and
>> note which projects can handle each one?
>>
>> We do not need any one project to cover everything - we can reasonably
>> expect users to use some other project to interconvert formats if there are
>> gaps.
>
> Good plan. I suggest we make a repository on github, perhaps
> bio-data or something like that, under the recently created OBF
> account, https://github.com/OBF
>
> Peter R - do you have a GitHub account yet? If so we (me,
> Chris Field, etc) can give you access to the OBF org account.

No ... rather a pain that the EMBOSS name was already taken. I've
registered under another name, EMBOSSTEAM, and created an EMBOSS project
under it.

Looks like git import requires Subversion for any automation. Presumably
I need a fresh EMBOSS checkout from CVS and then commit everything by
hand ... best done after the release 6.5.0 code freeze.

> For licensing, where we are free to choose the licence, I would
> like to go with something as liberal as possible to allow the
> files to be used by any OSS project (or closed source project),
> (e.g. Public Domain, CC0, MIT/BSD) rather than something
> more principled but restricted like CC-BY or CC-BY-ND.

Public domain would be my choice - we don't want to cause conflicts if
any data is imported into other projects (e.g. as test cases).

> However, as we know from recent Debian packaging
> discussion about test cases taken from UniProt, licensing
> and copyright of samples from a database is complicated.
> Here we must at least keep careful records about where
> data came from.

For that reason we should probably fake all the files for the public
database formats, rather than copy real entries.

regards,

Peter Rice
EMBOSS team

