[Biopython-dev] PopGen code

Fri Jan 12 14:30:26 UTC 2007

Hi,
While I do have a remote interest in this, I do not have any time to
look at it at present. As I mentioned in a previous email, John Cole
is doing related work in Python, though not as part of BioPython.
It would probably be good to have a unified approach and direction,
both because of the overlaps that occur and so that other pieces of
code can be added easily.

Regards
Bruce

On 1/12/07, Tiago Antão <tiagoantao at gmail.com> wrote:
> Hi,
>
> Thanks for the answer.
>
> > I suspect BioPython currently has no active developers who feel
> > qualified to interpret your population genetics code.  I was hoping that
> > you and Ralph Haygood would combine forces - if you are both happy with
> > some code, that does bode well.  Any comments, Michiel?
>
> I think Ralph (who subscribes to this list, and thus can comment) has
> strong time constraints, and will probably have little available time
> in the near future...
>
> > Regarding population genetic file formats - from a very quick search
> > about Arlequin it sounds like this file format can hold lots of
> > different types of data.  I would encourage you to try and come up with
> > a generic population record data object that could hold this or
> > information from GenePop or Fdist as well.  I have no idea how easy this
> > would be...
>
> I have been thinking a lot about a generic data structure to hold
> population genomic (i.e. not only genetic) data. I have, in fact,
> implemented (in CAML, not Python) quite a few different data
> representations. I was not happy with any of them. Different kinds of
> markers (which sometimes overlap, e.g. sequences and SNPs), linkage
> disequilibrium (thus relations between markers...), and ploidy (no need
> to think of different organisms; think mitochondria, nuclear
> chromosomes, Y chromosome), ... make a general solution non-trivial.
> As I see it, there are a few options:
> 1. Have a grand, unified structure, but that will take time to mature
> 2. Assume that there will be different representations for different
> scopes, accept that this is a bad thing, and live with it
> 3. Assume that there will be different representations, and that this
> is good, in the sense that a one-size-fits-all approach in this case
> has lots of problems
>
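One way to picture the ploidy point above is a small, scope-specific
record in which ploidy is simply the number of alleles stored at a
locus. This is only a minimal sketch in Python; Individual, genotypes
and allele_counts are hypothetical names chosen for illustration, not
existing BioPython API and not the code under discussion.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, Iterable, Tuple


@dataclass
class Individual:
    """Hypothetical record: one sampled individual."""
    name: str
    # locus name -> tuple of alleles; the tuple length is the ploidy
    # at that locus (2 for a nuclear marker, 1 for a mitochondrial one)
    genotypes: Dict[str, Tuple[str, ...]] = field(default_factory=dict)


def allele_counts(individuals: Iterable[Individual], locus: str) -> Counter:
    """Count alleles at one locus, whatever the ploidy of each genotype."""
    counts = Counter()
    for ind in individuals:
        # Individuals with no data at this locus contribute nothing.
        counts.update(ind.genotypes.get(locus, ()))
    return counts


pop = [
    Individual("ind1", {"nuc1": ("A", "G"), "mt1": ("T",)}),
    Individual("ind2", {"nuc1": ("A", "A"), "mt1": ("C",)}),
]
nuc_counts = allele_counts(pop, "nuc1")
mt_counts = allele_counts(pop, "mt1")
```

Note what the sketch leaves out: overlapping marker types and linkage
between markers, which are the parts that make a general structure hard.
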
> I think the pragmatic approach for now is not to have a generic
> representation. I would lean more towards letting things mature
> (develop statistics, parsers, ...) and, once there is more experience
> (and, hopefully, user feedback), reassessing the issue of a general
> representation. I am aware that this will entail each part of the code
> having a different calling data structure, but I think that with care
> and common sense that won't be very problematic.
>
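The cost of having no generic representation can probably be kept small
if each statistic accepts a plain built-in structure and each parser or
record type supplies a thin adapter. A minimal sketch in Python, under
that assumption: both function names are hypothetical, and the statistic
is the textbook expected heterozygosity (1 minus the sum of squared
allele frequencies) with no sample-size correction.

```python
from collections import Counter
from typing import Iterable, Mapping, Sequence


def expected_heterozygosity(allele_counts: Mapping[str, int]) -> float:
    """Return 1 - sum(p_i ** 2) for a plain allele -> count mapping."""
    total = sum(allele_counts.values())
    if total == 0:
        raise ValueError("no alleles observed")
    return 1.0 - sum((n / total) ** 2 for n in allele_counts.values())


def counts_from_genotypes(genotypes: Iterable[Sequence[str]]) -> Counter:
    """Adapter: turn a list of genotypes into the mapping used above."""
    counts = Counter()
    for genotype in genotypes:
        counts.update(genotype)
    return counts


# Two diploid genotypes: alleles A, G and A, A.
h = expected_heterozygosity(counts_from_genotypes([("A", "G"), ("A", "A")]))
```

A different record type would only need its own small adapter; the
statistic itself would not change.
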
> I don't mind having the code on an alpha branch for as long as you see
> fit. I just want to be sure that whatever effort I put into converting
> my code to BioPython (or writing new code) is not lost; that is why I
> would like feedback on what will happen to the code that I am
> submitting. I am willing to accommodate any reasonable requirements
> regarding code quality and development process...
>
> Regards,
> Tiago
>
> --
> Good judgment comes from experience.
> Experience comes from bad judgment.
> - Unknown author