[BioPython] calculate F-Statistics from SNP data

Peter biopython at maubp.freeserve.co.uk
Wed Oct 22 06:34:21 EDT 2008


On Wed, Oct 22, 2008 at 11:25 AM, Giovanni Marco Dall'Olio
<dalloliogm at gmail.com> wrote:
> Maybe the biggest issue is that I will have to use this library to parse
> very big files, so there are a few things we could change in the
> implementation of the parser.
> Is there any way in python to force the interpreter to store variables in
> temporary files instead of RAM memory?
> I was thinking about modules like shelve, cPickle, but I am not sure they
> work in this way.

I have not looked at the specifics here, but adopting an iterator
approach might make sense - returning the entries one by one as parsed
from the file.  This is the idea for the Bio.SeqIO and Bio.AlignIO
parsers.  The user can then turn the entries into a list (if they have
enough memory), filter them as they arrive, etc.  For example, you
could compile a list of only those desired population entries,
discarding the others on the fly.
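
As a rough sketch of what I mean (not the actual parser): the record
layout below, one whitespace-separated "population marker genotype"
entry per line, is invented purely for illustration, and the names
iter_records and only_populations are hypothetical. The point is only
that a generator hands back one entry at a time, so the whole file
never has to sit in memory.

```python
import io


def iter_records(handle):
    """Yield (population, marker, genotype) tuples, one per input line.

    The three-column layout is an assumption made for this example only.
    """
    for line in handle:
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        population, marker, genotype = line.split()
        yield population, marker, genotype


def only_populations(records, wanted):
    """Pass through only the records whose population is in `wanted`."""
    for record in records:
        if record[0] in wanted:
            yield record


# A small in-memory stand-in for a very large file.
example = io.StringIO(
    "# population marker genotype\n"
    "popA rs1 AA\n"
    "popB rs1 AG\n"
    "popA rs2 GG\n"
)

# Only the selected entries are ever collected into a list;
# the popB entry is discarded as soon as it has been read.
selected = list(only_populations(iter_records(example), {"popA"}))
```

With a real file you would pass an open file handle instead of the
StringIO object, and nothing else would need to change.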

Peter

