[BioPython] calculate F-Statistics from SNP data

Peter biopython at maubp.freeserve.co.uk
Thu Oct 23 17:01:26 UTC 2008


Giovanni wrote:
> So, how should we modify the current GenePop parser to make it work as an
> iterator?

I think this would mean breaking up the current Record object (which
holds everything) into sub-records which can be yielded one by one.
This would require an API change, unless you wanted to continue to
offer the two approaches in parallel (not elegant, but see
Bio/Sequencing/Ace.py for an example of where this made sense to do).
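
As a rough sketch of what "yielding sub-records one by one" could look
like, here is a generator that returns one population at a time instead
of a single Record holding everything. The function name
iterate_populations and the (loci, individuals) tuple it yields are
made up for illustration; they are not part of Bio.PopGen.GenePop. It
also assumes a simple GenePop layout: a title line, then locus names,
then "Pop" separator lines, each followed by "name , genotypes" lines.

```python
from io import StringIO


def iterate_populations(handle):
    """Yield (loci, individuals) for each population in a GenePop-style file.

    Hypothetical helper for illustration only, not a Biopython function.
    """
    handle.readline()  # skip the free-text title line
    loci = []
    individuals = None  # stays None until the first "Pop" line is seen
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.upper() == "POP":
            # A new population starts; hand back the previous one, if any.
            if individuals is not None:
                yield loci, individuals
            individuals = []
        elif individuals is None:
            # Header section: locus names, one per line or comma separated.
            loci.extend(part.strip() for part in line.split(",") if part.strip())
        else:
            # Individual line: the name, a comma, then one genotype per locus.
            name, genotypes = line.split(",", 1)
            individuals.append((name.strip(), genotypes.split()))
    if individuals is not None:
        yield loci, individuals


EXAMPLE = """Example title
Loc1
Loc2
Pop
ind1 , 0101 0202
ind2 , 0102 0201
Pop
ind3 , 0202 0101
"""

for loci, individuals in iterate_populations(StringIO(EXAMPLE)):
    print(loci, individuals)
```

Only one population is held in memory at a time, which is the main
attraction of the iterator approach for large files.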

> Now it has 'Scanner' and 'Consumer' methods. Should I remove them and
> write a RecordIterator instead?
> ...
> Can you explain to me, more or less, how the 'Consumer' object works? Is it
> mandatory to use it when creating Biopython objects?

You can write an iterator with or without the Scanner/Consumer style of parser.

The Scanner/Consumer system is very flexible if you want to parse the
data into different objects (by using different consumers).  In theory
the end user could also use the provided scanner with their own
consumer.  However, in my opinion this was overkill (needlessly
complicated) for parsing sequence file formats, as only one object is
really needed to represent a sequence (we have the SeqRecord for
this).  Most of the recent parsers in Bio.SeqIO and Bio.AlignIO
therefore do not use the scanner/consumer setup.
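
To make the division of labour concrete, here is a minimal sketch of
the scanner/consumer idea: the scanner only recognises the structure of
the file and fires events, while each consumer decides what object to
build from those events. The class and method names below
(PopScanner, CountingConsumer, NameConsumer, start_population,
data_line) are invented for this example and are not the private
classes in Bio.PopGen.GenePop.

```python
from io import StringIO


class PopScanner:
    """Walk a GenePop-style file and report events; builds no objects itself."""

    def feed(self, handle, consumer):
        consumer.start_record()
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.upper() == "POP":
                consumer.start_population()
            else:
                consumer.data_line(line)
        consumer.end_record()


class CountingConsumer:
    """Build a list with the number of individuals in each population."""

    def start_record(self):
        self.counts = []

    def start_population(self):
        self.counts.append(0)

    def data_line(self, line):
        if self.counts:  # ignore title and locus lines before the first Pop
            self.counts[-1] += 1

    def end_record(self):
        pass


class NameConsumer:
    """Build a list of individual names for each population."""

    def start_record(self):
        self.populations = []

    def start_population(self):
        self.populations.append([])

    def data_line(self, line):
        if self.populations:  # ignore title and locus lines before the first Pop
            self.populations[-1].append(line.split(",", 1)[0].strip())

    def end_record(self):
        pass


SAMPLE = "Title\nLoc1\nPop\nind1 , 0101\nind2 , 0102\nPop\nind3 , 0202\n"

counter = CountingConsumer()
PopScanner().feed(StringIO(SAMPLE), counter)

names = NameConsumer()
PopScanner().feed(StringIO(SAMPLE), names)
```

The same scanner serves both consumers unchanged, which is the
flexibility described above; the cost is an extra layer of classes
when only one kind of result object is ever needed.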

See also the short Tutorial section "Parser Design".
http://biopython.org/DIST/docs/tutorial/Tutorial.html

For population genetics, given that there is no single universal record
object, perhaps the flexibility of the Scanner/Consumer system is
worthwhile.  On the other hand, Tiago currently has the scanner and
consumer in Bio.PopGen.GenePop as private objects, so they are an
implementation detail - one could replace them without breaking the
public API.

Peter


