[BioPython] Bio.kNN documentation

Peter biopython at maubp.freeserve.co.uk
Mon Oct 6 10:39:15 UTC 2008


Bruce wrote:
>>> I did not see any examples for k-nearest neighbor so below is
>>> (very bad) code using the logistic regression example
>>> (http://biopython.org/DIST/docs/cookbook/LogisticRegression.html).

Peter wrote:
>> This is a set of Bacillus subtilis gene pairs for which the operon
>> structure is known, with the intergene distance and gene expression
>> score as explanatory variables, with the class being same operon or
>> different operons.
>> ...
>> Coupled with a scatter plot (say with pylab, showing the two classes
>> in different colours), this could be turned into a nice little example
>> for the cookbook section of the tutorial.  Notice that later on in the
>> logistic regression example there is a second table of "test data"
>> which could be used to make de novo predictions.
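
As a rough illustration of the idea in the quoted text (two explanatory
variables, two classes, then de novo predictions), here is a minimal
pure-Python sketch of k-nearest neighbour classification.  It does not
use Bio.kNN itself; the function name knn_classify and all the numbers
are hypothetical placeholders, not the real B. subtilis table from the
logistic regression example.

```python
from collections import Counter
import math


def knn_classify(xs, ys, query, k=3):
    """Return the majority class among the k training points nearest to query."""
    def distance(point):
        # Plain Euclidean distance between a training point and the query.
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(point, query)))

    # Sort (point, label) pairs by distance to the query and keep the k closest.
    nearest = sorted(zip(xs, ys), key=lambda pair: distance(pair[0]))[:k]
    # Majority vote over the labels of those k neighbours.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (intergene distance, gene expression score).
# Class 1 = same operon, class 0 = different operons.
xs = [(-50.0, -200.0), (20.0, -180.0), (10.0, -220.0),
      (150.0, -400.0), (200.0, -350.0), (120.0, -380.0)]
ys = [1, 1, 1, 0, 0, 0]

# "De novo" predictions for two made-up gene pairs.
print(knn_classify(xs, ys, (5.0, -210.0)))
print(knn_classify(xs, ys, (160.0, -370.0)))
```

With real data the two variables are on different scales, so it would
probably be worth rescaling them (or choosing another distance function)
before trusting the neighbours.  Using an odd k avoids tied votes when
there are only two classes.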

Bruce wrote:
> I did realize that this was coming... :-)
> (I guess I am volunteering myself to provide some material on
> machine learning with BioPython. So this is a start.)

Michiel has suggested adding a whole chapter to the tutorial about
supervised learning, presumably incorporating his logistic regression
example as part of this.  Have a look at the thread "Bio.MarkovModel;
Bio.Popgen, Bio.PDB documentation" on the dev mailing list.  I'm sure
you can contribute (even if just by proofreading).

Peter


