[BioPython] Help with NaiveBayes

Sam Volchenboum volcs0 at gmail.com
Wed Dec 21 10:31:47 EST 2005


I'm trying to get Biopython's NaiveBayes module up and running.

I just can't find any examples out there to learn from (which is how I
usually figure these things out).

I have a set of proteins (say 100) that are each present or absent in
healthy and diseased samples. I have 10 healthy and 10 disease
samples. This is mass spec data.

So, I have a matrix where the rows are proteins (1-100) and the
columns are samples (10 healthy and 10 disease, 20 total), and each
cell is a 1 or a 0 (present/absent).
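
A protein-by-sample matrix like the one described above has one column
per sample, whereas the training_set attempted later in this message
has one inner list per sample. A minimal sketch of converting between
the two layouts in plain Python follows. The toy data and the variable
names protein_by_sample, sample_by_protein and labels are made up for
illustration and are not taken from the real data set.

```python
# Hypothetical toy data: 4 proteins (rows) x 6 samples (columns).
# The first 3 columns stand for healthy samples, the last 3 for disease.
protein_by_sample = [
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 1],
    [1, 0, 1, 1, 0, 1],
    [0, 0, 0, 0, 1, 0],
]
labels = ["Healthy"] * 3 + ["Disease"] * 3

# Transpose so that each inner list is one sample (one observation),
# holding that sample's present/absent value for every protein.
sample_by_protein = [list(column) for column in zip(*protein_by_sample)]

# After transposing there is exactly one label per observation.
assert len(sample_by_protein) == len(labels)
```

With the real data this would give 20 observations of 100 values each,
paired with 20 labels.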

I want to create a NaiveBayes classifier based on this training data
and see if it predicts health/disease based on a new set of data (a
new set of results for the 100 proteins).
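
To make the goal concrete, here is a small self-contained sketch of a
naive Bayes classifier for present/absent features, written in plain
Python. It is a stand-in to show the idea and is not Biopython's
NaiveBayes implementation; the function names train_bernoulli_nb and
classify, the Laplace smoothing, and the toy data are all assumptions
made for this illustration.

```python
import math
from collections import defaultdict


def train_bernoulli_nb(observations, labels):
    """Estimate class priors and per-feature 'present' probabilities."""
    by_class = defaultdict(list)
    for obs, label in zip(observations, labels):
        by_class[label].append(obs)
    n_features = len(observations[0])
    model = {}
    for label, rows in by_class.items():
        prior = len(rows) / len(observations)
        # Laplace smoothing (+1 / +2) keeps every probability away
        # from exactly 0 or 1, so the logs below are always defined.
        p_present = [
            (sum(row[j] for row in rows) + 1) / (len(rows) + 2)
            for j in range(n_features)
        ]
        model[label] = (prior, p_present)
    return model


def classify(model, observation):
    """Return the label with the highest log posterior score."""
    best_label, best_score = None, -math.inf
    for label, (prior, p_present) in model.items():
        score = math.log(prior)
        for x, p in zip(observation, p_present):
            score += math.log(p if x else 1.0 - p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data: 6 samples, 4 proteins each (1 = present).
train_x = [
    [1, 1, 0, 0], [1, 1, 0, 1], [1, 0, 0, 0],   # healthy samples
    [0, 0, 1, 1], [0, 1, 1, 1], [0, 0, 1, 0],   # disease samples
]
train_y = ["Healthy"] * 3 + ["Disease"] * 3

model = train_bernoulli_nb(train_x, train_y)
prediction = classify(model, [1, 1, 0, 0])
```

The new sample is passed in the same layout as a training row: one
present/absent value per protein, in the same protein order.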

For the training_set, I've tried this format:

[[1, 0, 1, 1, 0, 1], [0, 0, 1, 1, 1], [0, 0, 0, 0, 0]]

which would be an example of three states and five proteins (on or off).

And results like this: ['Healthy', 'Disease', 'Disease']

But NaiveBayes.train(training_set, results) fails with an error saying
the two lists need to be the same length. I thought they were: both
have length 3.
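
One way to narrow this down is to check the shapes before calling
train. The sketch below uses a hypothetical helper,
check_training_data, which is not part of Biopython; it only tests
that there is one label per observation and that all observations have
the same number of values. Run on the lists exactly as typed above, it
reports that the first observation has six values while the other two
have five. Whether that is what triggers the error is not confirmed
here.

```python
def check_training_data(training_set, results):
    """Return a list of human-readable problems; empty if none found."""
    problems = []
    # There should be exactly one label per observation.
    if len(training_set) != len(results):
        problems.append(
            "%d observations but %d labels" % (len(training_set), len(results))
        )
    # Every observation should have the same number of features.
    row_lengths = sorted(set(len(row) for row in training_set))
    if len(row_lengths) > 1:
        problems.append("rows have different lengths: %s" % row_lengths)
    return problems


# The lists exactly as written in this message.
training_set = [[1, 0, 1, 1, 0, 1], [0, 0, 1, 1, 1], [0, 0, 0, 0, 0]]
results = ["Healthy", "Disease", "Disease"]

problems = check_training_data(training_set, results)
for problem in problems:
    print(problem)
```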

Any help, advice, push, shove... etc., is greatly appreciated.

Thanks.

sam


