[Biopython] quantile normalization method

Bartek Wilczynski bartek at rezolwenta.eu.org
Sat Mar 20 07:55:20 UTC 2010


On Sat, Mar 20, 2010 at 4:56 AM, Vincent Davis <vincent at vincentdavis.net> wrote:

> Is there a quantile normalization method in Biopython? I searched but did not
> find one. If not, it looks straightforward; would it be of any interest to the
> community for me to contribute a method?
>
> 1. given n arrays of length p, form X of dimension
> p × n where each array is a column;
> 2. sort each column of X to give X_sort;
> 3. take the means across rows of X_sort and assign this
> mean to each element in the row to get X'_sort;
> 4. get X_normalized by rearranging each column of
> X'_sort to have the same ordering as original X
>
> From:
> "A comparison of normalization methods for high
> density oligonucleotide array data based on
> variance and bias"
> B. M. Bolstad, R. A. Irizarry, M. Åstrand and T. P. Speed
>

 Hi,

I don't think there is such a method available.

I use the original R implementation by Bolstad et al. myself. It requires R
(with the preprocessCore package) and rpy2 to be installed, but it takes only
a few lines of code:

<pre>
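import numpy
import rpy2.robjects as robjects

# ll = flat list of the concatenated values to normalize
v = robjects.FloatVector(ll)
# numrows = number of vectors that were concatenated into ll;
# byrow=True puts each of those vectors in its own row of the matrix
m = robjects.r['matrix'](v, nrow=numrows, byrow=True)
# load the Bioconductor package that provides normalize.quantiles
robjects.r('require("preprocessCore")')
normq = robjects.r('normalize.quantiles')
# note: normalize.quantiles treats each column of m as one array
norm_a = numpy.array(normq(m))
# norm_a = normalized array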
 </pre>

If your method is a pure Python implementation that is comparably fast, I
think it would be worth having in Biopython: the method is (in my opinion)
quite useful, and it would remove the dependency on R from some of my
scripts.
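
As a starting point, the four steps quoted above could be sketched in NumPy
roughly as follows. This is only a minimal sketch, not existing Biopython
code: the function name quantile_normalize is made up for illustration, it
assumes a p x n array with one array per column and no missing values, and
ties are simply broken by sort order, which may not match what
preprocessCore does.

```python
import numpy as np


def quantile_normalize(x):
    """Quantile-normalize the columns of a p x n array (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    # Step 2: sort each column independently, remembering the ordering
    order = np.argsort(x, axis=0, kind="stable")
    x_sort = np.take_along_axis(x, order, axis=0)
    # Step 3: mean across each row of the column-sorted matrix
    row_means = x_sort.mean(axis=1)
    # Step 4: write each row mean back at the original position of the
    # element that held that rank in its column
    normalized = np.empty_like(x)
    np.put_along_axis(normalized, order, row_means[:, np.newaxis], axis=0)
    return normalized


# Three arrays of length four, one array per column
x = np.array([[5, 4, 3],
              [2, 1, 4],
              [3, 2, 6],
              [4, 3, 8]])
norm = quantile_normalize(x)
```

After the call, every column of norm holds the same set of values (the row
means of the sorted matrix), arranged in that column's original rank order.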

cheers
 Bartek



