Wednesday, June 5, 2013

Weighted Means: Part 1


The general idea is that each experiment, with a number of replicates n that can vary from experiment to experiment, is performed in exactly the same way but, as already said, random factors may influence it (the researcher’s state of mind, a different hand performing it, atmospheric conditions, etc.). However, unlike the case in which we used the experiment number as a random factor, here I consider the experiment, and not the replicate, as the statistical unit. Therefore, statistical power increases with the number of experiments performed (I suggest at least 5, although we will apply the method to our data from three experiments).

Suppose we have k experiments, each with its own number of replicates (n1, n2, …, nk). Each experiment has its own mean (X1, …, Xk), its own standard deviation (s1, …, sk) and therefore its own standard error (SE), calculated as SE1=s1/sqrt(n1), …, SEk=sk/sqrt(nk). When we calculate the weighted mean, we want to “weight” the general mean of means by the SE of each experiment, giving more weight to the experiments with the lowest SE (lower SD, higher n, or both). Therefore, we can calculate a weight for each experiment:

$$W_i = \frac{1}{SE_i^2}, \quad i = 1, \dots, k$$

And use this weight in the calculation of mean of means:

$$X_{best} = \frac{\sum_{i=1}^{k} X_i W_i}{\sum_{i=1}^{k} W_i}$$

We can also calculate the general SE starting from the standard errors of the experiments:

$$SE = \frac{1}{\sqrt{\sum_{i=1}^{k} W_i}}$$
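As a side note, the formula for the general SE can be recovered in a few lines. This is only a sketch under two assumptions that the post does not state explicitly: the k experiments are independent, and each SE_i is treated as a known quantity rather than an estimate. With Var(X_i) = SE_i^2 = 1/W_i:

$$\mathrm{Var}(X_{best}) = \frac{\sum_{i=1}^{k} W_i^2 \, \mathrm{Var}(X_i)}{\left(\sum_{i=1}^{k} W_i\right)^2} = \frac{\sum_{i=1}^{k} W_i^2 \cdot \frac{1}{W_i}}{\left(\sum_{i=1}^{k} W_i\right)^2} = \frac{\sum_{i=1}^{k} W_i}{\left(\sum_{i=1}^{k} W_i\right)^2} = \frac{1}{\sum_{i=1}^{k} W_i}$$

Taking the square root gives SE = 1/sqrt(sum of the W_i).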

Let’s look at our example (not normalized data, with n=5 replicates for each experiment):

 
Controls:
SE1=8.57/sqrt(5)=3.83; SE2=8.50/sqrt(5)=3.80; SE3=14.33/sqrt(5)=6.41 =>
W1=1/SE1^2=0.0682; W2=1/SE2^2=0.0693; W3=1/SE3^2=0.0243 =>
Xbest=(62*0.0682+72.6*0.0693+83.6*0.0243)/(0.0682+0.0693+0.0243)=11.29/0.1618=69.8
This value differs from the crude mean of the three means (72.7), because the third experiment has higher variability and therefore less weight in the weighted mean.
Let’s calculate SE:
1/sqrt(0.0682+0.0693+0.0243)=1/0.402=2.488
Note that the statistical unit is the experiment, so the sample size is k and therefore SD=SE*sqrt(k)=2.488*sqrt(3)=4.309
Again, this value is completely different from the mean of the SDs of the experiments (10.47).
We conclude that controls have a mean of 69.8 (SD: 4.31), with k-1=2 degrees of freedom (df) for each condition.
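The control calculation can be checked with a few lines of Python. This is a minimal sketch, not the code used for the post: the function name `weighted_summary` is made up for illustration, and because it keeps full precision in the intermediate SE and weight values, the third decimal can differ slightly from the hand calculation, which rounds at each step.

```python
import math

def weighted_summary(means, sds, ns):
    """Inverse-variance weighted mean of per-experiment means.

    Returns (x_best, se_best, sd_best), where sd_best = se_best * sqrt(k)
    treats the k experiments as the statistical units.
    """
    ses = [s / math.sqrt(n) for s, n in zip(sds, ns)]   # SE_i = s_i / sqrt(n_i)
    weights = [1 / se ** 2 for se in ses]               # W_i = 1 / SE_i^2
    total_w = sum(weights)
    x_best = sum(m * w for m, w in zip(means, weights)) / total_w
    se_best = 1 / math.sqrt(total_w)
    sd_best = se_best * math.sqrt(len(means))
    return x_best, se_best, sd_best

# Control data from the post: three experiments, n=5 replicates each.
control_means = [62, 72.6, 83.6]
control_sds = [8.57, 8.50, 14.33]
x_best, se_best, sd_best = weighted_summary(control_means, control_sds, [5, 5, 5])
crude_mean = sum(control_means) / len(control_means)

print(f"weighted mean = {x_best:.1f}, SE = {se_best:.2f}, SD = {sd_best:.2f}")
print(f"crude mean of means = {crude_mean:.1f}")
```

Run on the control data, this gives a weighted mean of about 69.8 and an SD of about 4.31, against a crude mean of about 72.7.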
Other conditions:
C1=89.6 (SD: 4.08)
C2=108.8 (SD: 3.77)
C3=131.8 (SD: 4.10)
C4=139.8 (SD: 4.20)
On this data, we may apply ANOVA and post hoc tests.
IMPORTANT: one might think that the method depends only on the number of experiments and not on the number of replicates. This is not true. The number of replicates determines the SD of the weighted mean (high n values greatly reduce the SD), and therefore the model TAKES BOTH INTO ACCOUNT.
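To make the role of the replicates concrete, the sketch below recomputes the SD of the weighted mean for the control SDs, first with the real n=5 and then with a hypothetical n=20 per experiment. The n=20 case is an assumption added purely for illustration (the per-experiment SDs are kept fixed), and `sd_of_weighted_mean` is a made-up helper name.

```python
import math

def sd_of_weighted_mean(sds, ns):
    """SD of the weighted mean, with the experiment as statistical unit.

    Uses W_i = n_i / s_i^2, SE = 1 / sqrt(sum W_i) and SD = SE * sqrt(k).
    """
    total_w = sum(n / s ** 2 for s, n in zip(sds, ns))
    return math.sqrt(len(sds) / total_w)

control_sds = [8.57, 8.50, 14.33]                         # from the post
sd_n5 = sd_of_weighted_mean(control_sds, [5, 5, 5])       # actual design
sd_n20 = sd_of_weighted_mean(control_sds, [20, 20, 20])   # hypothetical n

print(f"n=5:  SD = {sd_n5:.2f}")
print(f"n=20: SD = {sd_n20:.2f}")
```

With k fixed at 3, quadrupling the number of replicates halves the SD, because every weight W_i is proportional to n_i.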
The comparisons (Student’s t-test with Bonferroni’s correction for 4 comparisons, significance at p=0.05/4=0.0125) give:
C1 vs C p=0.0045
C2 vs C p<0.001
C3 vs C p<0.001
C4 vs C p<0.001
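The first comparison can be reproduced from the summary statistics alone. The post does not say which variant of the t-test was used, so the sketch below assumes a pooled-variance two-sample Student's t-test with k=3 experiments per condition. The helper names (`t_pdf`, `two_sided_p`, `pooled_t_test`) are made up for illustration, and the p-value is obtained by plain numerical integration of the t density so that only the standard library is needed.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=20000):
    """Two-sided p-value: 1 - 2 * integral of the density over [0, |t|].

    The integral uses Simpson's rule; steps must be even.
    """
    t = abs(t)
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1 - 2 * (total * h / 3)

def pooled_t_test(m1, sd1, n1, m2, sd2, n2):
    """Pooled-variance two-sample t-test from means, SDs and sample sizes."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    t = (m1 - m2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, df, two_sided_p(t, df)

# C1 vs controls, using the weighted means and SDs from the post (k=3 each).
t_stat, df, p_value = pooled_t_test(89.6, 4.08, 3, 69.8, 4.31, 3)
bonferroni_alpha = 0.05 / 4   # four comparisons against the control

print(f"t = {t_stat:.2f}, df = {df}, p = {p_value:.4f}")
print("significant after Bonferroni:", p_value < bonferroni_alpha)
```

Under these assumptions the result agrees with the p=0.0045 reported for C1 vs C.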
These results are substantially in line with what we found when we used the experiment as a random factor or when we put all the experiments together, indicating that the method is efficient even with few experiments.
Another possible calculation of the SD: it has no solid statistical basis but may be logically reasonable. Just as we calculate a weighted mean, we can calculate a weighted SD as:
$$SD_{best} = \frac{\sum_{i=1}^{k} SD_i W_i}{\sum_{i=1}^{k} W_i} = \frac{\sum_{i=1}^{k} n_i / SD_i}{\sum_{i=1}^{k} W_i}$$
SD of controls=(5/8.57 + 5/8.50 + 5/14.33)/0.1618 = 9.39
SD of C1 =(5/13.07+5/6.76+5/10.85)/0.181 = 8.75
Therefore, we have to compare C=69.8 (SD: 9.39) and C1=89.6 (SD: 8.75). In this case, however, since the weighted mean (SD) acts as a sort of “gold standard” experiment (see the approximated method), the number of df is n-1=5-1=4 for each condition.
Making the comparison: p=0.0087, again significant, but less so than in the previous case, despite n>k.
NOTE: in this last case, each experiment should have the same n so that df is uniquely defined.
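The weighted SD is easy to sketch as well. The function name `weighted_sd` is hypothetical, and since the weights are kept at full precision the results match the hand-calculated 9.39 and 8.75 only up to rounding in the second decimal.

```python
def weighted_sd(sds, ns):
    """Weighted average of the per-experiment SDs, using W_i = n_i / SD_i^2.

    Equivalent to sum(n_i / SD_i) / sum(W_i), as in the formula above.
    """
    weights = [n / s ** 2 for s, n in zip(sds, ns)]
    return sum(s * w for s, w in zip(sds, weights)) / sum(weights)

# Per-experiment SDs from the post, n=5 replicates in every experiment.
sd_controls = weighted_sd([8.57, 8.50, 14.33], [5, 5, 5])
sd_c1 = weighted_sd([13.07, 6.76, 10.85], [5, 5, 5])

print(f"controls: weighted SD = {sd_controls:.2f}")
print(f"C1:       weighted SD = {sd_c1:.2f}")
```

Because experiments with a small SD get large weights, this estimate is pulled towards the least variable experiments, which is one reason to treat it with caution.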
Next time we will look at the method by Bland and Kerry for obtaining a weighted estimate of the SD, and compare the results with those found today.
 
 
 
 
 
 
 

