First of all, we cannot choose our best test exclusively on the basis of
significance (e.g. best = most significant). The reason is that we can make two
types of errors: we see a significant effect when no real effect exists
(type I), or we fail to see a significant effect when a real effect exists
(type II). Therefore, we should choose our test based on its soundness
and statistical correctness:
1) Semi-clinical approach (12-APR): its
use makes sense if each experiment represents a different subject and the
replicates are only a measure of intra-subject variability. The test does not
consider the individual replicates, only the mean value of each experiment. For
this reason, it tends to underestimate the significance of the differences and
makes the replicates a secondary point;
2) Simple approach (22-APR): it may be
acceptable with few experiments, as it considers all the replicates together
without taking into account which experiment they come from. It may be
influenced by outlier experiments, which can inflate the sample variance;
3) The experiment as random factor
(9-may): it is a very elegant method to take into account the inter-experiment
variability, and with few experiments involving the same cells and a couple of
replicates each it appears a GOOD CHOICE; with many experiments, its
interpretation may be difficult and it may introduce statistical artifacts;
4) An approximated method (13-may): it
is not the best choice, as it tends to underestimate the significance, gives no
weight to the experiments, and is statistically questionable. However, it is
very simple to perform, requiring only a few simple calculations;
5) Weighted means PART 1 (5-jun): it is
a good choice, particularly with a larger number of experiments (e.g. >3-5). Its
generalization leads to Weighted means PART 2 (6-jun), based on the method
proposed by Bland and Kerry. Attention should be given to the degrees of
freedom used for the statistical analysis: n-2 df for each
experiment appears a sound choice. The method works with both few and many
experiments, but it appears a GOOD CHOICE with k>3-5 experiments.
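To make the contrast between approaches 1) and 2) concrete, here is a minimal sketch using only the Python standard library. The data, the helper names (`student_t`, `flatten`) and the choice of a two-sample Student's t statistic with pooled variance are assumptions made for illustration, not the exact procedure of the original posts. It only shows how the same measurements give different degrees of freedom, and so different apparent significance, depending on whether the unit of analysis is the experiment or the replicate.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical data: 3 experiments per group, 3 replicates per experiment.
control = [[10.1, 9.8, 10.3], [11.0, 11.4, 10.9], [9.5, 9.9, 9.7]]
treated = [[12.0, 12.4, 11.8], [13.1, 12.7, 13.0], [11.2, 11.6, 11.5]]


def student_t(a, b):
    """Return (t, df) for a two-sample Student's t-test with pooled variance."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(b) - mean(a)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df


def flatten(group):
    """Put all replicates of a group in one list, ignoring the experiment."""
    return [x for experiment in group for x in experiment]


# Approach 1 (semi-clinical): one value per experiment, the replicate mean.
t_means, df_means = student_t(
    [mean(e) for e in control], [mean(e) for e in treated]
)

# Approach 2 (simple): all replicates pooled together.
t_pooled, df_pooled = student_t(flatten(control), flatten(treated))

print(f"means per experiment: t = {t_means:.2f}, df = {df_means}")
print(f"pooled replicates:    t = {t_pooled:.2f}, df = {df_pooled}")
```

With these made-up numbers, the analysis on means has 4 df and the pooled analysis has 16 df, and the pooled t statistic is the larger of the two. This is the sense in which the first approach is conservative, while the second treats replicates as if they were independent observations.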
Although in
the peer-review process these aspects are often not considered by the
reviewers, and a study is often accepted without these details, I RECOMMEND
stating how both experiments and replicates were handled in the
statistical analysis. It is an appreciable benefit.
Finally,
another important consideration: with >2 groups, using Student's t-test or the
Mann-Whitney test to compare pairs of groups without a multiple-comparison
correction IS INCORRECT! TAKE CARE! In some cases, it can lead to the rejection
of the article!!
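The reason is that the chance of at least one false positive (type I error) grows with the number of pairwise tests. The sketch below illustrates this and applies two standard corrections, Bonferroni and Holm, to a list of p-values. The p-values and function names are hypothetical, and the family-wise error formula assumes independent tests, which is only an approximation for pairwise comparisons that share groups.

```python
def familywise_error_rate(alpha, m):
    """Chance of at least one false positive in m independent tests at level alpha."""
    return 1 - (1 - alpha) ** m


def bonferroni(p_values):
    """Bonferroni-adjusted p-values: each p multiplied by the number of tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def holm(p_values):
    """Holm step-down adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The smallest p is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[i])
        # Adjusted p-values must not decrease along the sorted order.
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted


# Hypothetical raw p-values from the 3 pairwise comparisons among 3 groups.
raw_p = [0.010, 0.020, 0.040]

print("FWER for 3 uncorrected tests:", round(familywise_error_rate(0.05, 3), 4))
print("Bonferroni:", [round(p, 3) for p in bonferroni(raw_p)])
print("Holm:      ", [round(p, 3) for p in holm(raw_p)])
```

With three groups there are three pairwise comparisons, so an uncorrected alpha of 0.05 per test already gives a family-wise error rate of about 0.14. After correction, only the adjusted p-values below 0.05 should be reported as significant.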