Sunday, 3 March 2013

A brief comment about the use of parametric or non-parametric statistics

The general rule is that non-parametric statistics should be applied when the data deviate substantially from normality or log-normality, and almost all statistical software packages include specific tests for assessing this (e.g. Kolmogorov-Smirnov, Shapiro-Wilk, etc.). This general rule undoubtedly holds for clinical and epidemiological data with large sample sizes. It should be taken into account that, whereas parametric statistics permit several types of multivariate analysis (e.g. n-way ANOVA, ANCOVA, mixed-effects models, MANOVA), non-parametric statistics are more limited, and final models based on non-parametric multivariate statistics are not widely accepted. Only a few programs offer such an extension.

What about in vitro toxicology, or in vivo toxicology with animals? The number of replicates for each experimental condition is generally limited. Although it is quite uncommon to calculate the sample size a priori, and there is no generally accepted gold standard, it is quite common to have fewer than 10-15 replicates for each condition per experiment. With such low n values, however, a normality test can hardly be considered fully reliable. Furthermore, rank tests applied to small samples greatly underestimate the significance of the differences, which can substantially increase the beta error (i.e. we do not see a significant difference although it really exists).

Therefore, in my opinion, parametric statistics are preferable in these cases, with two exceptions:

1) The mean (or geometric mean) and the median differ markedly because of outliers. If there is no reason to correct the data, non-parametric statistics are preferable in this case even with low n. It should be noted that with animals the inter-subject variability may produce outliers, while in in vitro models the phenomenon is better controlled, although not completely eliminated.

2) A percentage of the measurements is below the limit of detection (LOD) of the measurement method. Since the rank calculation gives all these values the lowest rank regardless of the value assigned to them (0, LOD, LOD/2, LOD/sqrt(2)), non-parametric statistics are not influenced by that assignment.

Based on these considerations, we will use parametric statistics for our further calculations and examples, as the generally preferred approach for in vitro methods with more than one experiment and relatively few replicates.
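Two of the points above can be checked with a few lines of code: how small a p-value a rank test can reach at all with very few replicates, and how the ranks (unlike the mean) ignore the value substituted for measurements below the LOD. What follows is only a minimal sketch, not part of the original analysis. The LOD of 0.5, the example measurements and the function names are all hypothetical, and the first function assumes an exact two-sided rank-sum (Mann-Whitney) test without ties.

```python
from math import comb, sqrt
from statistics import mean


def rank_sum_min_p(n1, n2):
    """Smallest two-sided exact p-value of a rank-sum test (no ties assumed)."""
    # Under the null hypothesis every assignment of ranks to group 1 is equally
    # likely, and only one assignment puts all of group 1 below all of group 2.
    return min(1.0, 2 / comb(n1 + n2, n1))


def midranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def substitute(values, fill):
    """Replace None (a measurement below the LOD) with the chosen fill value."""
    return [fill if v is None else v for v in values]


LOD = 0.5  # hypothetical limit of detection
# Hypothetical measurements; None marks a value below the LOD.
measured = [None, 0.8, None, 1.2, 2.5, None, 0.9]
fills = {"0": 0.0, "LOD/2": LOD / 2, "LOD/sqrt(2)": LOD / sqrt(2), "LOD": LOD}

for n in (3, 4, 5):
    print(f"n = {n} per group: smallest possible p = {rank_sum_min_p(n, n):.4f}")

for label, fill in fills.items():
    data = substitute(measured, fill)
    print(f"{label:>11}: mean = {mean(data):.3f}, ranks = {midranks(data)}")
```

With 3 replicates per group the smallest attainable two-sided p-value is 2/20 = 0.10, so under these assumptions the rank test cannot reach p < 0.05 however large the difference is. In the second loop the mean changes with every substitution rule, while the ranks stay the same as long as all measured values lie above the LOD.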
