mathematical models in toxicology: March 2013

Sunday, 31 March 2013

A Matter of Normalization

It is common in toxicology to normalize the experimental signal of the controls to 1 or 100. This makes the results independent of the particular conditions of each experiment and removes the inter-experiment variability of the control signal. Let me do it in practice with our data, starting from experiment 1. The original data in Excel:

The mean of C for experiment 1 is 62, so we divide all the values of experiment 1 by 62. In the same way, we divide the values of experiment 2 by 72.6 and those of experiment 3 by 83.6 (the means of their respective controls), obtaining the normalized database in the figure, where N indicates normalization:

This has some obvious consequences: the mean of CN is 1 in each experiment, and the means of CN1, CN2, CN3 and CN4 express the effect relative to the control, so that a mean of 1.5, for example, corresponds to a 50% increase over the control. We will use the normalized data in further simulations.
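The normalization described above can be sketched in a few lines of Python. This is only an illustration, not the spreadsheet used in the post: the function name `normalize` and the dictionary layout are assumptions, while the values are those listed for experiment 1 and for the controls of experiments 2 and 3.

```python
from statistics import mean

# Experiment 1 raw values as listed in the data post (arbitrary units).
exp1 = {
    "C":  [50, 65, 72, 66, 57],
    "C1": [102, 79, 83, 95, 69],
    "C2": [121, 116, 99, 107, 112],
    "C3": [118, 130, 135, 150, 115],
    "C4": [121, 137, 140, 155, 125],
}

# Controls of experiments 2 and 3, used only to check the divisors.
control_exp2 = [63, 67, 81, 82, 70]
control_exp3 = [101, 72, 66, 88, 91]


def normalize(experiment, control="C"):
    """Divide every value by the mean of the experiment's own control group.

    Hypothetical helper: group "C" becomes "CN", "C1" becomes "CN1", and so on.
    """
    ref = mean(experiment[control])
    return {
        "CN" + group[1:]: [value / ref for value in values]
        for group, values in experiment.items()
    }


exp1_norm = normalize(exp1)

print("divisors:", mean(exp1["C"]), mean(control_exp2), mean(control_exp3))
for group, values in exp1_norm.items():
    # The mean of CN is 1 by construction; the other means are
    # effects expressed relative to the control.
    print(group, round(mean(values), 3))
```

Each experiment is divided by its own control mean, which is why the control signal no longer contributes to the variability between experiments.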

Friday, 29 March 2013

THE DATA WE WILL USE

Let me introduce a set of fictitious data that we will use in all the simulations. They are in arbitrary units, meaning that they can be any sort of experimental signal. In this case, I consider an EFFECT and not a RESPONSE or a normalized inhibition (e.g. cell viability, % of dead cells, etc.), as a more general case that is easily extended to them. I consider three experiments with a Negative Control group C (e.g. unexposed cells) and four increasing concentrations of a toxicant (C1-C2-C3-C4). Each condition has n=5 replicates.

Experiment 1: C: 50-65-72-66-57; C1: 102-79-83-95-69; C2: 121-116-99-107-112; C3: 118-130-135-150-115; C4: 121-137-140-155-125.
Experiment 2: C: 63-67-81-82-70; C1: 75-89-92-88-90; C2: 99-95-110-89-103; C3: 108-121-105-118-130; C4: 108-121-119-130-135.
Experiment 3: C: 101-72-66-88-91; C1: 90-99-89-105-115; C2: 115-132-108-121-116; C3: 130-142-149-130-137; C4: 140-149-160-150-147.

Note that the number of replicates is too low to test for normality, but this is a typical case of data found in the literature, with a typical number of both experiments and replicates. No particular outliers are present within any experiment, which suggests that the parametric approach is acceptable. Therefore, we will proceed with parametric statistics. Where possible, however, I will indicate non-parametric alternatives for cases in which particularly variable values or outliers make parametric statistics inadequate.
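As a rough illustration of how this dataset can be held and inspected, the sketch below stores the three experiments in a nested dictionary and prints the mean, median and sample SD of each group. The layout and the names `data` and `summary` are assumptions made for the example. Comparing mean and median is only a crude screen for outliers, not a formal test.

```python
from statistics import mean, median, stdev

# Fictitious data from the post: experiment -> group -> five replicates.
data = {
    1: {"C": [50, 65, 72, 66, 57], "C1": [102, 79, 83, 95, 69],
        "C2": [121, 116, 99, 107, 112], "C3": [118, 130, 135, 150, 115],
        "C4": [121, 137, 140, 155, 125]},
    2: {"C": [63, 67, 81, 82, 70], "C1": [75, 89, 92, 88, 90],
        "C2": [99, 95, 110, 89, 103], "C3": [108, 121, 105, 118, 130],
        "C4": [108, 121, 119, 130, 135]},
    3: {"C": [101, 72, 66, 88, 91], "C1": [90, 99, 89, 105, 115],
        "C2": [115, 132, 108, 121, 116], "C3": [130, 142, 149, 130, 137],
        "C4": [140, 149, 160, 150, 147]},
}

# (experiment, group) -> (mean, median, sample standard deviation)
summary = {
    (exp, group): (mean(values), median(values), stdev(values))
    for exp, groups in data.items()
    for group, values in groups.items()
}

for (exp, group), (m, med, sd) in summary.items():
    # A mean far from the median would hint at an outlier or a skewed group.
    print(f"exp {exp} {group:>2}: mean={m:6.1f} median={med:6.1f} SD={sd:5.1f}")
```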

Tuesday, 5 March 2013

A Typical Example from the Literature

A classical description of the statistical analysis in manuscripts dealing with in vitro toxicology reads: “Data are shown as mean (SD or SEM) of at least n independent experiments. Each experiment was conducted in n-replicates. P<0.05 was considered significant performing t-test or ANOVA with a given post-hoc test”. This statement, however, says nothing about how the data are actually treated. I will show several different ways, each with its advantages and disadvantages, of taking (or not taking) into account both the number of experiments and the number of replicates.

Sunday, 3 March 2013

A brief comment about the use of parametric or non-parametric statistics

The general rule is that non-parametric statistics should be applied in the presence of important deviations from normality or log-normality, and almost all statistical software packages include specific tests to assess this (e.g. Kolmogorov-Smirnov, Shapiro-Wilk, etc.). This general rule is undoubtedly valid for clinical and epidemiological data with large sample sizes. It should be kept in mind that, whereas parametric statistics allow several types of multivariate analysis (e.g. n-way ANOVA, ANCOVA, mixed-effects models, MANOVA), non-parametric statistics are more limited, and non-parametric multivariate models are not widely accepted. Only a few programs offer such an extension.

What about in vitro toxicology, or in vivo toxicology with animals? The number of replicates for each experimental condition is generally limited: although an a priori sample size calculation is rarely performed and no gold standard is generally accepted, it is common to have fewer than 10-15 replicates per condition per experiment. With such low n, a normality test can hardly be considered fully reliable. Furthermore, rank tests applied to small samples have little power and strongly underestimate the significance of the differences, with a potentially important effect on the beta error (i.e. we do not see a significant difference although it really exists).

Therefore, in my opinion, parametric statistics are preferable in these cases, with two exceptions:

1) The mean (or geometric mean) and the median are completely different due to the presence of outliers. If there is no reason to correct the data, non-parametric statistics are preferable here, even with low n. It should be noted that with animals the inter-subject variability may produce outliers, while with in vitro models the phenomenon is better controlled, although not completely eliminated.

2) A percentage of the measurements is below the limit of detection (LOD) of the measurement method. Since all these values receive the lowest rank regardless of the value assigned to them (0, LOD, LOD/2, LOD/sqrt(2)), non-parametric statistics are not influenced by that assignment.

Based on these considerations, we will use parametric statistics for our further calculations and examples, as the generally preferred approach for in vitro methods with more than one experiment and relatively few replicates.
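Two of the points above can be illustrated with a short Python sketch. It is an example under stated assumptions, not part of the original analysis: the helper names `min_two_sided_p` and `midranks`, the LOD of 10 and the measured values are all hypothetical. The first function uses the fact that, for an exact two-group rank-sum test without ties, the smallest attainable two-sided p-value is 2 divided by the number of ways to split the pooled observations into the two groups. The second shows that the ranks do not change with the value substituted for measurements below the LOD.

```python
from math import comb, sqrt


def min_two_sided_p(n1, n2):
    """Smallest exact two-sided p-value of a two-group rank-sum test (no ties)."""
    return min(1.0, 2 / comb(n1 + n2, n1))


def midranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(values)
    return [order.index(v) + 1 + (order.count(v) - 1) / 2 for v in values]


# With 3 replicates per group a rank test can never reach p < 0.05.
for n in (3, 4, 5):
    print(f"n = {n} per group: minimum two-sided p = {min_two_sided_p(n, n):.4f}")

# Hypothetical measurements; None marks a value below the (assumed) LOD of 10.
LOD = 10.0
measured = [None, 14.2, None, 11.5, 19.0]

rank_sets = []
for substitute in (0.0, LOD / 2, LOD / sqrt(2)):
    filled = [substitute if v is None else v for v in measured]
    rank_sets.append(midranks(filled))
    print(f"substitute {substitute:6.3f} -> ranks {rank_sets[-1]}")
```

The ranks are identical for every substitution because all censored values stay below the detected ones and are tied with each other. Substituting the LOD itself gives the same ranks as long as no detected value is exactly equal to the LOD.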