Tuesday, 23 July 2013

TOPIC 2: Interaction between/among substances in combined experiments


In the coming months, I will cover the mathematical and statistical models used to study the interactions between/among substances in combined experiments. In other words, I will introduce the methods for treating the data when two or more toxic compounds are used to expose cellular/animal models.

I will revisit some concepts that I have already published, but I will revise everything in light of some concerns that scientists have raised about my approach. I will give you all the references necessary for a full comprehension of the topic.

This topic is very complex, and therefore my blog will require some time…

Stay tuned!!

Wednesday, 3 July 2013

To SUM UP

First of all, we cannot choose our best test exclusively on the basis of significance (e.g. best = most significant). The reason is that we can make two types of errors (called type I and type II): we may see significance when the observed effect is actually not significant, or we may fail to see significance when the observed effect actually is significant. Therefore, we should choose our test based on its reasonableness and statistical correctness:

1) Semi-clinical approach (12-APR): its use makes sense if each experiment represents a different subject, and replicates are only a measure of intra-subject variability. The test does not consider the replicates, only the mean values. For this reason, it tends to underestimate the significance of the differences and makes the replicates a secondary point;

2) Simple approach (22-APR): it may be acceptable with few experiments, as it considers all the replicates together, without taking the experiments into account. It may be influenced by outlier experiments, which can inflate the sample variance;

3) The experiment as random factor (9-MAY): it is a very elegant method to take into account the inter-experiment variability, and with few experiments involving the same cells and a couple of replicates it appears a GOOD CHOICE; with several experiments, its interpretation may be difficult and it may induce statistical artifacts;

4) An approximated method (13-MAY): it is not the best choice, as it tends to underestimate the significance, gives no weight to the experiments, and is statistically questionable. However, it is very simple to perform, with few simple calculations;

5) Weighted means PART 1 (5-JUN): it is a good choice, particularly with a number of experiments (e.g. >3-5). Its generalization leads to Weighted means PART 2 (6-JUN), based on the method proposed by Bland and Kerry. Attention should be given to the degrees of freedom used for statistical analysis; the use of n-2 df for each experiment appears a sound choice. The method works with both few and several experiments, but it appears a GOOD CHOICE with k>3-5 experiments.

Although in the peer-review process these aspects are often not considered by the reviewers, and a study is often accepted without these details, I RECOMMEND indicating how both experiments and replicates were handled in the statistical analysis. It is an appreciable benefit.

Finally, another important consideration: without multiple-comparison corrections, using Student's t/Mann-Whitney tests to compare pairs of data with >2 groups IS INCORRECT! TAKE CARE! In some cases, it can lead to the article's rejection!!
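Since this point about multiple comparisons recurs throughout these posts, here is a minimal Python sketch of the Bonferroni correction (the helper names are mine; the p-values are those reported in the approximated-method post below):

```python
# Minimal sketch of a Bonferroni correction: with m pairwise
# comparisons, each test is judged against alpha/m instead of alpha.

def bonferroni_threshold(alpha, m):
    """Adjusted per-comparison significance threshold."""
    return alpha / m

def significant(p_values, alpha=0.05):
    """Flag which raw p-values survive the Bonferroni correction."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p < threshold for p in p_values]

# Example: four exposed conditions compared against the control.
p_values = [0.0253, 0.0003, 0.0001, 0.0001]
print(bonferroni_threshold(0.05, 4))  # 0.0125
print(significant(p_values))          # [False, True, True, True]
```

With four comparisons against the control, the per-comparison threshold drops to 0.05/4 = 0.0125, so a raw p of 0.0253 no longer counts as significant.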

Wednesday, 12 June 2013

A little more about Degrees of Freedom


From my posts, it can be argued that the correct choice of the number of degrees of freedom (df) is fundamental, as it determines the significance of the results. It is crucial when we calculate the weighted mean. In my previous post, when we applied the Bland and Kerry procedure, I said: “The most immediate thought is k-1 df, which is 2 in our case. However, we should take into account that each experiment has n samples (n-1 df). Therefore, in my opinion the correct way to account for the df is: df=(k-1)*(n-1) if n is the same in all the experiments”.

It is logically correct, but with an important limitation: what about the df if n varies across experiments? Can we find the df in a more general way, starting from the formulae we used?

Consider this simple case: we have an experiment with three replicates (1,2,4) and another with three replicates (2,3,4). Exp1: Mean=2.33 (SD: 1.53) – Exp2: Mean=3 (SD: 1).

Let’s calculate the weighted mean.

W1=3/1.53²=3/2.34=1.28; W2=3/1²=3 =>

Weighted mean = (1.28*2.33 + 3*3)/(1.28+3) = 2.80

How can I change my data without varying the weighted mean? How many constraints do we have?

Within each experiment, I can change the replicates while keeping the mean fixed (n-1 free values, i.e. two per experiment), but this change influences the SD we use in the formula! For example:

To maintain 2.33 as the mean in Exp1, I have to change at least two replicates, e.g. 0.5, 2.5, 4. However, now SD=1.76 and the weighted mean also changes:

(0.99*2.33 + 3*3)/(0.99+3)=2.83 =>

To keep the weighted mean unaltered, I also have to maintain the same SD in each experiment, and I can do that only by varying all three replicates => IN EACH EXPERIMENT, we have two constraints: mean and SD => each experiment has n-2 df.
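The arithmetic above can be checked with a short Python sketch (the helper name is mine; the weights are W = n/SD², i.e. 1/SE², as defined in the weighted-means posts):

```python
from statistics import mean, stdev

def weighted_mean(experiments):
    """Weighted mean of experiment means, with weights W = n/SD^2 (i.e. 1/SE^2)."""
    weights = [len(x) / stdev(x) ** 2 for x in experiments]
    means = [mean(x) for x in experiments]
    return sum(w * m for w, m in zip(weights, means)) / sum(weights)

exp1, exp2 = [1, 2, 4], [2, 3, 4]
print(round(weighted_mean([exp1, exp2]), 2))  # 2.8

# Same mean in Exp1 but a different spread changes the weight,
# and therefore the weighted mean:
print(round(weighted_mean([[0.5, 2.5, 4], exp2]), 2))  # 2.84 (2.83 in the post, which rounds intermediates)
```

This is exactly the invariance argument: the weighted mean stays fixed only if both the mean and the SD of each experiment stay fixed.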

To generalize, if I have k experiments and experiment i has ni replicates, the total number of df is:

df = Σ(i=1 to k) (ni-2)   (MATHEMATIC WAY)

In the example of the previous post (k=3 experiments, each with 5 replicates), df=(5-2)+(5-2)+(5-2)=9 instead of the 8 calculated with the “LOGIC METHOD”. The significance of the difference would be 0.0001 instead of 0.0002.

It should be noted that the LOGIC METHOD yields more df than the MATHEMATIC one when k is high and n is low, and the contrary when k is low and n is high, although with several df (e.g. >20-30 per condition) the significances are only slightly influenced by this choice.

WHICH is the BEST? The MATHEMATIC method starts from the formula actually used to calculate the weighted mean and is more general, and therefore it is statistically more rigorous.
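The two ways of counting df can be sketched as follows (hypothetical helper names, not from the post):

```python
def df_mathematic(n_list):
    """MATHEMATIC WAY: sum of (n_i - 2) over the k experiments."""
    return sum(n - 2 for n in n_list)

def df_logic(k, n):
    """LOGIC METHOD: (k - 1) * (n - 1), valid only when n is equal everywhere."""
    return (k - 1) * (n - 1)

# k = 3 experiments with 5 replicates each (the previous post's example):
print(df_mathematic([5, 5, 5]))  # 9
print(df_logic(3, 5))            # 8

# With many experiments and few replicates, the LOGIC count is larger:
print(df_mathematic([3] * 10), df_logic(10, 3))  # 10 18
```

Note that only the MATHEMATIC count accepts a different n per experiment, which is the whole point of the generalization.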

Thursday, 6 June 2013

Weighted Means: Part 2


Let’s look at the Bland and Kerry procedure, given that we have calculated the weighted mean as reported in the previous post. We will see the direct application of the procedure, without theoretical details, which can be read directly in the bibliographic reference.

Suppose that we have the same data recalled in the previous post. We use only the difference between Controls and C1, which has the lowest significance.

The weighted mean of the controls is 69.8; that of C1 is 89.6.

Firstly, we calculate a term called the “weighted sum of observations squared” (s²), which depends on the number of experiments (k):

s² = k*(Σ(i=1 to k) Xi²*Wi)/(Σ(i=1 to k) Wi)

In our case:

s²(C)=3*(62²*0.0682+72.6²*0.0693+83.6²*0.0243)/(0.0682+0.0693+0.0243)=14782.25

s²(C1)=3*(85.6²*0.0266+86.8²*0.109+99.6²*0.0425)/(0.0266+0.109+0.0425)=24218.09

Then, we calculate the correction term, Corr, defined simply as:

Corr = k*(Xweighted)²

Corr(C)=3*69.8²=14616.12 and Corr(C1)=3*89.6²=24084.48

Then, we define the “sum of squares about the mean” (S²) as:

S² = s² - Corr =>

S²(C)=166.13 and S²(C1)=133.61

The weighted estimate of the SD of each condition is simply:

SDbest = sqrt[S²/(k-1)] =>

SDbest (C) = 9.11 and SDbest(C1) = 8.17
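The whole chain of calculations above can be reproduced with a short Python sketch (the helper name is mine). Note that S² is a small difference between large numbers, so the result is sensitive to rounding: with full precision the controls give SDbest ≈ 9.3, whereas the 9.11 above comes from squaring the weighted mean after rounding it to 69.8:

```python
from math import sqrt

def bland_kerry_sd(means, weights):
    """Bland and Kerry weighted SD: from the weighted sum of squared
    observations (s2), the correction term, and the sum of squares
    about the mean (S2), return the weighted mean and sqrt(S2/(k-1))."""
    k = len(means)
    w_sum = sum(weights)
    x_w = sum(m * w for m, w in zip(means, weights)) / w_sum  # weighted mean
    s2 = k * sum(m ** 2 * w for m, w in zip(means, weights)) / w_sum
    corr = k * x_w ** 2
    S2 = s2 - corr
    return x_w, sqrt(S2 / (k - 1))

# Controls: experiment means and weights from the previous post.
x_w, sd_best = bland_kerry_sd([62, 72.6, 83.6], [0.0682, 0.0693, 0.0243])
print(round(x_w, 1))  # 69.8
print(sd_best)        # ~9.3 with full precision (9.11 above, with rounding)
```

The lesson is practical: carry full precision through s², Corr, and S², and round only the final SDbest.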

Note that in this case there are some comments to make:

1) The values are independent of the number of replicates, but depend on the SD values of the single experiments, since this is a weighted mean of the SDs;

2) The ratio between the SDs calculated here and those calculated with the previously presented method is approximately sqrt(5) [with some rounding differences];

3) How many degrees of freedom do we have? The most immediate thought is k-1 df, which is 2 in our case. However, we should take into account that each experiment has n samples (n-1 df). Therefore, in my opinion the correct way to account for the df is: df=(k-1)*(n-1) if n is the same in all the experiments. In our case, df=8. Why the multiplication? Because we want each experiment to contribute a constraint, so we introduce an AND among the assumptions on our experiments, and the probabilities multiply. This also addresses point 1.

Therefore, we have to compare C=69.8 (SD: 9.11) and C1=89.6 (SD: 8.17) with 8 df per group, which gives a significance of p=0.0002, lower than that obtained with the calculations reported in my previous post. On the contrary, if we hold that n-1 should not be counted in the df, we may use only k-1 df (2 in this case), with a much lower significance (p=0.049). In that case, the number of experiments should be well above 3 to control the beta error.

In the next post, we will summarize all the results.

Wednesday, 5 June 2013

Weighted Means: Part 1


The general idea is that each experiment, with n replicates that can vary from experiment to experiment, is performed in exactly the same way, but, as already said, random factors may influence it (the researcher’s state of mind, a different hand performing it, atmospheric conditions, etc.). However, differently from the case in which we used the experiment as a random factor, here I consider the experiment, and not the replicate, as the statistical unit. Therefore, the statistical power increases with the number of performed experiments (I suggest at least 5, although we will apply the method to our data from three experiments).

Suppose we have k experiments, and each experiment has a number of replicates that may vary (n1,n2,…,nk). Each experiment has its own mean (X1,…,Xk) and its own standard deviation (s1,…,sk), and therefore its own standard error (SE), calculated as SE1=s1/sqrt(n1),…, SEk=sk/sqrt(nk). When we calculate the weighted mean, we want to “weight” the general mean of means by the SE of each experiment, giving more weight to the experiments with the lowest SE (lower SD, higher n, or both). Therefore, we can calculate a weight for each experiment:

Wi=1/(SEi)^2, i=1,…,k

And use this weight in the calculation of the mean of means:

Xbest = (Σ(i=1 to k) Xi*Wi)/(Σ(i=1 to k) Wi)

We can also calculate the general SE starting from the standard errors of the experiments:

SE = 1/sqrt(Σ(i=1 to k) Wi)

Let’s look at our example (not normalized data, with n=5 replicates for each experiment):

 
Controls:
SE1=8.57/sqrt(5)=3.83; SE2=8.50/sqrt(5)=3.80; SE3=14.33/sqrt(5)=6.41 =>
W1=1/SE1^2=0.0682; W2=1/SE2^2=0.0693; W3=1/SE3^2=0.0243 =>
Xbest=(62*0.0682+72.6*0.0693+83.6*0.0243)/(0.0682+0.0693+0.0243)=11.29/0.1618=69.8
The value differs from the crude mean of the three means (72.7), as the third experiment has higher variability and therefore less weight in the weighted mean.
Let’s calculate SE:
1/sqrt(0.0682+0.0693+0.0243)=1/0.402=2.488
Note that the statistical unit is the experiment (k=3), and therefore SD=SE*sqrt(k)=2.488*sqrt(3)=4.309
Again, a value completely different from the mean of the SDs of the experiments (10.47) =>
We conclude that controls have a mean of 69.8 (SD: 4.31) with 2 (k-1) degrees of freedom (df) for each condition.
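The control calculation above can be condensed into a short Python sketch (the helper name is mine; means and SDs are those of the three control experiments):

```python
from math import sqrt

def weighted_summary(means, sds, n):
    """Weighted mean, its SE, and the SD on the scale of experiments
    (SD = SE * sqrt(k)), with weights W = 1/SE^2 = n/SD^2."""
    k = len(means)
    weights = [n / sd ** 2 for sd in sds]
    w_sum = sum(weights)
    x_best = sum(m * w for m, w in zip(means, weights)) / w_sum
    se = 1 / sqrt(w_sum)
    return x_best, se, se * sqrt(k)

x_best, se, sd = weighted_summary([62, 72.6, 83.6], [8.57, 8.50, 14.33], n=5)
print(round(x_best, 1), round(se, 3), round(sd, 2))  # 69.8 2.487 4.31
```

The same call with the means and SDs of each exposed condition yields the C1-C4 values listed below.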
Other conditions:
C1=89.6 (SD: 4.08)
C2=108.8 (SD: 3.77)
C3=131.8 (SD: 4.10)
C4=139.8 (SD: 4.20)
On this data, we may apply ANOVA and post hoc tests.
IMPORTANT: one may think that the method depends only on the number of experiments, and not on the replicates. Not true. The number of replicates determines the SD of the weighted means (high n values strongly reduce the SD), and therefore the model TAKES INTO ACCOUNT BOTH OF THEM.
The comparisons (Student's t with Bonferroni's correction, significance at p=0.0125):
C1 vs C p=0.0045
C2 vs C p<0.001
C3 vs C p<0.001
C4 vs C p<0.001
Substantially in line with what we found when we used the experiment as a random factor or put all the experiments together, indicating that the method is efficient even with few experiments.
Another possible calculation of the SD: it has no solid statistical basis, but may be logically reasonable. As we calculated a weighted mean, we can calculate a weighted SD as:
SDbest = (Σ(i=1 to k) SDi*Wi)/(Σ(i=1 to k) Wi) = (Σ(i=1 to k) ni/SDi)/(Σ(i=1 to k) Wi)
SD of controls=(5/8.57 + 5/8.50 + 5/14.33)/0.1618 = 9.39
SD of C1 =(5/13.07+5/6.76+5/10.85)/0.181 = 8.75
Therefore, we have to compare C=69.8 (SD: 9.39) and C1=89.6 (SD: 8.75); but in this case, since the weighted mean (SD) is a sort of “gold standard” experiment (see the approximated method), the number of df is n-1=5-1=4 for each condition.
Making the comparison: p=0.0087, again significant but less so than in the previous case, despite n>k.
NOTE: in this last case, each experiment should have the same n to define the df univocally.
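A minimal sketch of this alternative weighted SD (the helper name is mine; with full precision the values come out a few hundredths away from the rounded figures above):

```python
def weighted_sd(sds, n):
    """'Weighted SD': sum of SD_i * W_i over sum of W_i, which with
    W_i = n / SD_i^2 reduces to (sum of n/SD_i) / (sum of W_i)."""
    weights = [n / sd ** 2 for sd in sds]
    return sum(n / sd for sd in sds) / sum(weights)

print(round(weighted_sd([8.57, 8.50, 14.33], 5), 2))   # controls: ~9.41 (9.39 above, rounded weights)
print(round(weighted_sd([13.07, 6.76, 10.85], 5), 2))  # C1: ~8.74 (8.75 above)
```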
Next time we will see the method by Bland and Kerry to obtain a weighted estimate of the SD, comparing the results with those found today.


Monday, 27 May 2013

The Weighted Means: Essential Bibliography

In the next weeks I will treat the last and most complicated method to analyze the data: the weighted mean. The general assumption is that a relatively high number of experiments will be performed (my suggestion: k>5, although this is not so restrictive), and so the method is complementary to those treated so far (simple approach, experiment as a random factor).

I have partially treated the argument in:

Goldoni M, Tagliaferri S. Dose-response or dose-effect curves in in vitro experiments and their use to study combined effects of neurotoxicants. Methods Mol Biol. 2011;758:415-34. doi: 10.1007/978-1-61779-170-3_28. Review. PubMed PMID: 21815082  

But I will re-treat the argument step by step to fully explain all the passages. Most of them are covered in this essential bibliography:

1) JR Taylor. An Introduction to Error Analysis: The Study of Uncertainties in Physical Measurements. University Science Books, 1997.

I will adapt the passages proposed for physical measurements to toxicology (the relevant chapter is “Weighted Means”, Chapter 7 in the 2nd edition).

2) Bland, J. M., and Kerry, S. M. (1998) Weighted comparison of means, BMJ 316, 129.

This is a very brief but interesting paper from the clinical literature: http://www.bmj.com/content/316/7125/129

Good Luck!!

Monday, 13 May 2013

An Approximated Method


This method is quite simple to perform, but it does not take into account the number of experiments and it is statistically questionable. It may be used in all conditions and presents some similarities with the semi-clinical approach (see one of my previous posts).

If we have k experiments with n replicates (n the same for all the experiments), we may calculate the best possible experiment by giving all the experiments the same weight:

-the mean is the mean of the k experiment means;

-the SD is the mean of the SDs of the k experiments.

In this way, we extrapolate a sort of “mean experiment”, where the number of degrees of freedom (df) remains n-1 for each condition. This method is “dangerous” if we have a low number of replicates and several experiments, and it is statistically questionable because it does not take into account all the possible df of our system and gives all the experiments the same weight.
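A minimal sketch of this “mean experiment”, assuming the control means and SDs that appear in the weighted-means posts (the helper name is mine):

```python
from statistics import mean

def mean_experiment(means, sds):
    """'Mean experiment': average the k experiment means and the k SDs,
    giving every experiment the same weight; df stays n - 1."""
    return mean(means), mean(sds)

m, sd = mean_experiment([62, 72.6, 83.6], [8.57, 8.50, 14.33])
print(round(m, 2), round(sd, 2))  # 72.73 10.47
```

These two numbers, one pair per condition, are what gets fed to the ANOVA or t-tests below.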

Let’s look at the data (crude values):

 
 
[Table of crude values: mean (SD) for each condition across the three experiments; the last line is the “mean experiment”.] The last line represents the values we have to compare, considering n=5 replicates for each condition. Most statistical software has no general option to compare means (SD) with ANOVA knowing only those values and n, but some websites perform such a calculation. See for example:
 

Alternatively, some software can perform Student's t-tests. We can assess the differences between pairs of conditions with them, but we have to apply Bonferroni's correction to the p value (the true significance level is 0.05/m, where m is the number of multiple comparisons performed). The results:
 

To perform the comparisons between the exposed conditions and the control (software OPENSTAT, available at http://www.statprograms4u.com/, or the previous site with pairs of columns):
0 vs 1 – p=0.0253
0 vs 2 – p=0.0003
0 vs 3 – p<0.0001
0 vs 4 – p<0.0001
Considering a significance level of p=0.05/4=0.0125, condition 1 is the only non-significant condition, and therefore the results are intermediate between A SIMPLE APPROACH with and without random factors (all significant) and the SEMI-CLINICAL approach (no significances). However, this method, although very simplified and questionable, fairly controls the beta error.