Appendix for DataColada [39]
Derivation that if the test-retest correlation for a dependent variable is r < .5, subtracting the baseline lowers power.
By Uri Simonsohn
June 17, 2015
Let's consider a two-cell design, treatment vs. control, with dependent variable y.
Let y2t and y2c be the means for treatment and control, respectively, in the after period.
Let y1t and y1c be the means for treatment and control, respectively, in the before period.
The between-subjects difference:
(1) B = y2t - y2c
The mixed-design test subtracts the baseline difference:
(2) M = (y2t - y2c) - (y1t - y1c)
The expected difference is the same, E(B) = E(M), because with random assignment E(y1t - y1c) = 0.
This makes sense: we don't expect differences at baseline, so we expect the same estimate from B or M.
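As an informal check that B and M target the same quantity, here is a minimal simulation sketch (not from the original post; the per-cell sample size, true effect, and test-retest correlation are illustrative values):

```python
# Sketch: simulate many two-cell before/after experiments and compare
# the average of the B and M estimates (illustrative parameters).
import numpy as np

rng = np.random.default_rng(0)
n, effect, r = 50, 0.5, 0.3          # per-cell n, true effect, test-retest r
cov = [[1, r], [r, 1]]               # before/after covariance within a person

B, M = [], []
for _ in range(20_000):
    ctrl = rng.multivariate_normal([0, 0], cov, n)    # rows: (before, after)
    treat = rng.multivariate_normal([0, 0], cov, n)
    treat[:, 1] += effect            # treatment shifts the after score only
    y1c, y2c = ctrl.mean(axis=0)
    y1t, y2t = treat.mean(axis=0)
    B.append(y2t - y2c)                         # between estimate, eq. (1)
    M.append((y2t - y2c) - (y1t - y1c))         # mixed estimate, eq. (2)

print(np.mean(B), np.mean(M))        # both ~0.5: E(B) = E(M)
```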
How about the standard error of B and M?
Let's make things easy. Assume all variances are the same:
(3) VAR(y2t) = VAR(y1t) = VAR(y2c) = VAR(y1c) = V
(4) COV(y2t, y1t) = COV(y2c, y1c) = C
(Note: because of random assignment, COV(y2c, y2t) = COV(y1c, y1t) = 0.)
Recall the high-school formula for the variance of a difference of random variables:
(5) VAR(a - b) = VAR(a) + VAR(b) - 2COV(a, b)
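Equation (5) is easy to verify numerically; a quick sketch with made-up variances and covariance (ddof=1 throughout so that var and cov use the same denominator):

```python
# Numerical check of (5) on simulated correlated variables (illustrative values).
import numpy as np

rng = np.random.default_rng(1)
a, b = rng.multivariate_normal([0, 0], [[2.0, 0.8], [0.8, 1.5]], 100_000).T

lhs = np.var(a - b, ddof=1)
rhs = np.var(a, ddof=1) + np.var(b, ddof=1) - 2 * np.cov(a, b)[0, 1]
print(lhs, rhs)   # the two numbers agree up to sampling noise
```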
We want to compute the variance of the B (between) and M (mixed-design) estimates. Applying (5), and keeping in mind that the only nonzero covariances are the two before/after covariances from (4):
VAR(B) = VAR(y2t - y2c) = V + V = 2V
VAR(M) = VAR[(y2t - y2c) - (y1t - y1c)] = 4V - 2COV(y2t, y1t) - 2COV(y2c, y1c) = 4V - 4C
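Both variance expressions can also be checked by simulation. In the sketch below (illustrative parameters, not from the post), subjects have unit variance and test-retest correlation r, so the cell means have V = 1/n and C = r/n:

```python
# Sketch checking VAR(B) = 2V and VAR(M) = 4V - 4C across simulated experiments.
import numpy as np

rng = np.random.default_rng(2)
n, r = 50, 0.3
cov = [[1, r], [r, 1]]

B, M = [], []
for _ in range(50_000):
    ctrl = rng.multivariate_normal([0, 0], cov, n)
    treat = rng.multivariate_normal([0, 0], cov, n)
    y1c, y2c = ctrl.mean(axis=0)
    y1t, y2t = treat.mean(axis=0)
    B.append(y2t - y2c)
    M.append((y2t - y2c) - (y1t - y1c))

V, C = 1 / n, r / n
print(np.var(B), 2 * V)          # ~0.040 vs 0.040
print(np.var(M), 4 * V - 4 * C)  # ~0.056 vs 0.056
```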
The mixed and between-subjects designs have the same sample size and the same effect size, hence the mixed design has more power iff its variance is smaller than the between design's.
For VAR(B) > VAR(M) we need
2V > 4V - 4C
which occurs if
4C > 2V
which occurs if
C/V > 1/2
C/V, the covariance divided by the variance, is the test-retest correlation r (with all variances equal to V, the correlation C/sqrt(V*V) reduces to C/V), so:
The mixed design has a smaller variance, and hence greater power, iff r > .5.
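The crossover at r = .5 can be illustrated with a small power simulation; a sketch under assumed parameters (n, effect size, and alpha are arbitrary choices), using an independent-samples t-test on the after scores for the between design and on the after-minus-before change scores for the mixed design:

```python
# Sketch of the power implication: below r = .5 the between test is more
# powerful; above r = .5 the mixed (baseline-subtracted) test is.
import numpy as np
from scipy import stats

def power(r, n=50, effect=0.4, sims=5_000, alpha=0.05, seed=0):
    rng = np.random.default_rng(seed)
    cov = [[1, r], [r, 1]]
    rej_b = rej_m = 0
    for _ in range(sims):
        ctrl = rng.multivariate_normal([0, 0], cov, n)
        treat = rng.multivariate_normal([0, 0], cov, n)
        treat[:, 1] += effect
        # Between: t-test on the after scores only
        p_b = stats.ttest_ind(treat[:, 1], ctrl[:, 1]).pvalue
        # Mixed: t-test on the after-minus-before change scores
        p_m = stats.ttest_ind(treat[:, 1] - treat[:, 0],
                              ctrl[:, 1] - ctrl[:, 0]).pvalue
        rej_b += p_b < alpha
        rej_m += p_m < alpha
    return rej_b / sims, rej_m / sims

for r in (0.2, 0.5, 0.8):
    print(r, power(r))   # mixed overtakes between only once r exceeds .5
```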