Appendix for DataColada[39].
Derivation that if test-retest correlation for a dependent variable is r<.5, subtracting baseline lowers power.
By Uri Simonsohn
June 17, 2015
Let’s consider a two-cell design, treatment vs. control, with dependent variable y.
Let y2t and y2c be the means for treatment and control, respectively, in the after period.
Let y1t and y1c be the means for treatment and control, respectively, in the before period.
The between-subject difference is:
(1) B = y2t - y2c
The mixed-design test subtracts the baseline difference:
(2) M = y2t - y2c - (y1t - y1c)
The expected difference is the same, E(B)=E(M), because with random assignment we have E(y1t - y1c)=0.
This makes sense: we don’t expect differences at baseline, so we expect the same estimate with B or M.
How about the standard error of B and M?
Let’s make things easy. Assume all variances are the same, and that the before-after covariance is the same in both cells:
(3) VAR(y2t)=VAR(y1t)=VAR(y2c)=VAR(y1c)=V
(4) COV(y2t, y1t)=COV(y2c, y1c)=C
(note: because of random assignment, COV(y2c, y2t)=COV(y1c, y1t)=0)
Recall the high-school formula for the variance of the difference of two random variables:
(5) VAR(a-b)=VAR(a)+VAR(b)-2COV(a,b)
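We want to compute the variance of the B (between) and M (mixed design) estimates.
For B, apply (5) with a=y2t and b=y2c, whose covariance is 0:
VAR(B) = VAR(y2t - y2c)
= V + V
= 2V
For M, group the terms by cell. The two cells contain different subjects, so the treatment change (y2t - y1t) and the control change (y2c - y1c) are uncorrelated, and (5) applies within each cell:
VAR(M) = VAR[y2t - y2c - (y1t - y1c)]
= VAR(y2t - y1t) + VAR(y2c - y1c)
= (2V - 2C) + (2V - 2C)
= 4V - 4C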
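As a check on the algebra, the two variances can be computed numerically from the covariance matrix of the four cell means. This is only a sketch: the helper names `cov_of_means` and `contrast_variance` are made up for illustration, the values V=1 and C=0.3 are arbitrary, and it assumes that every covariance between a treatment mean and a control mean is 0 (the note above states this for the same-period pairs; the sketch extends it to the cross-period pairs because the cells contain different subjects).

```python
def cov_of_means(V, C):
    """Covariance matrix of the four cell means, ordered y2t, y2c, y1t, y1c.

    Assumption: any treatment mean is uncorrelated with any control mean.
    """
    return [
        [V, 0, C, 0],  # y2t
        [0, V, 0, C],  # y2c
        [C, 0, V, 0],  # y1t
        [0, C, 0, V],  # y1c
    ]


def contrast_variance(weights, cov):
    """Variance of sum_i weights[i] * x_i for variables with covariance matrix cov."""
    k = len(weights)
    return sum(
        weights[i] * weights[j] * cov[i][j]
        for i in range(k)
        for j in range(k)
    )


V, C = 1.0, 0.3  # arbitrary illustrative values
cov = cov_of_means(V, C)

# B = y2t - y2c
var_b = contrast_variance([1, -1, 0, 0], cov)
# M = y2t - y2c - (y1t - y1c)
var_m = contrast_variance([1, -1, -1, 1], cov)

print(var_b, 2 * V)          # both should agree with 2V
print(var_m, 4 * V - 4 * C)  # both should agree with 4V - 4C, up to rounding
```

Changing C while holding V fixed shows VAR(M) falling below VAR(B) once C exceeds V/2.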
The Mixed and Between-subject designs have the same sample size and the same effect size, hence Mixed has more power iff its variance is smaller than Between’s.
For VAR(B)>VAR(M) we need
2V>4V-4C
Which occurs if
4C>2V
Which occurs if
C/V>1/2
C/V, the covariance over the variance, is the correlation r (because all the variances are equal), so:
The Mixed design has a smaller variance, and hence greater power, iff r>.5
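The r>.5 threshold can also be illustrated by simulation. The sketch below is not part of the original derivation: the function name `simulate_variances`, the cell size of 20, the 2,000 replications, the unit-variance normal scores, and the two correlations tried (.3 and .7) are all assumptions chosen for illustration. It simulates many two-cell experiments with no true effect (the effect size does not change the variances) and estimates VAR(B) and VAR(M) across them.

```python
import random
import statistics


def simulate_variances(r, n_per_cell=20, n_reps=2000, seed=0):
    """Estimate (VAR(B), VAR(M)) across simulated two-cell experiments.

    Each subject has a before score y1 and an after score y2, both with
    variance 1 and test-retest correlation r.
    """
    rng = random.Random(seed)
    scale = (1 - r ** 2) ** 0.5

    def cell_means():
        # Returns (mean of y1, mean of y2) for one cell of n_per_cell subjects.
        y1_sum = y2_sum = 0.0
        for _ in range(n_per_cell):
            z1 = rng.gauss(0, 1)
            z2 = rng.gauss(0, 1)
            y1_sum += z1
            y2_sum += r * z1 + scale * z2  # correlated r with y1, variance 1
        return y1_sum / n_per_cell, y2_sum / n_per_cell

    b_estimates, m_estimates = [], []
    for _ in range(n_reps):
        y1t, y2t = cell_means()  # treatment cell
        y1c, y2c = cell_means()  # control cell
        b = y2t - y2c
        b_estimates.append(b)
        m_estimates.append(b - (y1t - y1c))
    return statistics.variance(b_estimates), statistics.variance(m_estimates)


for r in (0.3, 0.7):
    var_b, var_m = simulate_variances(r)
    print(f"r={r}: VAR(B)={var_b:.4f}  VAR(M)={var_m:.4f}")
```

Under these assumptions the derivation predicts VAR(B)=2/n and VAR(M)=(4-4r)/n with n=20, so the simulated VAR(M) should come out larger than VAR(B) at r=.3 and smaller at r=.7.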