Step-by-step 2 lines

Colada[27]: Step-by-step instructions to fit two lines to an observed u-shape

To fit two straight lines we need to identify the point where one line ends and the other begins.
In the post we consider doing so with the same quadratic regression that is commonly used to test for u-shapes. We provide instructions below.
There are other options for identifying the interruption point that do not involve the quadratic and may have some advantages; we do not consider them here.

Six Steps
Step 1. Plot the raw data.
Step 2. Estimate the quadratic regression y=a*x+b*x^2 (plus an intercept).
Step 3. Verify the results imply a u-shape within the range of observed data (e.g., plot the quadratic)
Step 4. If a u-shape is seen, identify the value of x where the u-shape reaches its maximum (or its minimum, if the u is not inverted). Call that point xmax. Setting the derivative of the quadratic, a+2*b*x, to zero gives

xmax = -a/(2*b)
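
Steps 2 through 4 can be sketched in R roughly as follows. This is a minimal illustration rather than part of the original instructions: the helper name quad_turning_point and the simulated data are assumptions made for the example.

```r
# Hypothetical helper: fit y = intercept + a*x + b*x^2 and locate the turning point
quad_turning_point <- function(x, y) {
  fit <- lm(y ~ x + I(x^2))
  a <- unname(coef(fit)[2])   # linear coefficient
  b <- unname(coef(fit)[3])   # quadratic coefficient
  xmax <- -a / (2 * b)        # where the derivative a + 2*b*x equals zero
  list(a = a, b = b, xmax = xmax,
       in_range = xmax > min(x) && xmax < max(x))
}

# Simulated data (assumed for illustration): inverted u-shape peaking near x = 0.5
set.seed(1)
x <- runif(300)
y <- x - x^2 + rnorm(300, sd = 0.05)

tp <- quad_turning_point(x, y)
tp$xmax      # estimated turning point
tp$in_range  # TRUE when the turning point lies inside the observed range of x
```

If in_range is FALSE, the quadratic does not imply a u-shape within the observed data, and Step 3 fails.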

Step 5. Create new variables to allow an interrupted regression.
We will have a variable for low values of x (xlow), one for high values of x (xhigh) and a new dummy (high).

xlow =x-xmax if x<=xmax, 0 otherwise
xhigh=x-xmax if x>xmax, 0 otherwise
high=1 if x>xmax, 0 otherwise.

That sounds more complicated than it is. Imagine x=1,2,3,4,5,6,7,8,9,10 and xmax=5; the data would look like this:

x	xlow	xhigh	high
1	-4	0	0
2	-3	0	0
3	-2	0	0
4	-1	0	0
5	0	0	0
6	0	1	1
7	0	2	1
8	0	3	1
9	0	4	1
10	0	5	1
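
The table above can be reproduced with a few lines of R. This is only a sketch of the variable definitions in Step 5, using the same illustrative values (x from 1 to 10 and xmax=5); the data frame name tab is an assumption.

```r
# Illustrative values from the example: x = 1..10, interruption at xmax = 5
x <- 1:10
xmax <- 5

xlow  <- ifelse(x <= xmax, x - xmax, 0)  # distance below xmax, 0 above it
xhigh <- ifelse(x >  xmax, x - xmax, 0)  # distance above xmax, 0 at or below it
high  <- ifelse(x >  xmax, 1, 0)         # dummy for observations above xmax

tab <- data.frame(x, xlow, xhigh, high)
print(tab)
```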

Step 6. Run the new regression y=c*xlow+d*xhigh+e*high

If c and d have opposite signs, the two lines form a u-shape.
If both c and d are also significant (p<.05), the u-shape is statistically significant.
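
The two checks in Step 6 could be automated along these lines. This is a hedged sketch, not the authors' code: the helper name two_lines_check, its alpha argument, and the simulated data are assumptions made for illustration.

```r
# Hypothetical helper: run the interrupted regression and apply both checks
two_lines_check <- function(x, y, xmax, alpha = 0.05) {
  xlow  <- ifelse(x <= xmax, x - xmax, 0)
  xhigh <- ifelse(x >  xmax, x - xmax, 0)
  high  <- ifelse(x >  xmax, 1, 0)
  co <- summary(lm(y ~ xlow + xhigh + high))$coefficients
  c_est <- co["xlow", "Estimate"]    # slope of the first line
  d_est <- co["xhigh", "Estimate"]   # slope of the second line
  opposite <- sign(c_est) * sign(d_est) < 0
  significant <- opposite &&
    co["xlow", "Pr(>|t|)"] < alpha &&
    co["xhigh", "Pr(>|t|)"] < alpha
  list(c = c_est, d = d_est,
       opposite_sign = opposite, significant = significant)
}

# Simulated data (assumed for illustration): inverted u-shape peaking near x = 0.5
set.seed(1)
x <- runif(300)
y <- x - x^2 + rnorm(300, sd = 0.05)

res <- two_lines_check(x, y, xmax = 0.5)
res$opposite_sign  # TRUE when the two slopes have opposite signs
res$significant    # TRUE when both slopes also have p < alpha
```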

Sample R Code

#DEMONSTRATION OF FITTING THE TWO LINES TO A QUADRATIC

#Generate x
x=sort(runif(n=300))
x2=x*x

#Generate error term
e=rnorm(n=300,sd=.05)

#Generate y
y=x-x2+e

#Step 1 Plot data
plot(x,y)

#Step 2 - Run the quadratic
bs=summary(lm(y~x+x2))$coefficients

#Step 3 - Find the point where it maxes out, which is -a/(2b) where
# y=ax+bx2
a=bs[2,1]      #This is the linear coefficient in the regression results
b=bs[3,1]      #This is the quadratic coefficient in the regression results
xmax=-a/(2*b) #This is the point where the (inverted) u-shape takes its (maximum) minimum value

#Step 4- Generate new predictors with interruption at xmax
#xlow=x-xmax when x<=xmax, 0 otherwise
    xlow=ifelse(x<=xmax,x-xmax,0)
#xhigh=x-xmax when x>xmax, 0 otherwise
    xhigh=ifelse(x>xmax,x-xmax,0)
#high dummy
    high=ifelse(x>xmax,1,0)

#Step 5 - Run the interrupted regression
    lm1=lm(y~xlow+xhigh+high)
    f1=fitted(lm1)

#Plot the two lines
plot(x,y,main="Two Straight lines",cex.main=1.5,cex=.75)
lines(x[x<xmax],f1[x<xmax],col='red',lwd=3)
lines(x[x>xmax],f1[x>xmax],col='red',lwd=3)
lines(x,fitted(lm(y~x+x2)),col="blue",lwd=1.5)
abline(v=xmax,col=3,lty=3)

#Step 6, verify the slopes of the two lines are of opposite sign and significant
summary(lm1)