Instructions for fitting two lines to an observed u-shape
To fit two straight lines we need to identify the point where one
line ends and the other begins.
In the post we consider using the quadratic regression that everyone
uses to test for u-shapes to identify that point. We provide
instructions for doing so here.
There are other options for identifying the interruption point that
do not involve the quadratic and may have some advantages; we do
not consider them here.
Step 1. Plot the raw data.
Step 2. Estimate the quadratic regression y=a*x+b*x^2
Step 3. Verify the results
imply a u-shape within the range of observed data (e.g., plot the
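Step 4. If a u-shape is seen, identify the value of x where the u-shape
maxes out (or bottoms out, if the u is not inverted). Call that point xmax.
Simple calculus gets you that point: set the derivative a+2*b*x to zero
and solve for x, which gives xmax=-a/(2*b).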
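A minimal R sketch of Steps 2 through 4 may help here. The helper name quadratic_turning_point and the noise-free data are our own assumptions for illustration, not part of the post; note also that lm() adds an intercept by default.

```r
# Hypothetical helper (not from the post): fit the quadratic and locate
# the turning point of the fitted curve.
quadratic_turning_point <- function(x, y) {
  fit <- lm(y ~ x + I(x^2))      # lm() adds an intercept by default
  a <- unname(coef(fit)[2])      # linear coefficient
  b <- unname(coef(fit)[3])      # quadratic coefficient
  xmax <- -a / (2 * b)           # solves a + 2*b*x = 0
  list(a = a, b = b, xmax = xmax,
       in_range = xmax > min(x) && xmax < max(x))
}

# Made-up, noise-free data with a peak at x = 5
x <- 1:10
y <- 10 * x - x^2
tp <- quadratic_turning_point(x, y)
tp$xmax      # approximately 5
tp$in_range  # TRUE: the turning point lies inside the observed range
```

The in_range flag is one simple way to do the check in Step 3: a turning point outside the observed x values would mean the fitted curve is monotonic over the data.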
Step 5. Create new variables
to allow an interrupted regression.
We will have a variable for low values of x (xlow), one for high
values of x (xhigh) and a new dummy (high).
xlow=x-xmax if x<=xmax, 0 otherwise
xhigh=x-xmax if x>xmax, 0 otherwise
high=1 if x>xmax, 0 otherwise.
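That sounds more complicated than it is. Imagine
x=1,2,3,4,5,6,7,8,9,10 and xmax=5; the data would look like this:
 x   xlow  xhigh  high
 1    -4     0     0
 2    -3     0     0
 3    -2     0     0
 4    -1     0     0
 5     0     0     0
 6     0     1     1
 7     0     2     1
 8     0     3     1
 9     0     4     1
10     0     5     1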
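For readers who prefer to see it computed, here is a short R sketch of this toy example. It only applies the three definitions from Step 5 to x=1,...,10 with xmax=5; the data frame name d is our own choice.

```r
# Toy example from the text: x = 1..10 with the interruption at xmax = 5
x <- 1:10
xmax <- 5

xlow  <- ifelse(x <= xmax, x - xmax, 0)  # distance below xmax, 0 above it
xhigh <- ifelse(x >  xmax, x - xmax, 0)  # distance above xmax, 0 below it
high  <- ifelse(x >  xmax, 1, 0)         # dummy: 1 for the high segment

d <- data.frame(x, xlow, xhigh, high)
print(d)
```

Because both xlow and xhigh are centered at xmax, each gets its own slope in the regression, and the dummy high lets the second line start at a different level than the first one ends.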
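Step 6. Run the new regression of y on xlow, xhigh, and high.
Call the coefficient on xlow c and the coefficient on xhigh d.
If c and d have opposite signs, there is a u-shape.
If both are significant at p<.05, there is a statistically significant u-shape.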
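The two checks in Step 6 can be sketched in R as follows. The function name two_lines_test, the alpha argument, and the simulated data (seed, sample size, true model) are all assumptions made for illustration; they are not taken from the post.

```r
# Hypothetical helper: run the interrupted regression and apply both checks
two_lines_test <- function(x, y, xmax, alpha = 0.05) {
  xlow  <- ifelse(x <= xmax, x - xmax, 0)
  xhigh <- ifelse(x >  xmax, x - xmax, 0)
  high  <- ifelse(x >  xmax, 1, 0)
  cf <- summary(lm(y ~ xlow + xhigh + high))$coefficients
  c_slope <- cf["xlow", "Estimate"]    # slope of the first line
  d_slope <- cf["xhigh", "Estimate"]   # slope of the second line
  list(c = c_slope, d = d_slope,
       opposite_sign = sign(c_slope) * sign(d_slope) < 0,
       both_significant = cf["xlow", "Pr(>|t|)"] < alpha &&
                          cf["xhigh", "Pr(>|t|)"] < alpha)
}

# Assumed data: an inverted u-shape peaking at x = 5, plus noise
set.seed(1)
x <- runif(500, 0, 10)
y <- 10 * x - x^2 + rnorm(500, sd = 3)
res <- two_lines_test(x, y, xmax = 5)
res$opposite_sign     # slopes point in opposite directions
res$both_significant  # both slopes pass the p < alpha threshold
```

Here xmax is passed in by hand; in practice it would come from the quadratic fit in Step 4.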
Sample R Code
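#DEMONSTRATION OF FITTING THE TWO LINES TO A QUADRATIC
#NOTE: the simulation settings below (seed, n, true model) are assumed for illustration
set.seed(1)
n=500
x=runif(n,0,10)
#Generate error term
e=rnorm(n,sd=3)
y=10*x-x^2+e                    #Assumed true model: inverted u-shape peaking at x=5
#Step 1 - Plot data
plot(x,y,main="Raw data",cex.main=1.5,cex=.75)
#Step 2 - Run the quadratic
x2=x^2
bs=summary(lm(y~x+x2))$coefficients
#Step 3 - Find the point where it maxes out, which is -a/(2b) where
a=bs[2,1]                       #This is the linear coefficient in the regression results
b=bs[3,1]                       #This is the quadratic coefficient in the regression results
xmax=-a/(2*b)                   #This is the point where the (inverted) u-shape takes its (maximum) minimum value
#Step 4 - Generate new predictors with interruption at xmax
xlow=ifelse(x<=xmax,x-xmax,0)   #xlow=x-xmax when x<=xmax, 0 otherwise
xhigh=ifelse(x>xmax,x-xmax,0)   #xhigh=x-xmax when x>xmax, 0 otherwise
high=ifelse(x>xmax,1,0)         #high=1 when x>xmax, 0 otherwise
#Step 5 - Run the interrupted regression
fit=lm(y~xlow+xhigh+high)
#Plot the two lines
plot(x,y,main="Two Straight lines",cex.main=1.5,cex=.75)
o=order(x)                      #Sort by x so each fitted line is drawn left to right
lines(x[o][x[o]<=xmax],fitted(fit)[o][x[o]<=xmax],col="red",lwd=2)
lines(x[o][x[o]>xmax],fitted(fit)[o][x[o]>xmax],col="red",lwd=2)
#Step 6 - Verify the slopes on xlow and xhigh are of opposite sign and significant
summary(fit)$coefficients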