Response Surface Analysis Using SPSS
In my published work, I have conducted response surface analyses using SYSTAT. However, SYSTAT is less popular than SPSS, and people who ask me questions about response surface methodology often use SPSS for their research. This page provides guidelines for conducting response surface analyses using SPSS, focusing on the following quadratic polynomial regression equation:
(1) $Z = b_0 + b_1 X + b_2 Y + b_3 X^2 + b_4 XY + b_5 Y^2 + e$
Eq. 1 can be estimated using the REGRESSION or GLM modules of SPSS. The syntax given here uses the GLM procedure, which has algorithms that can be used to test response surface features. The GLM syntax for estimating the equation is:
GLM
z WITH x y x2 xy y2
/METHOD = SSTYPE(3)
/INTERCEPT = INCLUDE
/PRINT = DESCRIPTIVE PARAMETER
/DESIGN = x y x2 xy y2 .
where z is the dependent variable, x and y are scale-centered component measures, and x2, xy, and y2 are the three quadratic terms formed from x and y. This syntax will produce coefficient estimates and their associated F-tests as well as an F-test for the overall model (i.e., the test of the $R^2$ from the regression equation).
Now, suppose we wanted to test the shape of the surface corresponding to the quadratic regression equation along the Y = X line. As explained elsewhere (Edwards, 2002; Edwards & Parry, 1993), the shape of the surface along this line can be found by substituting Y = X into Eq. 1 and simplifying:
(2) $Z = b_0 + b_1 X + b_2 X + b_3 X^2 + b_4 XX + b_5 X^2 + e$
$= b_0 + (b_1 + b_2)X + (b_3 + b_4 + b_5)X^2 + e$
The linear combinations of coefficients that precede $X$ and $X^2$ can be tested by inserting an LMATRIX statement into the GLM code:
GLM
z WITH x y x2 xy y2
/METHOD = SSTYPE(3)
/INTERCEPT = INCLUDE
/PRINT = DESCRIPTIVE PARAMETER
/LMATRIX = x 1 y 1; x2 1 xy 1 y2 1
/DESIGN = x y x2 xy y2 .
The LMATRIX statement forms linear combinations of coefficients using the weights following each variable. For instance, the expression “x 1 y 1” assigns unit weights to the coefficients on x and y, which yields $b_1 + b_2$. Likewise, the expression “x2 1 xy 1 y2 1” assigns unit weights to x2, xy, and y2, yielding $b_3 + b_4 + b_5$. The semicolon between these two expressions instructs SPSS to test the two linear combinations jointly. Thus, the resulting test has two numerator degrees of freedom and evaluates whether the shape of the surface along the Y = X line has no slope or curvature, meaning the surface is flat along this line.
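To make the logic of the joint test concrete outside SPSS, the following is a minimal Python sketch of the same kind of test, assuming ordinary least squares and a standard F-test of the hypothesis that a set of linear combinations of coefficients equals zero. The function names, the simulated data, and the 7-point scale centered at its midpoint are all hypothetical and are used only for illustration; the sketch is not the algorithm SPSS uses internally, although it should give the same F statistic for the same data.

```python
import numpy as np

def design_matrix(x, y):
    """Columns: intercept, x, y, x^2, xy, y^2 (same order as b0..b5 in Eq. 1)."""
    return np.column_stack([np.ones_like(x), x, y, x**2, x * y, y**2])

def joint_f_test(X, z, L):
    """F statistic for H0: L @ b = 0 in an OLS regression of z on X."""
    n, p = X.shape
    b, *_ = np.linalg.lstsq(X, z, rcond=None)
    resid = z - X @ b
    mse = resid @ resid / (n - p)          # residual mean square
    xtx_inv = np.linalg.inv(X.T @ X)
    est = L @ b                            # the linear combinations being tested
    q = L.shape[0]                         # numerator degrees of freedom
    f_stat = est @ np.linalg.solve(L @ xtx_inv @ L.T, est) / (q * mse)
    return est, f_stat, q, n - p

# Weights mirror "/LMATRIX = x 1 y 1; x2 1 xy 1 y2 1" (intercept weight is 0).
L_YX = np.array([
    [0, 1, 1, 0, 0, 0],   # b1 + b2
    [0, 0, 0, 1, 1, 1],   # b3 + b4 + b5
], dtype=float)

# Simulated data (hypothetical): 7-point scales centered at the midpoint of 4.
rng = np.random.default_rng(0)
x = rng.integers(1, 8, size=200) - 4.0
y = rng.integers(1, 8, size=200) - 4.0
z = 5 + 0.3 * x - 0.2 * y - 0.1 * x**2 + 0.2 * x * y - 0.1 * y**2 + rng.normal(size=200)

est, f_stat, df1, df2 = joint_f_test(design_matrix(x, y), z, L_YX)
print("estimates:", est, "F:", f_stat, "df:", (df1, df2))
```

Each row of the weight matrix plays the role of one expression in the LMATRIX statement, and stacking two rows corresponds to separating the expressions with a semicolon.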
A similar approach can be used to test the shape of the surface along the Y = -X line. The shape of the surface along this line can be derived by substituting Y = -X into Eq. 1 and simplifying:
(3) $Z = b_0 + b_1 X - b_2 X + b_3 X^2 - b_4 XX + b_5 X^2 + e$
$= b_0 + (b_1 - b_2)X + (b_3 - b_4 + b_5)X^2 + e$
The corresponding GLM syntax is:
GLM
z WITH x y x2 xy y2
/METHOD = SSTYPE(3)
/INTERCEPT = INCLUDE
/PRINT = DESCRIPTIVE PARAMETER
/LMATRIX = x 1 y -1; x2 1 xy -1 y2 1
/DESIGN = x y x2 xy y2 .
The LMATRIX statement again forms linear combinations of coefficients using the numbers that follow each variable. In this case, the linear combinations are $(b_1 - b_2)$ and $(b_3 - b_4 + b_5)$, which are jointly tested for deviation from zero (i.e., flatness along the Y = -X line).
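As a worked illustration of Eqs. 2 and 3, the short Python sketch below computes the slope and curvature along both lines from a set of point estimates. The coefficient values are the full-sample estimates for the travel equation that appear in the MODEL PROGRAM line later on this page; the function name is hypothetical. The sketch yields point estimates only and does not replace the F-tests produced by the LMATRIX statement.

```python
# Full-sample estimates quoted in the MODEL PROGRAM line on this page.
b0, b1, b2, b3, b4, b5 = 5.964, 0.247, -0.131, -0.130, 0.231, -0.104

def line_shape(b1, b2, b3, b4, b5, sign):
    """Slope and curvature of the surface along Y = sign * X (sign is +1 or -1)."""
    slope = b1 + sign * b2            # coefficient on X in Eq. 2 or Eq. 3
    curvature = b3 + sign * b4 + b5   # coefficient on X^2 in Eq. 2 or Eq. 3
    return slope, curvature

slope_yx, curve_yx = line_shape(b1, b2, b3, b4, b5, +1)   # along Y = X
slope_ynx, curve_ynx = line_shape(b1, b2, b3, b4, b5, -1) # along Y = -X

print(f"Y =  X: slope {slope_yx:.3f}, curvature {curve_yx:.3f}")
print(f"Y = -X: slope {slope_ynx:.3f}, curvature {curve_ynx:.3f}")
```

The sign argument simply flips the weights on $b_2$ and $b_4$, which is the same change made when moving from “x 1 y 1; x2 1 xy 1 y2 1” to “x 1 y -1; x2 1 xy -1 y2 1” in the LMATRIX statement.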
The LMATRIX statements shown above can be simplified to test specific aspects of the shape of the surface along the Y = X and Y = -X lines. For example, to test the curvature of the surface along the Y = -X line, the expression “x 1 y -1;” can be dropped from the LMATRIX line, which yields:
GLM
z WITH x y x2 xy y2
/METHOD = SSTYPE(3)
/INTERCEPT = INCLUDE
/PRINT = DESCRIPTIVE PARAMETER
/LMATRIX = x2 1 xy -1 y2 1
/DESIGN = x y x2 xy y2 .
The resulting F-test will have one numerator degree of freedom, given that only one linear combination of coefficients (i.e., $b_3 - b_4 + b_5$) is being tested.
The procedures described above are appropriate only when testing linear combinations of coefficients. To test nonlinear combinations of coefficients, such as those involved in testing the locations of the stationary point and principal axes, the bootstrap should be used (Edwards, 2002), which can be implemented using the CNLR module of SPSS. To illustrate, consider the following syntax, which applies to the quadratic equation for travel shown in Table 11.3 of Edwards (2002):
SET RNG=MT MTINDEX=54321 .
MODEL PROGRAM b0=5.964 b1=0.247 b2=-0.131 b3=-0.130 b4=0.231 b5=-0.104 .
COMPUTE PRED = b0 + b1*tvlca + b2*tvlcd + b3*tvlca2 + b4*tvlcad + b5*tvlcd2 .
CNLR sat
/OUTFILE='C:\MYDATA\TVLBOOT.SAV'
/BOOTSTRAP=10000 .
The first line calls the Mersenne twister random number generator and sets the seed used to draw the bootstrap samples to 54321. Setting a seed ensures that the same bootstrap samples are drawn each time you run the analysis, so that you can repeat your analysis at a later time. You can choose any number for the seed.
The second line specifies starting values for the regression coefficients that will be estimated in each bootstrap sample. A sensible set of starting values is the coefficient estimates obtained from the full sample, which I have entered in this line.
The third line specifies the predictors in the regression equation. The terms tvlca, tvlcd, tvlca2, tvlcad, and tvlcd2 are variables in the data and correspond to $X$, $Y$, $X^2$, $XY$, and $Y^2$ in Table 11.3 of Edwards (2002).
The fourth line calls the CNLR procedure and specifies the dependent variable. In this case, the variable is sat, which refers to a job satisfaction variable in the data.
The fifth line indicates the name and location of the file that will contain the bootstrap estimates. I called the file TVLBOOT.SAV and put it in the folder MYDATA.
The last line indicates how many bootstrap samples to draw. In this case, the number of bootstrap samples is 10,000.
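For readers who want to see what the bootstrap is doing conceptually, the following Python sketch resamples cases with replacement and re-estimates the quadratic equation in each sample. It is an illustration under stated assumptions, not a reproduction of CNLR: it uses ordinary least squares rather than the CNLR estimation routine, NumPy's default generator rather than the Mersenne twister, and simulated data, so it will not reproduce the draws or estimates from SPSS. The function names and the simulated variables are hypothetical.

```python
import numpy as np

def ols(X, z):
    """Least-squares estimates of b0..b5."""
    b, *_ = np.linalg.lstsq(X, z, rcond=None)
    return b

def bootstrap_coefficients(X, z, n_boot, seed):
    """Row 0 holds the full-sample estimates; rows 1..n_boot hold bootstrap estimates."""
    rng = np.random.default_rng(seed)   # fixed seed so the same samples are redrawn
    n = len(z)
    rows = [ols(X, z)]
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)   # draw n cases with replacement
        rows.append(ols(X[idx], z[idx]))
    return np.vstack(rows)

# Simulated data standing in for the real variables (hypothetical values).
data_rng = np.random.default_rng(123)
x = data_rng.normal(size=300)
y = data_rng.normal(size=300)
z = 6 + 0.25 * x - 0.13 * y - 0.13 * x**2 + 0.23 * x * y - 0.10 * y**2 + data_rng.normal(size=300)
X = np.column_stack([np.ones_like(x), x, y, x**2, x * y, y**2])

coefs = bootstrap_coefficients(X, z, n_boot=500, seed=54321)
print(coefs.shape)   # one row per set of coefficients, one column per coefficient
```

Rerunning the function with the same seed returns the same rows, which is the purpose of setting the seed before the bootstrap is run.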
After these commands are executed, the file TVLBOOT.SAV will contain 10,001 sets of coefficients for the quadratic equation. The coefficients will appear as rows of data in the file, which can be read as a data file by SPSS. The first set contains the quadratic coefficients for the full sample, and the remaining 10,000 sets are the coefficients produced by the bootstrap samples. The SPSS data file I generated when I executed these commands is here, and coefficients are saved as an Excel file here. These coefficients can be used to construct confidence intervals for nonlinear combinations of regression coefficients involved in response surface analysis. This procedure is demonstrated by the Excel file given here, which explains the structure of the file in the READ ME tab and uses the coefficients from TVLBOOT.SAV in the SPSS tab.
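The general idea of building a confidence interval from the bootstrap coefficients can be sketched in Python as follows, using the stationary point as the nonlinear combination. The stationary point formulas follow from setting the partial derivatives of Eq. 1 with respect to $X$ and $Y$ to zero and solving for $X$ and $Y$. The sketch assumes a simple percentile interval, which may differ from the interval method used in the Excel file, and the array of bootstrap coefficients is made up for illustration rather than taken from TVLBOOT.SAV. The function names are hypothetical.

```python
import numpy as np

def stationary_point(b):
    """Stationary point (X0, Y0) of Eq. 1; the last axis of b holds b0..b5."""
    b = np.asarray(b, dtype=float)
    b1, b2, b3, b4, b5 = (b[..., i] for i in range(1, 6))
    det = 4 * b3 * b5 - b4**2             # unstable when close to zero
    x0 = (b2 * b4 - 2 * b1 * b5) / det
    y0 = (b1 * b4 - 2 * b2 * b3) / det
    return x0, y0

def percentile_ci(values, level=95.0):
    """Simple percentile interval from a set of bootstrap values."""
    tail = (100.0 - level) / 2.0
    lower, upper = np.percentile(values, [tail, 100.0 - tail])
    return lower, upper

# Made-up bootstrap rows (NOT the contents of TVLBOOT.SAV): 1,000 sets of b0..b5.
rng = np.random.default_rng(1)
center = np.array([5.0, 0.3, -0.2, -0.3, 0.1, -0.4])
fake_boot = center + rng.normal(scale=0.02, size=(1000, 6))

x0_boot, y0_boot = stationary_point(fake_boot)   # one stationary point per row
print("X0 interval:", percentile_ci(x0_boot))
print("Y0 interval:", percentile_ci(y0_boot))
```

With real output, the rows holding the bootstrap estimates (all rows after the first, which holds the full-sample estimates) would take the place of the made-up array.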
You can adapt this syntax and the associated files for your own purposes. Before you proceed, I strongly recommend that you read the sources cited in the READ ME tab of the Excel file and carefully study the structure of the file as described in the READ ME tab. Taking these steps should help you become self-sufficient with this procedure.
Edwards, J. R. (2002). Alternatives to difference scores: Polynomial regression analysis and response surface methodology. In F. Drasgow & N. W. Schmitt (Eds.), Advances in measurement and data analysis (pp. 350-400). San Francisco: Jossey-Bass.
Edwards, J. R., & Parry, M. E. (1993). On the use of polynomial regression equations as an alternative to difference scores in organizational research. Academy of Management Journal, 36, 1577-1613.