Response Surface Analysis Using SPSS

 

In my published work, I have conducted response surface analyses using SYSTAT. However, SYSTAT is less popular than SPSS, and people who ask me questions about response surface methodology often use SPSS for their research. This page provides guidelines for conducting response surface analyses using SPSS, focusing on the following quadratic polynomial regression equation.

 

(1) $Z = b_0 + b_1 X + b_2 Y + b_3 X^2 + b_4 XY + b_5 Y^2 + e$
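The three quadratic predictors in Eq. 1 are not in the raw data and have to be computed from X and Y before the equation is estimated. A minimal sketch of that step follows, written in Python purely for illustration. The function name `quadratic_terms` and the midpoint of 4 (which assumes a hypothetical 7-point scale) are assumptions, not part of the SPSS syntax on this page.

```python
def quadratic_terms(x_raw, y_raw, midpoint):
    """Center both component measures at the scale midpoint and
    form the five predictors of the quadratic equation."""
    x = x_raw - midpoint
    y = y_raw - midpoint
    return {"x": x, "y": y, "x2": x * x, "xy": x * y, "y2": y * y}


# One hypothetical respondent with raw scores 6 and 3 on a 1-7 scale.
print(quadratic_terms(6, 3, midpoint=4))
```

The keys of the returned dictionary match the variable names x, y, x2, xy, and y2 used in the GLM syntax.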

 

Eq. 1 can be estimated using the REGRESSION or GLM modules of SPSS. The syntax given here uses the GLM procedure, which has algorithms that can be used to test response surface features. The GLM syntax for estimating the equation is:

 

GLM z WITH x y x2 xy y2
  /METHOD = SSTYPE(3)
  /INTERCEPT = INCLUDE
  /PRINT = DESCRIPTIVE PARAMETER
  /DESIGN = x y x2 xy y2 .

 

where z is the dependent variable, x and y are scale-centered component measures, and x2, xy, and y2 are the three quadratic terms formed from x and y (x squared, x times y, and y squared). This syntax will produce coefficient estimates and their associated F-tests as well as an F-test for the overall model (i.e., the test of the $R^2$ from the regression equation).

Now, suppose we wanted to test the shape of the surface corresponding to the quadratic regression equation along the Y = X line. As explained elsewhere (Edwards, 2002; Edwards & Parry, 1993), the shape of the surface along this line can be found by substituting Y = X into Eq. 1 and simplifying:

 

(2) $Z = b_0 + b_1 X + b_2 X + b_3 X^2 + b_4 XX + b_5 X^2 + e$

    $= b_0 + (b_1 + b_2) X + (b_3 + b_4 + b_5) X^2 + e$

 

The linear combinations of coefficients that precede $X$ and $X^2$ can be tested by adding an LMATRIX statement to the GLM code:

 

GLM z WITH x y x2 xy y2
  /METHOD = SSTYPE(3)
  /INTERCEPT = INCLUDE
  /PRINT = DESCRIPTIVE PARAMETER
  /LMATRIX = x 1 y 1; x2 1 xy 1 y2 1
  /DESIGN = x y x2 xy y2 .

 

The LMATRIX statement forms linear combinations of coefficients using the weights following each variable. For instance, the expression x 1 y 1 assigns unit weights to the coefficients on x and y, which yields b1 + b2. Likewise, the expression x2 1 xy 1 y2 1 assigns unit weights to x2, xy, and y2, yielding b3 + b4 + b5. The semicolon between these two expressions instructs SPSS to test the two linear combinations jointly. Thus, the resulting test has two numerator degrees of freedom and evaluates whether the shape of the surface along the Y = X line has no slope or curvature, meaning the surface is flat along this line.
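To make the joint test more concrete, the F statistic for a set of linear combinations can be written as a Wald statistic: the combinations are weighted by the inverse of their estimated covariance matrix and divided by the number of combinations. The sketch below is plain Python, not SPSS output. The names `solve` and `wald_f` are hypothetical, the coefficient order (intercept, x, y, x2, xy, y2) is an assumption, and the toy inputs are made-up values chosen only so the arithmetic is easy to follow. The argument `cov` stands for the estimated covariance matrix of the coefficients, which would have to be obtained separately.

```python
def solve(a, b):
    """Solve the small linear system a @ x = b by Gaussian elimination."""
    n = len(b)
    # Augmented matrix, copied so the inputs are not modified.
    m = [[float(v) for v in a[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry into place.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("contrast rows are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def wald_f(contrasts, coefs, cov):
    """F statistic for H0: every row of `contrasts` times `coefs` is zero."""
    q = len(contrasts)  # numerator degrees of freedom
    k = len(coefs)
    # Estimated linear combinations, one per contrast row.
    lb = [sum(row[j] * coefs[j] for j in range(k)) for row in contrasts]
    # Covariance matrix of those combinations: L V L'.
    lv = [[sum(row[i] * cov[i][j] for i in range(k)) for j in range(k)]
          for row in contrasts]
    lvl = [[sum(lv[r][j] * contrasts[s][j] for j in range(k)) for s in range(q)]
           for r in range(q)]
    weighted = solve(lvl, lb)
    return sum(lb[r] * weighted[r] for r in range(q)) / q


# Contrast rows matching "x 1 y 1; x2 1 xy 1 y2 1".
contrasts = [[0, 1, 1, 0, 0, 0],
             [0, 0, 0, 1, 1, 1]]
toy_coefs = [0.0, 1.0, 1.0, 0.0, 0.0, 0.0]  # made-up values
toy_cov = [[1.0 if i == j else 0.0 for j in range(6)] for i in range(6)]  # made-up
print(wald_f(contrasts, toy_coefs, toy_cov))
```

With two contrast rows the statistic has two numerator degrees of freedom, which matches the joint test requested by the semicolon in the LMATRIX statement.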

 

A similar approach can be used to test the shape of the surface along the Y = -X line. The shape of the surface along this line can be derived by substituting Y = -X into Eq. 1 and simplifying:

 

(3) $Z = b_0 + b_1 X - b_2 X + b_3 X^2 - b_4 XX + b_5 X^2 + e$

    $= b_0 + (b_1 - b_2) X + (b_3 - b_4 + b_5) X^2 + e$

 

The corresponding GLM syntax is:

 

GLM z WITH x y x2 xy y2
  /METHOD = SSTYPE(3)
  /INTERCEPT = INCLUDE
  /PRINT = DESCRIPTIVE PARAMETER
  /LMATRIX = x 1 y -1; x2 1 xy -1 y2 1
  /DESIGN = x y x2 xy y2 .

 

The LMATRIX statement again forms linear combinations of coefficients using the numbers that follow each variable. In this case, the linear combinations are (b1 - b2) and (b3 - b4 + b5), which are jointly tested for deviation from zero (i.e., flatness along the Y = -X line).
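As a worked example of the four linear combinations, the sketch below computes the slope and curvature along both lines from a set of coefficient estimates. It is written in Python only to show the arithmetic, and the function name `surface_along_lines` is hypothetical. The input values are the full-sample estimates for the travel equation that appear as starting values in the bootstrap syntax further down this page.

```python
def surface_along_lines(b1, b2, b3, b4, b5):
    """Slope and curvature of the quadratic surface along Y = X and Y = -X."""
    return {
        "slope along Y = X": b1 + b2,
        "curvature along Y = X": b3 + b4 + b5,
        "slope along Y = -X": b1 - b2,
        "curvature along Y = -X": b3 - b4 + b5,
    }


# Full-sample estimates b1..b5 for the travel equation (Edwards, 2002, Table 11.3).
shape = surface_along_lines(0.247, -0.131, -0.130, 0.231, -0.104)
for name, value in shape.items():
    print(f"{name}: {value:.3f}")
```

These are point estimates only. Whether each combination differs from zero is what the LMATRIX F-tests evaluate.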

 

The LMATRIX statements shown above can be simplified to test specific aspects of the shape of the surface along the Y = X and Y = -X lines. For example, to test the curvature of the surface along the Y = -X line, the expression x 1 y -1; can be dropped from the LMATRIX line, which yields:

 

GLM z WITH x y x2 xy y2
  /METHOD = SSTYPE(3)
  /INTERCEPT = INCLUDE
  /PRINT = DESCRIPTIVE PARAMETER
  /LMATRIX = x2 1 xy -1 y2 1
  /DESIGN = x y x2 xy y2 .

 

The resulting F-test will have one numerator degree of freedom, given that only one linear combination of coefficients (i.e., b3 - b4 + b5) is being tested.

 

The procedures described above are appropriate only when testing linear combinations of coefficients. To test nonlinear combinations of coefficients, such as those involved in testing the locations of the stationary point and principal axes, the bootstrap should be used (Edwards, 2002). The bootstrap can be implemented using the CNLR (constrained nonlinear regression) procedure of SPSS. To illustrate, consider the following syntax, which applies to the quadratic equation for travel shown in Table 11.3 of Edwards (2002):

 

SET RNG=MT MTINDEX=54321 .
MODEL PROGRAM b0=5.964 b1=0.247 b2=-0.131 b3=-0.130 b4=0.231 b5=-0.104 .
COMPUTE PRED = b0 + b1*tvlca + b2*tvlcd + b3*tvlca2 + b4*tvlcad + b5*tvlcd2 .
CNLR sat
  /OUTFILE='C:\MYDATA\TVLBOOT.SAV'
  /BOOTSTRAP=10000 .

 

The first line selects the Mersenne Twister random number generator and sets the seed used to draw the bootstrap samples to 54321. Setting a seed ensures that the same bootstrap samples are drawn each time you run the analysis, so you can reproduce your results at a later time. You can choose any number for the seed.

 

The second line names the regression coefficients to be estimated in each bootstrap sample and gives their starting values. A sensible set of starting values is the coefficient estimates obtained from the full sample, which I have entered in this line.

 

The third line defines the predicted value, PRED, as the quadratic regression equation. The terms tvlca, tvlcd, tvlca2, tvlcad, and tvlcd2 are variables in the data and correspond to $X$, $Y$, $X^2$, $XY$, and $Y^2$ in Table 11.3 of Edwards (2002).

 

The fourth line calls the CNLR procedure and specifies the dependent variable. In this case, the variable is sat, which refers to a job satisfaction variable in the data.

 

The fifth line indicates the name and location of the file that will contain the bootstrap estimates. I called the file TVLBOOT.SAV and put it in the folder MYDATA.

 

The last line indicates how many bootstrap samples to draw. In this case, the number of bootstrap samples is 10,000.
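To see why fixing the seed makes the bootstrap repeatable, the idea can be illustrated outside SPSS. The sketch below is Python, whose `random` module also uses a Mersenne Twister generator. The function name `bootstrap_indices` and the tiny sample sizes are hypothetical. Because SPSS and Python seed their generators differently, the same seed will not reproduce the samples SPSS draws. The point is only that a fixed seed gives identical resamples on every run.

```python
import random


def bootstrap_indices(n_cases, n_samples, seed):
    """Draw case indices with replacement for each bootstrap sample."""
    rng = random.Random(seed)  # generator with a fixed, private seed
    return [[rng.randrange(n_cases) for _ in range(n_cases)]
            for _ in range(n_samples)]


first = bootstrap_indices(n_cases=5, n_samples=3, seed=54321)
second = bootstrap_indices(n_cases=5, n_samples=3, seed=54321)
print(first == second)  # True: same seed, same resamples
```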

 

After these commands are executed, the file TVLBOOT.SAV will contain 10,001 sets of coefficients for the quadratic equation. The coefficients will appear as rows of data in the file, which can be read as a data file by SPSS. The first set contains the quadratic coefficients for the full sample, and the remaining 10,000 sets are the coefficients produced by the bootstrap samples. The SPSS data file I generated when I executed these commands is here, and coefficients are saved as an Excel file here. These coefficients can be used to construct confidence intervals for nonlinear combinations of regression coefficients involved in response surface analysis. This procedure is demonstrated by the Excel file given here, which explains the structure of the file in the READ ME tab and uses the coefficients from TVLBOOT.SAV in the SPSS tab.
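As a rough illustration of how the bootstrap coefficients can be used, the sketch below computes the stationary point of the surface for each bootstrap sample and forms a confidence interval from the resulting values. It is written in Python and makes several assumptions: the function names are hypothetical, the coefficients are assumed to be already loaded as rows of (b1, b2, b3, b4, b5), and simple percentile intervals are used, which may differ from the interval constructed in the Excel file. The stationary point formulas follow from setting the partial derivatives of Eq. 1 with respect to X and Y to zero.

```python
def stationary_point(b1, b2, b3, b4, b5):
    """Stationary point (X0, Y0) of Z = b0 + b1X + b2Y + b3X^2 + b4XY + b5Y^2."""
    det = 4.0 * b3 * b5 - b4 ** 2
    if det == 0:
        raise ZeroDivisionError("surface has no unique stationary point")
    x0 = (b2 * b4 - 2.0 * b1 * b5) / det
    y0 = (b1 * b4 - 2.0 * b2 * b3) / det
    return x0, y0


def percentile_interval(values, level=0.95):
    """Percentile interval: the values at the lower and upper tail ranks."""
    ordered = sorted(values)
    alpha = (1.0 - level) / 2.0
    last = len(ordered) - 1
    return ordered[round(alpha * last)], ordered[round((1.0 - alpha) * last)]


def stationary_point_intervals(rows, level=0.95):
    """rows[0] holds the full-sample estimates and is skipped;
    the remaining rows are the bootstrap estimates."""
    points = [stationary_point(*row) for row in rows[1:]]
    x_interval = percentile_interval([p[0] for p in points], level)
    y_interval = percentile_interval([p[1] for p in points], level)
    return x_interval, y_interval
```

Calling `stationary_point_intervals(rows)` with the 10,001 rows described above would return one interval for X0 and one for Y0. When the denominator 4*b3*b5 - b4^2 is close to zero in some bootstrap samples, the stationary point can fall far outside the range of the data, so the intervals should be interpreted with care.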

 

You can adapt this syntax and the associated files for your own purposes. Before you proceed, I strongly recommend that you read the sources cited in the READ ME tab of the Excel file and carefully study the structure of the file as described in the READ ME tab. Taking these steps should help you become self-sufficient with this procedure.

 

Edwards, J. R. (2002). Alternatives to difference scores: Polynomial regression analysis and response surface methodology. In F. Drasgow & N. W. Schmitt (Eds.), Advances in measurement and data analysis (pp. 350-400). San Francisco: Jossey-Bass.

 

Edwards, J. R., & Parry, M. E. (1993). On the use of polynomial regression equations as an alternative to difference scores in organizational research. Academy of Management Journal, 36, 1577-1613.