Six Sigma

Regression

Introduction: Regression is a family of curve-fitting methods for predicting average response performance for new combinations of factors and for understanding which factor changes cause changes in average outputs. Regression methods are perhaps the most widely used statistical or operations research techniques.

Regression:

1.       Although some people think of regression as merely the “curve fitting method” in Excel, the methods are surprisingly subtle, with much potential for misuse (and benefit).

2.       Some might call virtually all curve fitting methods “regression” but, more commonly, the term refers to a relatively small set of “linear regression” methods.

3.       In linear regression, predictions behave like a first-order polynomial in the coefficients. Models fit with terms like $\beta_{32} x_1^2 x_4$ are still called “linear” because the term is linear in $\beta_{32}$; i.e., if the coefficient $\beta_{32}$ increases, that term's contribution to the predicted response increases proportionally.
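The distinction between "linear in the coefficients" and "linear in the factors" can be checked numerically. The sketch below is only illustrative: the function `predict`, the three-term model form, and all coefficient and factor values are hypothetical and do not come from the text.

```python
def predict(beta, x1, x4):
    """Prediction of a hypothetical model: b0 + b1*x1 + b32*x1^2*x4."""
    b0, b1, b32 = beta
    return b0 + b1 * x1 + b32 * x1**2 * x4


# Hypothetical factor settings and coefficients (illustration only).
x1, x4 = 2.0, 3.0
base = predict((1.0, 0.5, 0.25), x1, x4)    # b32 = 0.25
bumped = predict((1.0, 0.5, 0.50), x1, x4)  # b32 raised by 0.25

# Linear in the coefficient: the change in prediction per unit change
# in b32 is the constant x1^2 * x4, whatever the value of b32.
rate = (bumped - base) / 0.25

# Not linear in the factor: doubling x1 quadruples the x1^2*x4 term.
term_at_x1 = 0.25 * x1**2 * x4
term_at_2x1 = 0.25 * (2 * x1) ** 2 * x4

print(base, bumped, rate, term_at_x1, term_at_2x1)
```

Because the prediction is linear in every coefficient, the same least-squares machinery used for a straight line still applies to a model containing such a term.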

Single Variable Example:

1.       Consider a single input factor and a single response variable with five responses (data points). In this case, fitting a first order model is equivalent to fitting a line through the data, as shown in Figure (b).

2.       The line shown seems like a good fit in the sense that the sum of squared distances from the data to the line is minimized. The resulting “best fit” line is $-26 + 32x_1$.

3.       The terms “residual” and “estimated error” refer to the deviation between the prediction given by the fitted model and the actual data value.
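The fitting criterion and the residual described in the list above can be written compactly. The notation here is an assumption introduced for illustration: $n$ data points $(x_{1,i}, y_i)$, coefficients $\beta_0$ and $\beta_1$, and residuals $e_i$ taken as actual minus predicted.

```latex
y_{est}(x_1) = \beta_0 + \beta_1 x_1, \qquad
e_i = y_i - y_{est}(x_{1,i}), \qquad
SSE = \sum_{i=1}^{n} e_i^2
```

Under this notation, the "best fit" line is the choice of $\beta_0$ and $\beta_1$ that makes $SSE$ as small as possible.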

 

 

The larger the residual in magnitude, the more concerned one might be that important factors unexplained by the model are influencing the observation in question. These concerns could lead us to fit a different model form and/or to investigate whether the data point in question constitutes an “outlier” that should be removed or changed because it does not represent the system of interest.

The phrase “trend analysis” refers to the application of regression to forecast future occurrences. Such regression modeling constitutes one of the most popular approaches for predicting demand or revenues.

(Demand Trend Analysis)

Example: A new product is released in two medium-sized cities. The demands in Month 1 were 28 and 32 units and in Month 2 were 55 and 45 units. Estimate the residuals for a first order regression model and use the model to forecast the demand in Month 3.

Answer: The best fit line is $y_{est}(x_1) = 10 + 20x_1$. This clearly minimizes almost any measure of the summed residuals, since it passes through the average responses at the two levels (30 units in Month 1 and 50 units in Month 2). The resulting residuals are –2, 2, 5, and –5. The forecast or prediction for Month 3 is 10 + 20 × 3 = 70 units.
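The answer above can be reproduced with a few lines of code. This is a minimal sketch using the textbook closed-form least-squares formulas for a single factor; the helper name `fit_line` and the variable names are hypothetical, and only the four demand values come from the example.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for a first order model."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Data from the example: two cities observed in each of two months.
months = [1, 1, 2, 2]
demand = [28, 32, 55, 45]

intercept, slope = fit_line(months, demand)

# Residual = actual demand minus the fitted model's prediction.
residuals = [y - (intercept + slope * x) for x, y in zip(months, demand)]

# Trend analysis: extrapolate the fitted line to Month 3.
forecast = intercept + slope * 3

print(intercept, slope, residuals, forecast)
```

With only two distinct months in the data, the fitted line is forced through the two monthly averages, which is why the residuals within each month are equal and opposite.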