# Statistics In Six Sigma

**Summary Statistics:** In addition to correlation matrices and residual plots, several numbers called “**summary statistics**” often provide critical information about the adequacy of the model form in question. This section describes four summary statistics: $R^2$ adjusted, PRESS, $R^2$ prediction, and $\sigma_{est}$. Probably the most widely used summary statistic is the “$R^2$ **adjusted**”, also written “adjusted R-squared” or $R^2_{adj}$. This quantity is also sometimes called the “adjusted coefficient of multiple determination”. To calculate the adjusted R-squared, it is convenient to use an $n \times n$ matrix, **Q**, with every entry equaling 1.0. This permits calculation of the “sum of squares total” (SST) using

$$\mathrm{SST} = \mathbf{y}'\mathbf{y} - \frac{1}{n}\,\mathbf{y}'\mathbf{Q}\mathbf{y},$$

where **y** is the $n \times 1$ vector of response values.

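Given SST, the adjusted R-squared is

$$R^2_{adj} = 1 - \left(\frac{n-1}{n-k}\right)\frac{\mathrm{SSE}^{*}}{\mathrm{SST}},$$

where $k$ is the number of terms in the fitted model, $n$ is the number of data points, and $\mathrm{SSE}^{*}$ is the sum of squares error. It is common to interpret $R^2_{adj}$ as the “fraction of the variation in the response data explained by the model”.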
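The calculation can be sketched in a few lines of Python. This is only a minimal sketch: it assumes the standard forms SST = y'y − (1/n) y'Qy and R² adjusted = 1 − ((n − 1)/(n − k)) × SSE/SST, and the function names and data are hypothetical, not taken from the example in the text.

```python
def sum_of_squares_total(y):
    """SST = y'y - (1/n) * y'Qy, with Q an n x n matrix of ones (assumed form)."""
    n = len(y)
    Q = [[1.0] * n for _ in range(n)]
    y_t_y = sum(v * v for v in y)
    y_t_Q_y = sum(y[i] * Q[i][j] * y[j] for i in range(n) for j in range(n))
    return y_t_y - y_t_Q_y / n


def first_order_sse(x, y):
    """Least-squares fit of y = b0 + b1*x; returns the sum of squares error."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


def adjusted_r_squared(sse, sst, n, k):
    """R^2 adjusted = 1 - ((n - 1) / (n - k)) * (SSE / SST) (assumed form)."""
    return 1.0 - ((n - 1) / (n - k)) * (sse / sst)


# Hypothetical data, NOT the data from the text's example.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [12.0, 19.0, 33.0, 38.0, 51.0]

sst = sum_of_squares_total(y)
sse = first_order_sse(x, y)
# k = 2 terms in a first-order model: the constant and the x term.
r2_adj = adjusted_r_squared(sse, sst, n=len(y), k=2)
print(f"SST = {sst:.2f}, SSE = {sse:.2f}, R^2 adjusted = {r2_adj:.3f}")
```

Because every entry of **Q** is 1.0, y'Qy equals the square of the sum of the responses, so SST computed this way matches the familiar sum of squared deviations from the mean.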

**Example ($R^2$ Adjusted Calculations)**

**Question:** Calculate and interpret $R^2$ adjusted.

**Answer:** The following derive from previous results and definitions:

Therefore, with $n = 5$ data points, SST = 13720 and $R^2$ adjusted = 0.662, so that roughly 66% of the observed variation is explained by the first-order model in $x_1$.

The phrase **“cross-validation”** refers to efforts to evaluate prediction errors by reserving some of the data points solely for testing, *i.e.*, those points are used to check predictions and not to fit the model.
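One common way to do this is to hold out a single point at a time. The sketch below uses that leave-one-out variant with a first-order model. The choice of variant, the function names, and the data are all assumptions made for illustration, not the procedure or data from the text.

```python
def fit_first_order(x, y):
    """Least-squares fit of y = b0 + b1*x; returns (b0, b1)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    return y_bar - b1 * x_bar, b1


def leave_one_out_errors(x, y):
    """Prediction error at each point when that point is held out of the fit."""
    errors = []
    for i in range(len(x)):
        # Fit using every point except point i ...
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        b0, b1 = fit_first_order(x_train, y_train)
        # ... then use point i only for testing.
        errors.append(y[i] - (b0 + b1 * x[i]))
    return errors


# Hypothetical data, NOT the data from the text's example.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [12.0, 19.0, 33.0, 38.0, 51.0]

errors = leave_one_out_errors(x, y)
sum_sq_pred_error = sum(e * e for e in errors)
print(f"Sum of squared held-out prediction errors = {sum_sq_pred_error:.2f}")
```

The sum of squared held-out errors computed this way is the quantity commonly called PRESS. Each error comes from a point the fit never saw, so it is a more honest gauge of prediction accuracy than the ordinary residuals.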