Six Sigma
Normal Probability Plots
Categorical and Mixture Factors (Optional):
- “Categorical factors” are inputs that can assume only a finite number of levels and the ordering of these levels is ambiguous. For example, a categorical factor might be the supplier company that makes the component in question, which could be Intel, Panasonic, or RCA (three levels).
- Categorical factors are distinguished from continuous factors, which can assume, theoretically, any of an infinite number of levels which have a natural ordering.
- “Mixture factors” are inputs whose levels are constrained to sum to a constant. For example, these could be the components of a cake such as percent flour, water, and sugar. Percentages must total 100%.
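The mixture constraint can be illustrated with a minimal Python sketch. The helper name `is_valid_mixture` and the cake percentages are hypothetical, chosen only for illustration. The sketch also assumes that component levels cannot be negative.

```python
import math

def is_valid_mixture(components, total=100.0, tol=1e-9):
    """Return True if all levels are non-negative and sum to `total`."""
    values = list(components.values())
    return all(v >= 0 for v in values) and math.isclose(
        sum(values), total, abs_tol=tol
    )

# Hypothetical cake recipe, in percent of the total (illustrative numbers only).
cake = {"flour": 50.0, "water": 30.0, "sugar": 20.0}
print(is_valid_mixture(cake))

# Because the levels must total 100%, fixing flour and water determines sugar.
sugar = 100.0 - cake["flour"] - cake["water"]
print(sugar)
```

Because of the constraint, the levels of a mixture factor cannot all be varied independently: once all but one component are chosen, the last one is fixed.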
Regression with Categorical Factors:
- In general, categorical variables should be avoided as far as possible because their inclusion can greatly increase the number of terms in a model.
- A general rule is that the number of data points or runs needed to fit a model accurately is proportional to the number of terms. Often, engineering insight can permit the experimental team to address the same issue in planning experiments using either a continuous or a categorical factor.
- For example, color might be considered a categorical factor (e.g., levels might be “green” and “yellow”). At the same time, with suitable equipment it might be possible to address color issues by varying the wavelength of the light, e.g., using a prism. Then wavelength could be the experimental factor, resulting in savings in experimentation costs, an increase in prediction accuracy, or both.
- Note that some factors are not categorical even if one can only reliably create certain levels of them. For example, imagine that only the temperatures of 20 °C, 25 °C, and 100 °C are available in the laboratory because of experimental limitations.
- In this case, temperature is continuous and not categorical, since one might be interested in performance at 78 °C (i.e., all “in between” levels are conceivably possible). Also, 20 °C < 25 °C < 100 °C so the level ordering is not ambiguous.
- Generally, if categorical factors are at two levels, regression models based on categorical factors can be constructed in the same manner as for continuous factors. However, if one or more categorical factors have three or more levels, the situation is more complicated. Then, mathematical constructs called “contrasts” are created and treated like “mini-factors” in the analysis. If there are l levels of the categorical factor, then one creates l – 1 two-level contrasts.
- These contrasts function in a similar manner in calculations as factors for which experimentation has been conducted at two levels.
- Therefore, interaction terms can be fitted but pure quadratic or cubic terms cannot.
- There are multiple approaches for creating these contrasts that give the same predictions in all situations.
- The approach described here is to create the ith contrast with values equal to 1 if the categorical factor assumes the corresponding ith level, for i = 1,…, l – 1.
- Then, in the modeling, no terms involving interactions between these contrasts can be included, although interactions between contrasts and continuous factors can be included.
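The contrast construction above can be sketched in Python with NumPy. This is a minimal illustration, not a definitive implementation. The function names `make_contrasts` and `design_matrix` are hypothetical, as are the run layout and the temperature settings. The text does not state what value a contrast takes when the factor is not at the ith level, so the sketch assumes –1 and exposes this choice as the parameter `off_value`.

```python
import numpy as np

def make_contrasts(levels, observed, off_value=-1.0):
    """Build l - 1 two-level contrasts for a categorical factor with l levels.

    Column i equals 1 where the run uses levels[i] and `off_value` otherwise.
    The last level gets no column of its own. `off_value=-1.0` is an
    assumption; the source text does not specify it.
    """
    cols = [
        [1.0 if obs == lev else off_value for obs in observed]
        for lev in list(levels)[:-1]
    ]
    return np.array(cols, dtype=float).T  # shape: (runs, l - 1)

def design_matrix(contrasts, x):
    """Columns: intercept, continuous factor x, contrasts, contrast * x.

    Contrast-by-contrast products are deliberately left out.
    """
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    ones = np.ones_like(x)
    return np.hstack([ones, x, contrasts, contrasts * x])

# Hypothetical experiment: three suppliers run at two temperatures (°C).
suppliers = ["Intel", "Panasonic", "RCA"]
observed = ["Intel", "Panasonic", "RCA", "Intel", "Panasonic", "RCA"]
temperature = [20.0, 20.0, 20.0, 100.0, 100.0, 100.0]

C = make_contrasts(suppliers, observed)
X = design_matrix(C, temperature)
print(C.shape)  # (6, 2): l - 1 = 2 contrasts for l = 3 levels
print(X.shape)  # (6, 6): 1 + 1 + 2 + 2 model terms
```

With three suppliers, the model has six terms, compared with two (intercept and slope) for temperature alone. This illustrates how a categorical factor increases the number of terms, and therefore the number of runs needed.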