Six Sigma

Fitting Logit Models

Introduction:

Logit models are probably the most widely used logistic regression and discrete choice models, partly because the associated extreme value distribution makes logit models easy to work with mathematically.

 

Fitting Logit Models:

The following notation is used in fitting logit models:

1. c is the number of choice sets.

2. x_{j,s} is an m-vector of factor levels (attributes) of response level (alternative system) j in choice set s. In ordinary logistic regression cases, there is only one choice set (c = 1), and j is similar to the usual run index in an experimental design array.

3. m_s is the number of response levels (alternative systems) in choice set s.

4. n_s is the number of observed selections from choice set s.

5. β_{est,j} is the estimated coefficient vector reflecting the average utility of response level j as a function of the factor levels. Here, the focus is on the assumption that β_{est,j} = β_est for all j.

6. f_j(x) is the functional form of the model for response j (alternative system). Here, the focus is on the assumption that f_j(x) = f(x) for all j.

7. p_{j,s}(x, β_est) is the probability that response j with attributes x will be selected in choice set s.

8. y_{j,s} is the number of selections of alternative j in choice set s.

9. ln L(β_est) is the log-likelihood, which is the fitting objective.
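
For reference, the notation above can be assembled into the standard multinomial logit expressions. This is a sketch of the usual textbook form under the two simplifying assumptions stated in the list (one shared coefficient vector and one shared functional form for all j); the prime denotes a transpose, the index l is introduced here only for the sum over alternatives, and the constant multinomial term of the log-likelihood is omitted because it does not depend on the coefficients.

```latex
p_{j,s}(x, \beta_{est}) =
  \frac{\exp\!\big( f(x_{j,s})' \beta_{est} \big)}
       {\sum_{l=1}^{m_s} \exp\!\big( f(x_{l,s})' \beta_{est} \big)}

\ln L(\beta_{est}) =
  \sum_{s=1}^{c} \sum_{j=1}^{m_s} y_{j,s} \, \ln p_{j,s}(x, \beta_{est}),
  \qquad n_s = \sum_{j=1}^{m_s} y_{j,s}
```
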
  • The logit model has the so-called independence from irrelevant alternatives (IIA) property: a change in the attributes, x, of one alternative j necessarily changes all other choice probabilities while exactly preserving their relative magnitudes. This property is generally considered undesirable and motivates alternatives to logit-based logistic regression models.
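
The preservation of relative magnitudes can be checked numerically. The following is a minimal sketch, assuming the standard logit form in which each probability is proportional to the exponential of a linear utility; the attribute values, the coefficients, and the helper name `choice_probabilities` are hypothetical and chosen only for illustration.

```python
import numpy as np

def choice_probabilities(X, beta):
    """Logit choice probabilities for a single choice set.

    X    : (m_s, k) array; row j holds f(x_j,s) for alternative j.
    beta : (k,) coefficient vector shared by all alternatives.
    """
    utilities = X @ beta
    utilities = utilities - utilities.max()  # shift for numerical stability
    weights = np.exp(utilities)
    return weights / weights.sum()

# Hypothetical choice set: three alternatives described by two attributes.
X = np.array([[1.0, 0.5],
              [0.0, 1.0],
              [2.0, 0.0]])
beta = np.array([0.8, -0.4])  # illustrative values, not fitted estimates

p_before = choice_probabilities(X, beta)

# Change one attribute of the third alternative only.
X_changed = X.copy()
X_changed[2, 0] = 3.0
p_after = choice_probabilities(X_changed, beta)

# The first two probabilities both change, but their ratio does not.
ratio_before = p_before[0] / p_before[1]
ratio_after = p_after[0] / p_after[1]
print("before:", p_before, "ratio:", ratio_before)
print("after: ", p_after, "ratio:", ratio_after)
```

In this sketch the third alternative becomes more attractive, so the other two lose probability in equal proportion, which is the behavior described above.
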
  • Note also that some of the attributes associated with specific choices in choice sets could be attributes of the decision-makers, e.g., their incomes. This might not require changes to the above formulas, as illustrated by the next example. In general, many variations of the above approach are considered in the literature, with complications depending on the relevant assumptions and on the input pattern or design of experiments array.
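
Before turning to that example, the basic fitting step itself can be sketched numerically. The code below maximizes the log-likelihood by Newton's method, which is one common choice and not necessarily the procedure the source intends. It assumes a linear functional form and a single coefficient vector shared by all alternatives; the data, the function names `log_likelihood` and `fit_logit`, and the iteration settings are hypothetical.

```python
import numpy as np

def log_likelihood(beta, X_sets, y_sets):
    """Sum over choice sets s and alternatives j of y_j,s * ln p_j,s."""
    total = 0.0
    for X, y in zip(X_sets, y_sets):
        u = X @ beta
        # ln p_j = u_j - ln(sum_l exp(u_l)), computed in a stable way
        log_p = u - (u.max() + np.log(np.exp(u - u.max()).sum()))
        total += y @ log_p
    return total

def fit_logit(X_sets, y_sets, iterations=50, tol=1e-10):
    """Maximize the log-likelihood with Newton's method, starting at zero."""
    k = X_sets[0].shape[1]
    beta = np.zeros(k)
    for _ in range(iterations):
        grad = np.zeros(k)
        hess = np.zeros((k, k))
        for X, y in zip(X_sets, y_sets):
            u = X @ beta
            p = np.exp(u - u.max())
            p /= p.sum()
            n = y.sum()  # n_s, the number of selections from this set
            grad += X.T @ (y - n * p)
            hess -= n * (X.T @ (np.diag(p) - np.outer(p, p)) @ X)
        step = np.linalg.solve(hess, grad)
        beta = beta - step  # Newton update for a concave objective
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Hypothetical data: c = 2 choice sets, 3 alternatives each, 2 attributes.
X_sets = [np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]),
          np.array([[2.0, 0.0], [0.0, 2.0], [1.0, 0.5]])]
y_sets = [np.array([30.0, 20.0, 50.0]),
          np.array([45.0, 25.0, 30.0])]

beta_hat = fit_logit(X_sets, y_sets)
print("estimated coefficients:", beta_hat)
print("log-likelihood at estimate:", log_likelihood(beta_hat, X_sets, y_sets))
```

Newton's method is used here because the logit log-likelihood is concave in the coefficients, so a well-conditioned problem like this one typically converges in a handful of iterations; a general-purpose optimizer would serve equally well.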