Data Mining & Data Warehousing

AdaBoost, a Boosting Algorithm

Introduction:

Algorithm: AdaBoost. A boosting algorithm: it creates an ensemble of classifiers, each of which casts a weighted vote.

Input:

  • D, a set of d class-labeled training tuples;
  • k, the number of rounds (one classifier is generated per round);
  • a classification learning scheme.

Output: A composite model.

Method:

(1) initialize the weight of each tuple in D to 1/d;

(2) for i = 1 to k do // for each round:

(3) sample D with replacement according to the tuple weights to obtain Di;

(4) use training set Di to derive a model, Mi;

(5) compute error(Mi), the weighted error rate of Mi (Equation 6.66);

(6) if error(Mi) > 0.5 then

(7) reinitialize the weights to 1/d;

(8) go back to step 3 and try again;

(9) endif

(10) for each tuple in Di that was correctly classified do

(11) multiply the weight of the tuple by error(Mi)/(1 - error(Mi)); // update weights

(12) normalize the weight of each tuple;

(13) endfor
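The training method above can be sketched in Python. This is a minimal, illustrative implementation, not the book's code: the base learner is supplied by the caller as a pair of helpers, `learn(tuples, labels)` and `predict(model, x)` (hypothetical names), and a guard for the error(Mi) = 0 case is added because the pseudocode's normalization step would otherwise divide by zero.

```python
import random

def train_adaboost(D, labels, k, learn, predict):
    """Train an ensemble of k classifiers, following the method above.
    `learn` and `predict` are caller-supplied helpers (illustrative
    interface, not from the text)."""
    d = len(D)
    weights = [1.0 / d] * d                            # step (1): 1/d each
    models, errors = [], []
    for _ in range(k):                                 # step (2): each round
        while True:
            # step (3): sample D with replacement according to tuple weights
            idx = random.choices(range(d), weights=weights, k=d)
            Mi = learn([D[j] for j in idx], [labels[j] for j in idx])  # step (4)
            # step (5): weighted error rate of Mi over all of D
            err = sum(w for j, w in enumerate(weights)
                      if predict(Mi, D[j]) != labels[j])
            if err <= 0.5:
                break
            weights = [1.0 / d] * d                    # steps (6)-(9): reset, retry
        if err > 0.0:
            # steps (10)-(12): shrink weights of correctly classified
            # tuples, then renormalize
            factor = err / (1.0 - err)
            for j in range(d):
                if predict(Mi, D[j]) == labels[j]:
                    weights[j] *= factor
            total = sum(weights)
            weights = [w / total for w in weights]
        # (err == 0 is not covered by the pseudocode; weights are left
        # unchanged here to avoid normalizing a zero total)
        models.append(Mi)
        errors.append(err)
    return models, errors
```

Note that a tuple's weight is multiplied by error(Mi)/(1 - error(Mi)), which is less than 1 when error(Mi) < 0.5, so correctly classified tuples lose weight and misclassified tuples gain relative weight after normalization.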

To use the composite model to classify a tuple, X:

(1) initialize weight of each class to 0;

(2) for i = 1 to k do // for each classifier:

(3) wi = log((1 - error(Mi))/error(Mi)); // weight of the classifier’s vote

(4) c = Mi(X); // get class prediction for X from Mi

(5) add wi to the weight for class c;

(6) endfor

(7) return the class with the largest weight;
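The classification procedure above can be sketched as follows. This is an illustrative Python sketch: `predict(model, x)` is a hypothetical caller-supplied helper, and each error(Mi) is assumed to lie strictly between 0 and 0.5, so log((1 - error(Mi))/error(Mi)) is finite and positive.

```python
import math

def classify_composite(X, models, errors, predict):
    """Classify tuple X by weighted vote of the ensemble.
    `predict(model, x)` returns a model's class for x (illustrative
    interface). Assumes 0 < error(Mi) < 1 for every classifier."""
    class_weight = {}                          # step (1): all class weights start at 0
    for Mi, err in zip(models, errors):        # step (2): each classifier
        wi = math.log((1.0 - err) / err)       # step (3): weight of Mi's vote
        c = predict(Mi, X)                     # step (4): Mi's prediction for X
        class_weight[c] = class_weight.get(c, 0.0) + wi   # step (5)
    return max(class_weight, key=class_weight.get)        # step (7): heaviest class
```

The vote weight grows as error(Mi) shrinks: a classifier with error 0.2 contributes log(0.8/0.2) = log 4 per vote, while one with error 0.4 contributes only log(0.6/0.4) = log 1.5, so more accurate classifiers dominate the final decision.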