AdaBoost Classifier

AdaBoost (short for “Adaptive Boosting”) is a popular boosting classification algorithm. The AdaBoost algorithm performs well on a variety of data sets, although it can be sensitive to noisy data [Freund99].

AdaBoost is a binary classifier. For the multi-class case, use the Multi-class Classifier framework of the library.

Details

Given \(n\) feature vectors \(x_1 = (x_{11}, \ldots, x_{1p}), \ldots, x_n = (x_{n1}, \ldots, x_{np})\) of size \(p\) and a vector of class labels \(y= (y_1, \ldots, y_n)\), where \(y_i \in K = \{-1, 1\}\) describes the class to which the feature vector \(x_i\) belongs, and a weak learner algorithm, the problem is to build an AdaBoost classifier.
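
For concreteness, a minimal toy data set of this form (values chosen purely for illustration) could be set up as follows:

```python
import numpy as np

# n = 6 feature vectors of size p = 2, with class labels from K = {-1, 1}.
# The values are made up only to illustrate the expected shapes.
X = np.array([[0.1, 1.2],
              [0.4, 0.9],
              [1.5, 0.3],
              [1.8, 0.1],
              [0.2, 1.0],
              [1.6, 0.4]])
y = np.array([-1, -1, 1, 1, -1, 1])
```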

Training Stage

The following scheme shows the major steps of the algorithm:

  1. Initialize weights \(D_1(i) = \frac{1}{n}\) for \(i = 1, \ldots, n\).

  2. For \(t = 1, \ldots, T\):

    1. Train the weak learner \(h_t(x) \in \{-1, 1\}\) using weights \(D_t\).

    2. Choose a confidence value \(\alpha_t\).

    3. Update \(D_{t+1}(i) = \frac {D_t(i)\exp(-\alpha_t y_i h_t(x_i))} {Z_t}\), where \(Z_t\) is a normalization factor chosen so that \(D_{t+1}\) sums to one.

  3. Output the final hypothesis:

    \[H(x_i) = \mathrm{sign} \left( \sum _{t=1}^{T} \alpha_t h_t(x_i)\right)\]
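
A minimal sketch of this scheme in Python/numpy is shown below, assuming a one-level decision stump as the weak learner and the common choice \(\alpha_t = \frac{1}{2}\ln\frac{1-\epsilon_t}{\epsilon_t}\) for the confidence value, where \(\epsilon_t\) is the weighted error of \(h_t\). The library's actual weak learner, choice of \(\alpha_t\), and stopping criteria may differ; this is illustrative only.

```python
import numpy as np

def train_stump(X, y, D):
    """Weak learner: a one-level decision stump fitted to the weights D.
    Returns ((feature, threshold, polarity), weighted_error).
    An illustrative stand-in for the library's stump weak learner."""
    best, best_err = (0, X[0, 0], 1), np.inf
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for polarity in (1, -1):
                pred = np.where(X[:, j] >= thr, polarity, -polarity)
                err = D[pred != y].sum()
                if err < best_err:
                    best, best_err = (j, thr, polarity), err
    return best, best_err

def stump_predict(stump, X):
    j, thr, polarity = stump
    return np.where(X[:, j] >= thr, polarity, -polarity)

def adaboost_train(X, y, T=100):
    n = X.shape[0]
    D = np.full(n, 1.0 / n)                       # step 1: D_1(i) = 1/n
    stumps, alphas = [], []
    for _ in range(T):                            # step 2: t = 1, ..., T
        stump, err = train_stump(X, y, D)         # 2.1: train h_t using weights D_t
        err = np.clip(err, 1e-10, 1 - 1e-10)      # guard against log(0) and division by zero
        alpha = 0.5 * np.log((1 - err) / err)     # 2.2: one common choice of alpha_t
        D = D * np.exp(-alpha * y * stump_predict(stump, X))   # 2.3: reweight the samples
        D /= D.sum()                              # Z_t: normalize so D_{t+1} sums to one
        stumps.append(stump)
        alphas.append(alpha)
    return stumps, np.array(alphas)
```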

Prediction Stage

Given the AdaBoost classifier and \(r\) feature vectors \(x_1, \ldots, x_r\), the problem is to compute the final class labels:

\[H(x_i) = \mathrm{sign} \left( \sum _{t=1}^{T} \alpha_t h_t(x_i)\right)\]
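
In terms of the quantities produced at the training stage, the final class is the sign of the \(\alpha\)-weighted vote of the \(T\) weak learners. A minimal numpy sketch (array names are illustrative, not the library's API):

```python
import numpy as np

def final_hypothesis(weak_preds, alphas):
    """H(x_i) = sign(sum_t alpha_t * h_t(x_i)).

    weak_preds: (T, r) array; row t holds h_t(x_1), ..., h_t(x_r) in {-1, 1}.
    alphas:     (T,) array of confidence values alpha_t.
    """
    return np.sign(alphas @ weak_preds)   # np.sign returns 0 only for an exactly tied vote
```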

Batch Processing

The AdaBoost classifier follows the general workflow described in Classification Usage Model.

Training

For a description of the input and output, refer to Classification Usage Model.

At the training stage, an AdaBoost classifier has the following parameters:

Training Parameters for AdaBoost Classifier (Batch Processing)

| Parameter | Default Value | Description |
|---|---|---|
| algorithmFPType | float | The floating-point type that the algorithm uses for intermediate computations. Can be float or double. |
| method | defaultDense | The computation method used by the AdaBoost classifier. The only training method supported so far is the method proposed by Y. Freund. |
| weakLearnerTraining | Pointer to an object of the stump training class | Pointer to the training algorithm of the weak learner. By default, a stump weak learner is used. |
| weakLearnerPrediction | Pointer to an object of the stump prediction class | Pointer to the prediction algorithm of the weak learner. By default, a stump weak learner is used. |
| accuracyThreshold | \(0.01\) | AdaBoost training accuracy. |
| maxIterations | \(100\) | The maximal number of iterations of the algorithm. |
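
The two numeric parameters bound the boosting loop. Reusing the helpers from the training sketch above, they could enter the loop as in the sketch below; the library's exact stopping criterion is not documented here, so treating accuracyThreshold as an early-exit bound on the weak learner's weighted error is an assumption made only for illustration.

```python
def adaboost_train_bounded(X, y, accuracy_threshold=0.01, max_iterations=100):
    """Variant of adaboost_train above; reuses train_stump and stump_predict.
    Stops after max_iterations rounds, or earlier once the weighted error of the
    current weak learner falls below accuracy_threshold (an assumed reading of
    the accuracyThreshold parameter, for illustration only)."""
    n = X.shape[0]
    D = np.full(n, 1.0 / n)
    stumps, alphas = [], []
    for _ in range(max_iterations):                       # maxIterations caps the rounds
        stump, err = train_stump(X, y, D)
        err = np.clip(err, 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)
        stumps.append(stump)
        alphas.append(alpha)
        if err < accuracy_threshold:                      # assumed early-exit criterion
            break
        D = D * np.exp(-alpha * y * stump_predict(stump, X))
        D /= D.sum()
    return stumps, np.array(alphas)
```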

Prediction

For a description of the input and output, refer to Classification Usage Model.

At the prediction stage, an AdaBoost classifier has the following parameters:

Prediction Parameters for AdaBoost Classifier (Batch Processing)

| Parameter | Default Value | Description |
|---|---|---|
| algorithmFPType | float | The floating-point type that the algorithm uses for intermediate computations. Can be float or double. |
| method | defaultDense | Performance-oriented computation method, the only method supported by the AdaBoost classifier at the prediction stage. |
| weakLearnerPrediction | Pointer to an object of the stump prediction class | Pointer to the prediction algorithm of the weak learner. By default, a stump weak learner is used. |
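
Putting the sketches above together, a minimal end-to-end run on the toy data set might look like this (illustrative only; for the library's actual batch-processing API, see the examples referenced below):

```python
# End-to-end run of the sketches above on the toy data set (illustrative only).
stumps, alphas = adaboost_train_bounded(X, y, accuracy_threshold=0.01, max_iterations=100)
weak_preds = np.array([stump_predict(s, X) for s in stumps])
print(final_hypothesis(weak_preds, alphas))   # predicted labels in {-1, 1}
```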

Examples

Batch Processing: