LogitBoost Classifier

LogitBoost is a boosting classification algorithm. LogitBoost and AdaBoost are close to each other in the sense that both perform an additive logistic regression. The difference is that AdaBoost minimizes the exponential loss, whereas LogitBoost minimizes the logistic loss.
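To make the difference between the two losses concrete, the sketch below evaluates both on a few margins (the margin is the true label in {-1, +1} times the model score). The function names are illustrative, and the logistic loss is written in the common form log(1 + e^(-margin)); some references scale the margin differently.

```python
import math

def exponential_loss(margin):
    """Exponential loss exp(-margin), the criterion AdaBoost minimizes."""
    return math.exp(-margin)

def logistic_loss(margin):
    """Logistic loss log(1 + exp(-margin)); scaling conventions vary by reference."""
    return math.log1p(math.exp(-margin))

# Negative margins are misclassified points, positive margins are correct ones.
for margin in (-2.0, 0.0, 2.0):
    print(margin, exponential_loss(margin), logistic_loss(margin))
```

For strongly misclassified points (large negative margins) the exponential loss grows much faster than the logistic loss, which is why the logistic loss is usually considered less sensitive to outliers and label noise.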

LogitBoost within oneDAL implements a multi-class classifier.

Details

Given \(n\) feature vectors \(x_1 = (x_{11}, \ldots, x_{1p}), \ldots, x_n = (x_{n1}, \ldots, x_{np})\) of size \(p\) and a vector of class labels \(y= (y_1, \ldots, y_n)\), where \(y_i \in K = \{0, \ldots, J-1\}\) describes the class to which the feature vector \(x_i\) belongs and \(J\) is the number of classes, the problem is to build a multi-class LogitBoost classifier.

Training Stage

The LogitBoost model is trained using the Friedman method [Friedman00].

Let \(y_{ij} = I \{x_i \in j\}\) be the indicator that the \(i\)-th feature vector belongs to class \(j\). The scheme below, which uses the stump weak learner, shows the major steps of the algorithm:

  1. Start with weights \(w_{ij} = \frac{1}{n}\), \(F_j(x) = 0\), \(p_j(x) = \frac{1}{J}\), \(i = 1, \ldots, n\), \(j = 0, \ldots, J-1\).

  2. For \(m = 1, \ldots, M\):

    Do

    For \(j = 0, \ldots, J-1\)

    Do

    1. Compute working responses and weights in the j-th class:

      \[w_{ij} = p_j(x_i) (1 - p_j(x_i)), \quad w_{ij} = \max(w_{ij}, \text{Thr1})\]
      \[z_{ij} = \frac{y_{ij} - p_j(x_i)}{w_{ij}}, \quad z_{ij} = \min(\max(z_{ij}, -\text{Thr2}), \text{Thr2})\]

    2. Fit the function \(f_{mj}(x)\) by a weighted least-squares regression of \(z_{ij}\) to \(x_i\) with weights \(w_{ij}\) using the stump-based approach.

    End do

    \(f_{mj}(x) = \frac{J-1}{J} \left(f_{mj}(x) - \frac{1}{J} \sum_{k=0}^{J-1} f_{mk}(x)\right)\)

    \(F_j(x) = F_j(x) + f_{mj}(x)\)

    \(p_j(x) = \frac{e^{F_j(x)}}{\sum_{k=0}^{J-1} e^{F_k(x)}}\)

    End do

Here \(\text{Thr1}\) and \(\text{Thr2}\) are the thresholds that guard against degenerate cases when calculating the weights and the responses (see the weightsDegenerateCasesThreshold and responsesDegenerateCasesThreshold parameters).

The result of the model training is a set of stumps: one per class for each boosting iteration performed, that is, at most \(M \cdot J\) stumps.
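The training scheme can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the oneDAL implementation: the names `fit_stump`, `stump_predict`, and `train_logitboost` are hypothetical, a stump is represented as a `(feature, threshold, left_value, right_value)` tuple, split candidates are midpoints between distinct feature values, and the threshold values `thr1` and `thr2` are illustrative rather than the library defaults.

```python
import math

def softmax(scores):
    """p_j = exp(F_j) / sum_k exp(F_k), shifted by the max for numerical stability."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_mean(z, w, idx):
    return sum(w[i] * z[i] for i in idx) / sum(w[i] for i in idx)

def fit_stump(X, z, w):
    """Weighted least-squares regression stump (hypothetical helper)."""
    everything = range(len(X))
    mean = weighted_mean(z, w, everything)
    # Fallback when no split exists: a constant stump.
    best_err, best = float("inf"), (0, float("inf"), mean, mean)
    for f in range(len(X[0])):
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left_idx = [i for i in everything if X[i][f] <= thr]
            right_idx = [i for i in everything if X[i][f] > thr]
            left = weighted_mean(z, w, left_idx)
            right = weighted_mean(z, w, right_idx)
            err = sum(w[i] * (z[i] - left) ** 2 for i in left_idx)
            err += sum(w[i] * (z[i] - right) ** 2 for i in right_idx)
            if err < best_err:
                best_err, best = err, (f, thr, left, right)
    return best

def stump_predict(stump, x):
    feature, threshold, left, right = stump
    return left if x[feature] <= threshold else right

def train_logitboost(X, y, n_classes, max_iterations=5, thr1=1e-10, thr2=4.0):
    """Returns (model, P): per-iteration lists of J stumps and final training probabilities."""
    n, J = len(X), n_classes
    F = [[0.0] * J for _ in range(n)]        # additive scores F_j(x_i)
    P = [[1.0 / J] * J for _ in range(n)]    # probabilities p_j(x_i)
    model = []
    for _ in range(max_iterations):
        stumps = []
        for j in range(J):
            w, z = [], []
            for i in range(n):
                p = P[i][j]
                y_ij = 1.0 if y[i] == j else 0.0
                w_ij = max(p * (1.0 - p), thr1)              # clamp weights from below
                z_ij = (y_ij - p) / w_ij
                w.append(w_ij)
                z.append(min(max(z_ij, -thr2), thr2))        # clamp responses to [-thr2, thr2]
            stumps.append(fit_stump(X, z, w))
        model.append(stumps)
        for i in range(n):
            raw = [stump_predict(s, X[i]) for s in stumps]
            mean = sum(raw) / J
            for j in range(J):
                F[i][j] += (J - 1) / J * (raw[j] - mean)     # symmetrized update
            P[i] = softmax(F[i])
    return model, P

# Toy data: one feature, three well-separated classes.
X = [[0.0], [0.1], [0.2], [1.0], [1.1], [1.2], [2.0], [2.1], [2.2]]
y = [0, 0, 0, 1, 1, 1, 2, 2, 2]
model, P = train_logitboost(X, y, n_classes=3, max_iterations=5)
print(len(model), "iterations,", len(model[0]), "stumps per iteration")
```

The sketch always runs the full number of iterations; a stopping rule based on training accuracy is omitted for brevity.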

Prediction Stage

Given the LogitBoost classifier and \(r\) feature vectors \(x_1, \ldots, x_r\), the problem is to calculate the labels \(\underset{j}{\mathrm{argmax}} F_j(x)\) of the classes to which the feature vectors belong.
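The prediction rule can be illustrated with a short Python sketch. It assumes a hypothetical model layout in which each boosting iteration stores one stump per class as a `(feature, threshold, left_value, right_value)` tuple, and that the symmetrization from the training scheme is applied when accumulating \(F_j(x)\); the actual oneDAL model representation may differ.

```python
def stump_predict(stump, x):
    feature, threshold, left, right = stump
    return left if x[feature] <= threshold else right

def class_scores(model, x, n_classes):
    """Accumulate F_j(x) over all boosting iterations (hypothetical model layout)."""
    J = n_classes
    F = [0.0] * J
    for stumps in model:
        raw = [stump_predict(s, x) for s in stumps]
        mean = sum(raw) / J
        for j in range(J):
            F[j] += (J - 1) / J * (raw[j] - mean)
    return F

def predict(model, X, n_classes):
    """Label each feature vector with argmax_j F_j(x)."""
    labels = []
    for x in X:
        F = class_scores(model, x, n_classes)
        labels.append(max(range(n_classes), key=lambda j: F[j]))
    return labels

# Hand-written illustrative model: one iteration, three classes, one feature.
model = [[(0, 0.6, 3.0, -1.5), (0, 0.6, -1.5, 0.75), (0, 1.6, -1.5, 3.0)]]
print(predict(model, [[0.1], [1.1], [2.1]], n_classes=3))
```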

Batch Processing

The LogitBoost classifier follows the general workflow described in Classification Usage Model.

Training

For a description of the input and output, refer to Classification Usage Model.

At the training stage, a LogitBoost classifier has the following parameters:

Training Parameters for LogitBoost Classifier (Batch Processing)

| Parameter | Default Value | Description |
|---|---|---|
| algorithmFPType | float | The floating-point type that the algorithm uses for intermediate computations. Can be float or double. |
| method | defaultDense | The computation method used by the LogitBoost classifier. The only training method supported so far is the Friedman method. |
| weakLearnerTraining | DEPRECATED: Pointer to an object of the stump training class. USE INSTEAD: Pointer to an object of the regression stump training class. | DEPRECATED: Pointer to the training algorithm of the weak learner. By default, a stump weak learner is used. USE INSTEAD: Pointer to the regression training algorithm. By default, a regression stump with mse split criterion is used. |
| weakLearnerPrediction | DEPRECATED: Pointer to an object of the stump prediction class. USE INSTEAD: Pointer to an object of the regression stump prediction class. | DEPRECATED: Pointer to the prediction algorithm of the weak learner. By default, a stump weak learner is used. USE INSTEAD: Pointer to the regression prediction algorithm. By default, a regression stump with mse split criterion is used. |
| accuracyThreshold | \(0.01\) | LogitBoost training accuracy. |
| maxIterations | \(100\) | The maximal number of iterations for the LogitBoost algorithm. |
| nClasses | Not applicable | The number of classes, a required parameter. |
| weightsDegenerateCasesThreshold | \(1\mathrm{e}-10\) | The threshold to avoid degenerate cases when calculating weights \(w_{ij}\). |
| responsesDegenerateCasesThreshold | \(1\mathrm{e}-10\) | The threshold to avoid degenerate cases when calculating responses \(z_{ij}\). |

Prediction

For a description of the input and output, refer to Classification Usage Model.

At the prediction stage, a LogitBoost classifier has the following parameters:

Prediction Parameters for LogitBoost Classifier (Batch Processing)

| Parameter | Default Value | Description |
|---|---|---|
| algorithmFPType | float | The floating-point type that the algorithm uses for intermediate computations. Can be float or double. |
| method | defaultDense | Performance-oriented computation method, the only method supported by the LogitBoost classifier at the prediction stage. |
| weakLearnerPrediction | DEPRECATED: Pointer to an object of the stump prediction class. USE INSTEAD: Pointer to an object of the regression stump prediction class. | DEPRECATED: Pointer to the prediction algorithm of the weak learner. By default, a stump weak learner is used. USE INSTEAD: Pointer to the regression prediction algorithm. By default, a regression stump with mse split criterion is used. |
| nClasses | Not applicable | The number of classes, a required parameter. |

Note

Training terminates when the algorithm achieves the specified accuracy (accuracyThreshold) or reaches the specified maximal number of iterations (maxIterations). To determine the actual number of iterations performed, call the getNumberOfWeakLearners() method of the LogitBoostModel class and divide the result by nClasses.
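The arithmetic in this note can be written out as a small Python helper. The function name `iterations_performed` is hypothetical, and the input numbers are made up for illustration; in practice the first argument would be the value returned by getNumberOfWeakLearners(). The divisibility check assumes one weak learner per class per iteration.

```python
def iterations_performed(n_weak_learners, n_classes):
    """Number of boosting iterations, assuming one weak learner per class per iteration."""
    if n_classes <= 0:
        raise ValueError("n_classes must be positive")
    if n_weak_learners % n_classes != 0:
        raise ValueError("weak learner count is not a multiple of n_classes")
    return n_weak_learners // n_classes

# Illustrative numbers: 150 weak learners trained for a 5-class problem.
print(iterations_performed(150, 5))
```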

Examples

Batch Processing: