BrownBoost Classifier

BrownBoost is a boosting classification algorithm. It is more robust to noisy data sets than other boosting classification algorithms [Freund99].

BrownBoost is a binary classifier. For a multi-class case, use Multi-class Classifier framework of the library.

Details

Given \(n\) feature vectors \(x_1 = (x_{11}, \ldots, x_{1p}), \ldots, x_n = (x_{n1}, \ldots, x_{np})\) of size \(p\) and a vector of class labels \(y= (y_1, \ldots, y_n)\), where \(y_i \in K = \{-1, 1\}\) describes the class to which the feature vector \(x_i\) belongs, and a weak learner algorithm, the problem is to build a two-class BrownBoost classifier.

Training Stage

The model is trained using the Freund method [Freund01] as follows:

  1. Calculate \(c = \mathrm{erfinv}^2(1 - \varepsilon)\), where:

    • \(\mathrm{erfinv}(x)\) is an inverse error function,

    • \(\varepsilon\) is a target classification error of the algorithm defined as \(\frac {1}{n} \sum _{i=1}^{n} |p(x_i) - y_i|\)

    • \(p(x) = \text{erf} \left(\frac {\sum _{i=1}^{M} \alpha_i h_i(x)}{\sqrt{c}}\right)\)

    • \(\mathrm{erf}(x)\) is the error function,

    • \(h_i(x)\) is a hypothesis formulated by the \(i\)-th weak learner, \(i = 1, \ldots, M\),

    • \(\alpha_i\) is the weight of the hypothesis.

  2. Set initial prediction values: \(r_1(x, y) = 0\).

  3. Set “remaining timing”: \(s_1 = c\).

  4. Do for \(i=1, 2, \ldots\) until \(s_{i+1} \leq 0\)

    1. With each feature vector and its label of positive weight, associate \(W_i(x, y) = e^{\frac {-(r_i(x, y) + s_i)^2}{c}}\).

    2. Call the weak learner with the distribution defined by normalizing Lmath:W_i(x, y) to receive a hypothesis \(h_i(x)\).

    3. Solve the differential equation

      \[\frac {dt}{d\alpha} = \gamma = \frac {\sum _{(x,y)} \exp (-\frac{1}{c} (r_i(x, y) + \alpha h_i(x) y + s_i - t)^2)h_i(x)y} {\sum _{(x,y)} \exp (-\frac{1}{c} (r_i(x, y) + \alpha h_i(x) y + s_i - t)^2)}\]

      with given boundary conditions \(t = 0\) and \(\alpha = 0\) to find \(t_i = t^{*} > 0\) and \(\alpha_i = \alpha^{*}\) such that either \(\gamma \leq ν\) or \(t^{*} = s_i\), where \(ν\) is a given small constant needed to avoid degenerate cases.

    4. Update the prediction values: \(r_{i+1}(x, y) = r_i(x, y) + \alpha_i h_i(x) y\).

    5. Update “remaining time”: \(s_{i+1} = s_i - t_i\).

    End do

The result of the model training is the array of \(M\) weak learners \(h_i\).

Prediction Stage

Given the BrownBoost classifier and \(r\) feature vectors \(x_1, \ldots, x_r\), the problem is to calculate the final classification confidence, a number from the interval \([-1, 1]\), using the rule:

\[p(x) = \text{erf} \left(\frac {\sum _{i=1}^{M} \alpha_i h_i (x)}{\sqrt{c}}\right)\]

Batch Processing

BrownBoost classifier follows the general workflow described in Classification Usage Model.

Training

For a description of the input and output, refer to Classification Usage Model.

At the training stage, a BrownBoost classifier has the following parameters:

Training Parameters for BrownBoost Classifier (Batch Processing)

Parameter

Default Value

Description

algorithmFPType

float

The floating-point type that the algorithm uses for intermediate computations. Can be float or double.

method

defaultDense

The computation method used by the BrownBoost classifier. The only training method supported so far is the Y. Freund’s method.

nClasses

\(2\)

The number of classes.

weakLearnerTraining

DEPRECATED: Pointer to an object of the weak learner training class

USE INSTEAD: Pointer to an object of the classification stump training class

DEPRECATED: Pointer to the training algorithm of the weak learner. By default, a stump weak learner is used.

USE INSTEAD: Pointer to the classifier training algorithm. Be default, a classification stump with gini split criterion is used.

weakLearnerPrediction

DEPRECATED: Pointer to an object of the weak learner prediction class

USE INSTEAD: Pointer to an object of the classification stump prediction class

DEPRECATED: Pointer to the prediction algorithm of the weak learner. By default, a stump weak learner is used.

USE INSTEAD: Pointer to the classifier prediction algorithm. Be default, a classification stump with gini split criterion is used.

accuracyThreshold

\(0.01\)

BrownBoost training accuracy \(\varepsilon\).

maxIterations

\(100\)

The maximal number of iterations for the BrownBoost algorithm.

newtonRaphsonAccuracyThreshold

\(1.0\mathrm{e}-3\)

Accuracy threshold of the Newton-Raphson method used underneath the BrownBoost algorithm.

newtonRaphsonMaxIterations

\(100\)

The maximal number of Newton-Raphson iterations in the algorithm.

degenerateCasesThreshold

\(1.0\mathrm{e}-2\)

The threshold used to avoid degenerate cases.

Prediction

For a description of the input and output, refer to Classification Usage Model.

At the prediction stage, a BrownBoost classifier has the following parameters:

Prediction Parameters for BrownBoost Classifier (Batch Processing)

Parameter

Default Value

Description

algorithmFPType

float

The floating-point type that the algorithm uses for intermediate computations. Can be float or double.

method

defaultDense

Performance-oriented computation method, the only method supported by the BrownBoost classifier.

nClasses

\(2\)

The number of classes.

weakLearnerPrediction

DEPRECATED: Pointer to an object of the weak learner prediction class

USE INSTEAD: Pointer to an object of the classification stump prediction class

DEPRECATED: Pointer to the prediction algorithm of the weak learner. By default, a stump weak learner is used.

USE INSTEAD: Pointer to the classifier prediction algorithm. Be default, a classification stump with gini split criterion is used.

accuracyThreshold

\(0.01\)

BrownBoost training accuracy \(\varepsilon\).

Examples

Batch Processing: