BrownBoost Classifier¶
BrownBoost is a boosting classification algorithm. It is more robust to noisy data sets than other boosting classification algorithms [Freund99].
BrownBoost is a binary classifier. For a multi-class problem, use the Multi-class Classifier framework of the library.
Details¶
Given \(n\) feature vectors \(x_1 = (x_{11}, \ldots, x_{1p}), \ldots, x_n = (x_{n1}, \ldots, x_{np})\) of size \(p\) and a vector of class labels \(y= (y_1, \ldots, y_n)\), where \(y_i \in K = \{-1, 1\}\) describes the class to which the feature vector \(x_i\) belongs, and a weak learner algorithm, the problem is to build a two-class BrownBoost classifier.
Training Stage¶
The model is trained using the Freund method [Freund01] as follows:
Calculate \(c = \mathrm{erfinv}^2(1 - \varepsilon)\) (a numerical sketch follows these definitions), where:
\(\mathrm{erfinv}(x)\) is the inverse error function,
\(\varepsilon\) is the target classification error of the algorithm, defined as \(\frac {1}{n} \sum _{i=1}^{n} |p(x_i) - y_i|\),
\(p(x) = \mathrm{erf} \left(\frac {\sum _{i=1}^{M} \alpha_i h_i(x)}{\sqrt{c}}\right)\),
\(\mathrm{erf}(x)\) is the error function,
\(h_i(x)\) is a hypothesis formulated by the \(i\)-th weak learner, \(i = 1, \ldots, M\),
\(\alpha_i\) is the weight of the hypothesis.
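For illustration, \(c\) can be evaluated numerically. The sketch below is not part of the library: it inverts `std::erf` with Newton's method, using \(\frac{d}{dx}\mathrm{erf}(x) = \frac{2}{\sqrt{\pi}}e^{-x^2}\); the helper name `erfinv` and the tolerance are assumptions made for this example.

```cpp
#include <cmath>
#include <cstdio>

// Invert the error function with Newton's method.
// d/dx erf(x) = 2/sqrt(pi) * exp(-x^2), so each step is
// x -= (erf(x) - y) / (2/sqrt(pi) * exp(-x^2)).
static double erfinv(double y) {          // valid for y in (-1, 1)
    const double kPi = 3.14159265358979323846;
    double x = 0.0;                       // erf(0) = 0 is a safe start
    for (int it = 0; it < 100; ++it) {
        const double err = std::erf(x) - y;
        if (std::fabs(err) < 1e-12) break;
        x -= err / (2.0 / std::sqrt(kPi) * std::exp(-x * x));
    }
    return x;
}

int main() {
    const double epsilon = 0.01;          // target classification error
    const double c = std::pow(erfinv(1.0 - epsilon), 2.0);
    std::printf("c = erfinv^2(1 - epsilon) = %.4f\n", c);  // ~3.3175
    return 0;
}
```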
Set initial prediction values: \(r_1(x, y) = 0\).
Set the “remaining time”: \(s_1 = c\).
Do for \(i = 1, 2, \ldots\) until \(s_{i+1} \leq 0\) (one iteration of this loop is sketched in code after the loop):
With each feature vector and its label of positive weight, associate \(W_i(x, y) = e^{\frac {-(r_i(x, y) + s_i)^2}{c}}\).
Call the weak learner with the distribution defined by normalizing \(W_i(x, y)\) to receive a hypothesis \(h_i(x)\).
Solve the differential equation
\[\frac {dt}{d\alpha} = \gamma = \frac {\sum _{(x,y)} \exp \left(-\frac{1}{c} (r_i(x, y) + \alpha h_i(x) y + s_i - t)^2\right) h_i(x) y} {\sum _{(x,y)} \exp \left(-\frac{1}{c} (r_i(x, y) + \alpha h_i(x) y + s_i - t)^2\right)}\]with the boundary conditions \(t = 0\) and \(\alpha = 0\) to find \(t_i = t^{*} > 0\) and \(\alpha_i = \alpha^{*}\) such that either \(\gamma \leq \nu\) or \(t^{*} = s_i\), where \(\nu\) is a given small constant needed to avoid degenerate cases.
Update the prediction values: \(r_{i+1}(x, y) = r_i(x, y) + \alpha_i h_i(x) y\).
Update “remaining time”: \(s_{i+1} = s_i - t_i\).
End do
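One pass of the loop above can be sketched as follows. This is an illustration of the update rules, not the library implementation: the weak learner is modeled as a hypothetical callback returning \(h_i(x_j) \in \{-1, 1\}\) for every sample, and the differential equation is integrated with a naive forward Euler scheme (the library uses a Newton-Raphson based solver, as the parameter table below indicates).

```cpp
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical weak learner: trained on a weight distribution,
// returns h(x_j) in {-1, +1} for every sample.
struct WeakLearner {
    std::function<std::vector<double>(const std::vector<double>&)> fit;
};

// One schematic BrownBoost iteration. `r` holds the margins r_i(x_j, y_j),
// `s` the remaining time s_i, `y` the labels in {-1, +1}; `c` and `nu`
// are the constants defined above.
void brownBoostIteration(std::vector<double>& r, double& s,
                         const std::vector<double>& y,
                         double c, double nu,
                         const WeakLearner& learner) {
    const std::size_t n = r.size();

    // Step 1: W_i(x, y) = exp(-(r_i(x, y) + s_i)^2 / c), normalized
    // to form a distribution for the weak learner.
    std::vector<double> w(n);
    double total = 0.0;
    for (std::size_t j = 0; j < n; ++j) {
        w[j] = std::exp(-(r[j] + s) * (r[j] + s) / c);
        total += w[j];
    }
    for (double& wj : w) wj /= total;

    // Step 2: obtain a hypothesis from the weak learner.
    const std::vector<double> h = learner.fit(w);

    // Step 3: integrate dt/dalpha = gamma with forward Euler from
    // (alpha, t) = (0, 0) until gamma <= nu or t >= s_i.
    auto gamma = [&](double alpha, double t) {
        double num = 0.0, den = 0.0;
        for (std::size_t j = 0; j < n; ++j) {
            const double m = r[j] + alpha * h[j] * y[j] + s - t;
            const double e = std::exp(-m * m / c);
            num += e * h[j] * y[j];
            den += e;
        }
        return num / den;
    };
    double alpha = 0.0, t = 0.0;
    const double dAlpha = 1e-3;           // integration step (illustrative)
    while (t < s) {
        const double g = gamma(alpha, t);
        if (g <= nu) break;               // degenerate case: stop early
        alpha += dAlpha;
        t += g * dAlpha;                  // dt = gamma * dalpha
    }

    // Steps 4-5: update the prediction values and the remaining time.
    for (std::size_t j = 0; j < n; ++j) r[j] += alpha * h[j] * y[j];
    s -= t;
}
```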
The result of the model training is the array of \(M\) weak learners \(h_i\) and the corresponding weights \(\alpha_i\).
Prediction Stage¶
Given the BrownBoost classifier and \(r\) feature vectors \(x_1, \ldots, x_r\), the problem is to calculate the final classification confidence, a number from the interval \([-1, 1]\), using the rule:

\[p(x) = \mathrm{erf} \left(\frac {\sum _{i=1}^{M} \alpha_i h_i(x)}{\sqrt{c}}\right)\]
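For example, given the stored weights \(\alpha_i\) and the weak-learner responses \(h_i(x)\) for one feature vector, the confidence can be computed as below; the function is a sketch for illustration, not the library interface.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// p(x) = erf( (sum_i alpha_i * h_i(x)) / sqrt(c) ), a value in [-1, 1].
// `alphas` holds the hypothesis weights alpha_i and `responses` the
// weak-learner outputs h_i(x) in {-1, 1} for one feature vector x.
double brownBoostConfidence(const std::vector<double>& alphas,
                            const std::vector<double>& responses,
                            double c) {
    double score = 0.0;
    for (std::size_t i = 0; i < alphas.size(); ++i)
        score += alphas[i] * responses[i];
    return std::erf(score / std::sqrt(c));
}
```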
Batch Processing¶
The BrownBoost classifier follows the general workflow described in Classification Usage Model.
Training¶
For a description of the input and output, refer to Classification Usage Model.
At the training stage, a BrownBoost classifier has the following parameters:
Parameter | Default Value | Description
---|---|---
`algorithmFPType` | `float` | The floating-point type that the algorithm uses for intermediate computations. Can be `float` or `double`.
`method` | `defaultDense` | The computation method used by the BrownBoost classifier. The only training method supported so far is Y. Freund's method.
`nClasses` | \(2\) | The number of classes.
`weakLearnerTraining` | DEPRECATED: Pointer to an object of the weak learner training class. USE INSTEAD: Pointer to an object of the classification stump training class. | DEPRECATED: Pointer to the training algorithm of the weak learner. By default, a stump weak learner is used. USE INSTEAD: Pointer to the classifier training algorithm. By default, a classification stump with the Gini split criterion is used.
`weakLearnerPrediction` | DEPRECATED: Pointer to an object of the weak learner prediction class. USE INSTEAD: Pointer to an object of the classification stump prediction class. | DEPRECATED: Pointer to the prediction algorithm of the weak learner. By default, a stump weak learner is used. USE INSTEAD: Pointer to the classifier prediction algorithm. By default, a classification stump with the Gini split criterion is used.
`accuracyThreshold` | \(0.01\) | BrownBoost training accuracy \(\varepsilon\).
`maxIterations` | \(100\) | The maximal number of iterations for the BrownBoost algorithm.
`newtonRaphsonAccuracyThreshold` | \(1.0\mathrm{e}{-3}\) | Accuracy threshold of the Newton-Raphson method used underneath the BrownBoost algorithm.
`newtonRaphsonMaxIterations` | \(100\) | The maximal number of Newton-Raphson iterations in the algorithm.
`degenerateCasesThreshold` | \(1.0\mathrm{e}{-2}\) | The threshold used to avoid degenerate cases.
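As a usage sketch, the parameters above map onto the batch training interface roughly as follows. The namespace and field names (`brownboost::training::Batch`, `accuracyThreshold`, and so on) follow this table but should be treated as assumptions to verify against the headers of your library version.

```cpp
#include "daal.h"

using namespace daal;
using namespace daal::algorithms;

// Configure and run BrownBoost training in batch mode. The parameter
// fields mirror the table above; verify the exact names against the
// headers of the DAAL/oneDAL version in use.
void trainBrownBoost(const data_management::NumericTablePtr& trainData,
                     const data_management::NumericTablePtr& trainLabels) {
    brownboost::training::Batch<> algorithm;

    algorithm.parameter.accuracyThreshold              = 0.01;    // epsilon
    algorithm.parameter.maxIterations                  = 100;
    algorithm.parameter.newtonRaphsonAccuracyThreshold = 1.0e-3;
    algorithm.parameter.newtonRaphsonMaxIterations     = 100;
    algorithm.parameter.degenerateCasesThreshold       = 1.0e-2;

    algorithm.input.set(classifier::training::data,   trainData);
    algorithm.input.set(classifier::training::labels, trainLabels);

    algorithm.compute();
    brownboost::training::ResultPtr result = algorithm.getResult();
    // result->get(classifier::training::model) holds the trained model.
}
```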
Prediction¶
For a description of the input and output, refer to Classification Usage Model.
At the prediction stage, a BrownBoost classifier has the following parameters:
Parameter | Default Value | Description
---|---|---
`algorithmFPType` | `float` | The floating-point type that the algorithm uses for intermediate computations. Can be `float` or `double`.
`method` | `defaultDense` | Performance-oriented computation method, the only method supported by the BrownBoost classifier.
`nClasses` | \(2\) | The number of classes.
`weakLearnerPrediction` | DEPRECATED: Pointer to an object of the weak learner prediction class. USE INSTEAD: Pointer to an object of the classification stump prediction class. | DEPRECATED: Pointer to the prediction algorithm of the weak learner. By default, a stump weak learner is used. USE INSTEAD: Pointer to the classifier prediction algorithm. By default, a classification stump with the Gini split criterion is used.
`accuracyThreshold` | \(0.01\) | BrownBoost training accuracy \(\varepsilon\).
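A corresponding prediction sketch, under the same assumptions about the C++ interface as in the training sketch above:

```cpp
#include "daal.h"

using namespace daal;
using namespace daal::algorithms;

// Score new observations with a trained BrownBoost model; the
// identifiers below are assumptions to check against your headers.
void predictBrownBoost(const data_management::NumericTablePtr& testData,
                       const brownboost::ModelPtr& model) {
    brownboost::prediction::Batch<> algorithm;

    algorithm.input.set(classifier::prediction::data,  testData);
    algorithm.input.set(classifier::prediction::model, model);

    algorithm.compute();
    classifier::prediction::ResultPtr result = algorithm.getResult();
    // result->get(classifier::prediction::prediction) holds the
    // predicted labels as a NumericTable.
}
```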
Examples¶
Batch Processing: