.. ****************************************************************************** .. * Copyright 2021 Intel Corporation .. * .. * Licensed under the Apache License, Version 2.0 (the "License"); .. * you may not use this file except in compliance with the License. .. * You may obtain a copy of the License at .. * .. * http://www.apache.org/licenses/LICENSE-2.0 .. * .. * Unless required by applicable law or agreed to in writing, software .. * distributed under the License is distributed on an "AS IS" BASIS, .. * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. .. * See the License for the specific language governing permissions and .. * limitations under the License. .. *******************************************************************************/ .. default-domain:: cpp .. _alg_basic_statistics: ================ Basic Statistics ================ .. include:: ../../../includes/statistics/basic-statistics-introduction.rst ------------------------ Mathematical formulation ------------------------ .. _basic_statistics_c_math: Computing --------- Given a set :math:`X` of :math:`n` :math:`p`-dimensional feature vectors :math:`x_1 = (x_{11}, \ldots, x_{1p}), \ldots, x_n = (x_{n1}, \ldots, x_{np})`, the problem is to compute the following sample characteristics for each feature in the data set: .. list-table:: :widths: 20 60 :header-rows: 1 :align: left * - Statistic - Definition * - Minimum - :math:`min(j) = \smash{\displaystyle \min_i } \{x_{ij}\}` * - Maximum - :math:`max(j) = \smash{\displaystyle \max_i } \{x_{ij}\}` * - Sum - :math:`s(j) = \sum_i x_{ij}` * - Sum of squares - :math:`s_2(j) = \sum_i x_{ij}^2` * - Means - :math:`m(j) = \frac {s(j)} {n}` * - Second order raw moment - :math:`a_2(j) = \frac {s_2(j)} {n}` * - Sum of squared difference from the means - :math:`\text{SDM}(j) = \sum_i (x_{ij} - m(j))^2` * - Variance - :math:`k_2(j) = \frac {\text{SDM}(j) } {n - 1}` * - Standard deviation - :math:`\text{stdev}(j) = \sqrt {k_2(j)}` * - Variation coefficient - :math:`V(j) = \frac {\text{stdev}(j)} {m(j)}` .. _basic_statistics_c_math_dense: Computation method: *dense* --------------------------- The method computes the basic statistics for each feature in the data set. --------------------- Programming Interface --------------------- Refer to :ref:`API Reference: Basic statistics `. ---------------- Distributed mode ---------------- The algorithm supports distributed execution in SMPD mode (only on GPU).