Covariance

The Covariance algorithm computes the following quantitative characteristics of a dataset:

  • means

  • covariance

  • correlation

| Operation | Computational methods | Programming Interface |
|---|---|---|
| dense | dense | compute(…), compute_input, compute_result |

Mathematical formulation

Computing

Given a set \(X\) of \(n\) \(p\)-dimensional feature vectors \(x_1 = (x_{11}, \ldots, x_{1p}), \ldots, x_n = (x_{n1}, \ldots, x_{np})\), the problem is to compute the sample means, the covariance matrix, or the correlation matrix:

| Statistic | Definition |
|---|---|
| Means | \(M = (m(1), \ldots, m(p))\), where \(m(j)=\frac{1}{n}\sum_{i=1}^{n}x_{ij}\) |
| Covariance matrix | \(Cov = (v_{ij})\), where \(v_{ij}=\frac{1}{n-1}\sum_{k=1}^{n}(x_{ki}-m(i))(x_{kj}-m(j))\), \(i=\overline{1,p}\), \(j=\overline{1,p}\) |
| Correlation matrix | \(Cor = (c_{ij})\), where \(c_{ij}=\frac{v_{ij}}{\sqrt{v_{ii}\cdot v_{jj}}}\), \(i=\overline{1,p}\), \(j=\overline{1,p}\) |
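The three definitions can be checked by hand on a tiny dataset. The following is a minimal sketch in plain Python that follows the formulas literally; it is not the library's implementation, and the function names (`column_means`, `covariance_matrix`, `correlation_matrix`) and the sample data are hypothetical, chosen only for illustration.

```python
import math

def column_means(X):
    """m(j) = (1/n) * sum_i x_ij for every feature j."""
    n, p = len(X), len(X[0])
    return [sum(row[j] for row in X) / n for j in range(p)]

def covariance_matrix(X):
    """v_ij = (1/(n-1)) * sum_k (x_ki - m(i)) * (x_kj - m(j))."""
    n, p = len(X), len(X[0])
    m = column_means(X)
    return [
        [sum((row[i] - m[i]) * (row[j] - m[j]) for row in X) / (n - 1)
         for j in range(p)]
        for i in range(p)
    ]

def correlation_matrix(X):
    """c_ij = v_ij / sqrt(v_ii * v_jj)."""
    v = covariance_matrix(X)
    p = len(v)
    return [
        [v[i][j] / math.sqrt(v[i][i] * v[j][j]) for j in range(p)]
        for i in range(p)
    ]

# Hypothetical data: n = 3 observations, p = 2 features.
X = [[1.0, 2.0], [2.0, 4.0], [3.0, 7.0]]
means = column_means(X)
cov = covariance_matrix(X)
cor = correlation_matrix(X)
```

Note that the correlation formula divides by \(\sqrt{v_{ii}\cdot v_{jj}}\), so this sketch assumes every feature has non-zero variance and that \(n \geq 2\).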

Computation method: dense

The dense method computes the means, the variance-covariance matrix, or the correlation matrix.

Programming Interface

Refer to API Reference: Covariance.

Distributed mode

The algorithm supports distributed execution in SPMD mode (only on GPU).
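To illustrate why covariance lends itself to distributed execution, the sketch below shows one common way to combine per-process partial results: each process reduces its own block of rows to a row count, column sums, and a cross-product matrix, and these are summed before the final statistics are derived. This is an assumption made for illustration only; the source does not describe how the library implements its distributed mode, and the helper names (`partial_stats`, `merge`, `finalize`) are hypothetical. The processes are simulated here with a plain list of blocks.

```python
def partial_stats(block):
    """Reduce one block of rows to (count, column sums, cross-product matrix)."""
    n, p = len(block), len(block[0])
    sums = [sum(row[j] for row in block) for j in range(p)]
    cross = [[sum(row[i] * row[j] for row in block) for j in range(p)]
             for i in range(p)]
    return n, sums, cross

def merge(a, b):
    """Combine two partial results by element-wise addition."""
    n_a, s_a, c_a = a
    n_b, s_b, c_b = b
    p = len(s_a)
    sums = [s_a[j] + s_b[j] for j in range(p)]
    cross = [[c_a[i][j] + c_b[i][j] for j in range(p)] for i in range(p)]
    return n_a + n_b, sums, cross

def finalize(stats):
    """Derive means and the covariance matrix from the merged sums.

    Uses v_ij = (sum_k x_ki * x_kj - s_i * s_j / n) / (n - 1), which is
    algebraically equal to the centered definition.
    """
    n, sums, cross = stats
    p = len(sums)
    means = [s / n for s in sums]
    cov = [[(cross[i][j] - sums[i] * sums[j] / n) / (n - 1) for j in range(p)]
           for i in range(p)]
    return means, cov

# Hypothetical split of three observations across two simulated processes.
blocks = [[[1.0, 2.0], [2.0, 4.0]], [[3.0, 7.0]]]
total = partial_stats(blocks[0])
for block in blocks[1:]:
    total = merge(total, partial_stats(block))
dist_means, dist_cov = finalize(total)
```

The raw cross-product form is simple to merge but can lose precision when the means are large relative to the spread of the data; a production implementation may prefer a centered update for that reason.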