Covariance¶
The Covariance algorithm computes the following quantitative characteristics of a dataset:
means
covariance
correlation
Operation | Computational methods | Programming Interface |
Mathematical formulation¶
Computing¶
Given a set \(X\) of \(n\) \(p\)-dimensional feature vectors \(x_1 = (x_{11}, \ldots, x_{1p}), \ldots, x_n = (x_{n1}, \ldots, x_{np})\), the problem is to compute the sample means, the covariance matrix, or the correlation matrix:
Statistic | Definition |
---|---|
Means | \(M = (m(1), \ldots, m(p))\), where \(m(j)=\frac{1}{n}\sum_{i=1}^{n}x_{ij}\), \(j=\overline{1,p}\) |
Covariance matrix | \(Cov = (v_{ij})\), where \(v_{ij}=\frac{1}{n-1}\sum_{k=1}^{n}(x_{ki}-m(i))(x_{kj}-m(j))\), \(i=\overline{1,p}\), \(j=\overline{1,p}\) |
Correlation matrix | \(Cor = (c_{ij})\), where \(c_{ij}=\frac{v_{ij}}{\sqrt{v_{ii}\cdot v_{jj}}}\), \(i=\overline{1,p}\), \(j=\overline{1,p}\) |
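The definitions above can be checked on a small dataset with a direct, unoptimized transcription into Python. This is only an illustrative sketch using the standard library; the function names `means`, `covariance`, and `correlation` are hypothetical and are not part of the library's programming interface.

```python
import math

def means(X):
    # m(j) = (1/n) * sum_i x_ij
    n, p = len(X), len(X[0])
    return [sum(row[j] for row in X) / n for j in range(p)]

def covariance(X):
    # v_ij = (1/(n-1)) * sum_k (x_ki - m(i)) * (x_kj - m(j)); requires n >= 2
    n, p = len(X), len(X[0])
    m = means(X)
    return [
        [sum((row[i] - m[i]) * (row[j] - m[j]) for row in X) / (n - 1)
         for j in range(p)]
        for i in range(p)
    ]

def correlation(X):
    # c_ij = v_ij / sqrt(v_ii * v_jj); undefined if a feature has zero variance
    v = covariance(X)
    p = len(v)
    return [
        [v[i][j] / math.sqrt(v[i][i] * v[j][j]) for j in range(p)]
        for i in range(p)
    ]

# n = 3 feature vectors with p = 2 features each
X = [[1.0, 2.0], [2.0, 4.0], [3.0, 9.0]]
print(means(X))        # [2.0, 5.0]
print(covariance(X))   # [[1.0, 3.5], [3.5, 13.0]]
print(correlation(X))  # off-diagonal entries equal 3.5 / sqrt(13)
```

Note that the covariance uses the \(\frac{1}{n-1}\) normalization from the table, so at least two feature vectors are needed.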
Computation method: dense¶
The method computes the means, the variance-covariance matrix, or the correlation matrix.
Programming Interface¶
Refer to API Reference: Covariance.
Distributed mode¶
The algorithm supports distributed execution in SPMD (single program, multiple data) mode (only on GPU).
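To illustrate why these statistics suit distributed execution, the sketch below shows one common way to combine per-block partial results: each block contributes its row count, per-feature sums, and sums of cross-products, and the merged totals yield the same means and covariance as a single pass over all data. This is an assumption made for illustration, not a description of how the library implements SPMD mode; the helper names `partial_result`, `merge`, and `finalize` are hypothetical.

```python
def partial_result(block):
    # Per-block statistics: row count, per-feature sums, sums of cross-products
    p = len(block[0])
    s = [sum(row[j] for row in block) for j in range(p)]
    c = [[sum(row[i] * row[j] for row in block) for j in range(p)]
         for i in range(p)]
    return len(block), s, c

def merge(a, b):
    # Partial results combine by element-wise addition
    n_a, s_a, c_a = a
    n_b, s_b, c_b = b
    s = [x + y for x, y in zip(s_a, s_b)]
    c = [[x + y for x, y in zip(row_a, row_b)]
         for row_a, row_b in zip(c_a, c_b)]
    return n_a + n_b, s, c

def finalize(partial):
    # v_ij = (sum_k x_ki * x_kj - s_i * s_j / n) / (n - 1),
    # algebraically equal to the centered definition of the covariance
    n, s, c = partial
    p = len(s)
    m = [s_j / n for s_j in s]
    cov = [[(c[i][j] - s[i] * s[j] / n) / (n - 1) for j in range(p)]
           for i in range(p)]
    return m, cov

# The same three feature vectors split across two hypothetical ranks
blocks = [[[1.0, 2.0], [2.0, 4.0]], [[3.0, 9.0]]]
merged = merge(partial_result(blocks[0]), partial_result(blocks[1]))
m, cov = finalize(merged)
print(m)    # [2.0, 5.0]
print(cov)  # [[1.0, 3.5], [3.5, 13.0]]
```

The raw-sum form is simple to merge but can lose precision when the means are large relative to the spread of the data, so production implementations may use a different update scheme.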